重建控制文件丢失undo异常恢复—ORA-01173模拟与恢复

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:重建控制文件丢失undo异常恢复—ORA-01173模拟与恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

数据库异常关闭,使用resetlogs方式重建控制文件,不包含undo表空间相关数据库,然后尝试resetlogs打开数据库,非常容易出现ORA-01173: data dictionary indicates missing data file from system tablespace之类的错误,本文通过模拟该错误,并简单提供解决方法
模拟ORA-01173错误

SQL> select * from v$version;
BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - Production
PL/SQL Release 11.2.0.3.0 - Production
CORE    11.2.0.3.0      Production
TNS for Linux: Version 11.2.0.3.0 - Production
NLSRTL Version 11.2.0.3.0 - Production
SQL> shutdown abort;
ORACLE instance shut down.
--除掉undo表空间相关数据库然后使用resetlogs模式重建控制文件
SQL> startup nomount;
ORACLE instance started.
Total System Global Area  175775744 bytes
Fixed Size                  1343668 bytes
Variable Size             117444428 bytes
Database Buffers           50331648 bytes
Redo Buffers                6656000 bytes
SQL> @/tmp/ctl.sql
Control file created.
SQL> alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1: '/u01/oracle/oradata/ora11g/system01.dbf'
SQL> recover database until cancel;
ORA-00283: recovery session canceled due to errors
ORA-01610: recovery using the BACKUP CONTROLFILE option must be done
SQL> recover database until cancel using backup controlfile;
ORA-00279: change 12696930512547 generated at 08/26/2013 14:43:13 needed for
thread 1
ORA-00289: suggestion : /u01/oracle/oradata/ora11g/archivelog/1_8_824475918.dbf
ORA-00280: change 12696930512547 for thread 1 is in sequence #8
Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
cancel;
ORA-00308: cannot open archived log 'cancel;'
ORA-27037: unable to obtain file status
Linux Error: 2: No such file or directory
Additional information: 3
Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
cancel
ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1: '/u01/oracle/oradata/ora11g/system01.dbf'
ORA-01112: media recovery not started
SQL> alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1: '/u01/oracle/oradata/ora11g/system01.dbf'
SQL> recover database until cancel using backup controlfile;
ORA-00279: change 12696930512547 generated at 08/26/2013 14:43:13 needed for
thread 1
ORA-00289: suggestion : /u01/oracle/oradata/ora11g/archivelog/1_8_824475918.dbf
ORA-00280: change 12696930512547 for thread 1 is in sequence #8
Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
/u01/oracle/oradata/ora11g/redo02.log
Log applied.
Media recovery complete.
SQL> alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-01173: data dictionary indicates missing data file from system tablespace
Process ID: 15053
Session ID: 125 Serial number: 3

alert日志

Mon Aug 26 15:22:20 2013
alter database open resetlogs
RESETLOGS after complete recovery through change 12696930514503
Archived Log entry 1 added for thread 1 sequence 7 ID 0xfb91eb36 dest 1:
Archived Log entry 2 added for thread 1 sequence 8 ID 0xfb91eb36 dest 1:
Archived Log entry 3 added for thread 1 sequence 6 ID 0xfb91eb36 dest 1:
Clearing online redo logfile 1 /u01/oracle/oradata/ora11g/redo01.log
Clearing online log 1 of thread 1 sequence number 7
Clearing online redo logfile 1 complete
Clearing online redo logfile 2 /u01/oracle/oradata/ora11g/redo02.log
Clearing online log 2 of thread 1 sequence number 8
Clearing online redo logfile 2 complete
Clearing online redo logfile 3 /u01/oracle/oradata/ora11g/redo03.log
Clearing online log 3 of thread 1 sequence number 6
Clearing online redo logfile 3 complete
Resetting resetlogs activation ID 4220644150 (0xfb91eb36)
Online log /u01/oracle/oradata/ora11g/redo01.log: Thread 1 Group 1 was previously cleared
Online log /u01/oracle/oradata/ora11g/redo02.log: Thread 1 Group 2 was previously cleared
Online log /u01/oracle/oradata/ora11g/redo03.log: Thread 1 Group 3 was previously cleared
Mon Aug 26 15:22:26 2013
Setting recovery target incarnation to 2
Using SCN growth rate of 16384 per second
Mon Aug 26 15:22:26 2013
Assigning activation ID 4220652407 (0xfb920b77)
LGWR: STARTING ARCH PROCESSES
Mon Aug 26 15:22:26 2013
ARC0 started with pid=20, OS id=15131
ARC0: Archival started
LGWR: STARTING ARCH PROCESSES COMPLETE
ARC0: STARTING ARCH PROCESSES
Mon Aug 26 15:22:27 2013
ARC1 started with pid=21, OS id=15135
Mon Aug 26 15:22:27 2013
ARC2 started with pid=22, OS id=15139
Thread 1 opened at log sequence 1
  Current log# 1 seq# 1 mem# 0: /u01/oracle/oradata/ora11g/redo01.log
Successful open of redo thread 1
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Mon Aug 26 15:22:27 2013
SMON: enabling cache recovery
Mon Aug 26 15:22:27 2013
ARC3 started with pid=23, OS id=15143
ARC1: Archival started
ARC2: Archival started
ARC2: Becoming the 'no FAL' ARCH
ARC2: Becoming the 'no SRL' ARCH
ARC1: Becoming the heartbeat ARCH
Errors in file /u01/oracle/diag/rdbms/ora11g/ora11g/trace/ora11g_ora_15053.trc:
ORA-01173: data dictionary indicates missing data file from system tablespace
Errors in file /u01/oracle/diag/rdbms/ora11g/ora11g/trace/ora11g_ora_15053.trc:
ORA-01173: data dictionary indicates missing data file from system tablespace
Error 1173 happened during db open, shutting down database
USER (ospid: 15053): terminating the instance due to error 1173
Instance terminated by USER, pid = 15053
ORA-1092 signalled during: alter database open resetlogs...
opiodr aborting process unknown ospid (15053) as a result of ORA-1092
Mon Aug 26 15:22:29 2013
ORA-1092 : opitsk aborting process

解决方案
解决方案设置相关异常回滚段,然后屏蔽掉,数据库即可正常起来

--参数文件修改(使用event或者工具,系统命令等)
undo_management='manual'
_corrupted_rollback_segments
--重启数据库使得_corrupted_rollback_segments生效
shutdown immediate;
startup
--尝试open数据库
recover database;
alter database open;
--新建undo
create undo tablespace undo_new datafile '' size 100m autoextend on next 10m maxsize 30G;
--删除老undo
drop tablespace old_undo
--修改参数
shutdonw immediate
undo_management='auto'
undo_tablespace='unod_new'
--启动数据库
startup
--逻辑导出数据,导入新库

姊妹篇:重建控制文件丢失undo异常恢复—ORA-600[25025]模拟与恢复

重建控制文件丢失undo异常恢复—ORA-600 25025模拟与恢复

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:重建控制文件丢失undo异常恢复—ORA-600 25025模拟与恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

在现实的情况中,有些人因为某种原因重建控制文件(丢失undo[有意或者无意]),然后又resetlogs库尝试恢复,这样的操作可能导致各种比较麻烦的恢复,这里模拟ORA-600[25025]异常恢复
模拟ORA-600[25025]错误

SQL> select * from v$version;
BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - Production
PL/SQL Release 11.2.0.3.0 - Production
CORE    11.2.0.3.0      Production
TNS for Linux: Version 11.2.0.3.0 - Production
NLSRTL Version 11.2.0.3.0 - Production
SQL> alter database backup controlfile to trace as '/tmp/ctl';
Database altered.
SQL> create table chf.t_xifenfei_www as select * from dba_objects where 1=0;
Table created.
SQL> insert into chf.t_xifenfei_www select * from dba_objects;
74749 rows created.
--另外一个会话abort
SQL> shutdown abort;
ORACLE instance shut down.
SQL> STARTUP NOMOUNT
ORACLE instance started.
Total System Global Area  175775744 bytes
Fixed Size                  1343668 bytes
Variable Size             117444428 bytes
Database Buffers           50331648 bytes
Redo Buffers                6656000 bytes
SQL> !vi /tmp/ctl.sql
CREATE CONTROLFILE REUSE DATABASE "ORA11G" NORESETLOGS  ARCHIVELOG
    MAXLOGFILES 16
    MAXLOGMEMBERS 3
    MAXDATAFILES 100
    MAXINSTANCES 8
    MAXLOGHISTORY 292
LOGFILE
  GROUP 1 '/u01/oracle/oradata/ora11g/redo01.log'  SIZE 50M BLOCKSIZE 512,
  GROUP 2 '/u01/oracle/oradata/ora11g/redo02.log'  SIZE 50M BLOCKSIZE 512,
  GROUP 3 '/u01/oracle/oradata/ora11g/redo03.log'  SIZE 50M BLOCKSIZE 512
DATAFILE
  '/u01/oracle/oradata/ora11g/system01.dbf',
  '/u01/oracle/oradata/ora11g/sysaux01.dbf',
 -- '/u01/oracle/oradata/ora11g/undo02.dbf',
  '/u01/oracle/oradata/ora11g/users01.dbf',
  '/u01/oracle/oradata/ora11g/dbfs01.dbf',
  '/u01/oracle/oradata/ora11g/tts_xifenfei02.dbf',
  '/u01/oracle/oradata/ora11g/tts_xifenfei01.dbf',
  '/u01/oracle/oradata/ora11g/system02.dbf',
  '/u01/oracle/oradata/ora11g/czum01.dbf',
  '/u01/oracle/oradata/ora11g/undotbs02.dbf',
  '/u01/oracle/oradata/sp2008',
  '/u01/oracle/oradata/sp_2009',
  '/u01/oracle/oradata/sp_2010',
  '/u01/oracle/oradata/sp_2011',
  '/u01/oracle/oradata/sp_2012',
  '/u01/oracle/oradata/sp_2013',
  '/u01/oracle/oradata/sp_2014',
  '/u01/oracle/oradata/sp_2015',
  '/u01/oracle/oradata/sp_2016',
  '/u01/oracle/oradata/sp_2017',
  '/u01/oracle/oradata/sp_2018',
  '/u01/oracle/oradata/sp_2019',
  '/u01/oracle/oradata/sp_2020',
  '/u01/oracle/oradata/sp_2021',
  '/u01/oracle/oradata/sp_2022',
  '/u01/oracle/oradata/sp_2023',
  '/u01/oracle/oradata/sp_2024',
  '/u01/oracle/oradata/sp_2025',
  '/u01/oracle/oradata/sp_20max'
CHARACTER SET ZHS16GBK
;
"/tmp/ctl.sql" [New] 43L, 1519C written
SQL> @/tmp/ctl.sql
Control file created.
SQL> recover database using backup controlfile until cancel;
ORA-00279: change 12696930343864 generated at 05/18/2013 01:17:54 needed for
thread 1
ORA-00289: suggestion :
/u01/oracle/oradata/ora11g/archivelog/1_38_805394597.dbf
ORA-00280: change 12696930343864 for thread 1 is in sequence #38
Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
/u01/oracle/oradata/ora11g/redo01.log
ORA-00310: archived log contains sequence 37; sequence 38 required
ORA-00334: archived log: '/u01/oracle/oradata/ora11g/redo01.log'
ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1: '/u01/oracle/oradata/ora11g/system01.dbf'
SQL>  recover database using backup controlfile until cancel;
ORA-00279: change 12696930343864 generated at 05/18/2013 01:17:54 needed for
thread 1
ORA-00289: suggestion :
/u01/oracle/oradata/ora11g/archivelog/1_38_805394597.dbf
ORA-00280: change 12696930343864 for thread 1 is in sequence #38
Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
/u01/oracle/oradata/ora11g/redo03.log
ORA-00310: archived log contains sequence 39; sequence 38 required
ORA-00334: archived log: '/u01/oracle/oradata/ora11g/redo03.log'
ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1: '/u01/oracle/oradata/ora11g/system01.dbf'
SQL> recover database using backup controlfile until cancel;
ORA-00279: change 12696930343864 generated at 05/18/2013 01:17:54 needed for
thread 1
ORA-00289: suggestion :
/u01/oracle/oradata/ora11g/archivelog/1_38_805394597.dbf
ORA-00280: change 12696930343864 for thread 1 is in sequence #38
Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
/u01/oracle/oradata/ora11g/redo02.log
ORA-00279: change 12696930370956 generated at 08/26/2013 13:00:25 needed for
thread 1
ORA-00289: suggestion :
/u01/oracle/oradata/ora11g/archivelog/1_39_805394597.dbf
ORA-00280: change 12696930370956 for thread 1 is in sequence #39
ORA-00278: log file '/u01/oracle/oradata/ora11g/redo02.log' no longer needed
for this recovery
Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
/u01/oracle/oradata/ora11g/redo03.log
Log applied.
Media recovery complete.
SQL> alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [25025], [3], [], [], [], [], [],
[], [], [], [], []
Process ID: 12603
Session ID: 125 Serial number: 3

alert日志信息

ORA-279 signalled during: ALTER DATABASE RECOVER  database using backup controlfile until cancel  ...
ALTER DATABASE RECOVER    LOGFILE '/u01/oracle/oradata/ora11g/redo02.log'
Media Recovery Log /u01/oracle/oradata/ora11g/redo02.log
Mon Aug 26 13:05:00 2013
ORA-279 signalled during: ALTER DATABASE RECOVER    LOGFILE '/u01/oracle/oradata/ora11g/redo02.log'  ...
Mon Aug 26 13:05:10 2013
ALTER DATABASE RECOVER    LOGFILE '/u01/oracle/oradata/ora11g/redo03.log'
Media Recovery Log /u01/oracle/oradata/ora11g/redo03.log
Mon Aug 26 13:05:10 2013
Incomplete recovery applied all redo ever generated.
Recovery completed through change 12696930370973 time 08/26/2013 13:00:56
Media Recovery Complete (ora11g)
Completed: ALTER DATABASE RECOVER    LOGFILE '/u01/oracle/oradata/ora11g/redo03.log'
alter database open resetlogs
RESETLOGS after complete recovery through change 12696930370973
Archived Log entry 1 added for thread 1 sequence 37 ID 0xfa6fa6cb dest 1:
Archived Log entry 2 added for thread 1 sequence 38 ID 0xfa6fa6cb dest 1:
Archived Log entry 3 added for thread 1 sequence 39 ID 0xfa6fa6cb dest 1:
Clearing online redo logfile 1 /u01/oracle/oradata/ora11g/redo01.log
Clearing online log 1 of thread 1 sequence number 37
Mon Aug 26 13:05:22 2013
Clearing online redo logfile 1 complete
Clearing online redo logfile 2 /u01/oracle/oradata/ora11g/redo02.log
Clearing online log 2 of thread 1 sequence number 38
Clearing online redo logfile 2 complete
Clearing online redo logfile 3 /u01/oracle/oradata/ora11g/redo03.log
Clearing online log 3 of thread 1 sequence number 39
Clearing online redo logfile 3 complete
Resetting resetlogs activation ID 4201621195 (0xfa6fa6cb)
Online log /u01/oracle/oradata/ora11g/redo01.log: Thread 1 Group 1 was previously cleared
Online log /u01/oracle/oradata/ora11g/redo02.log: Thread 1 Group 2 was previously cleared
Online log /u01/oracle/oradata/ora11g/redo03.log: Thread 1 Group 3 was previously cleared
Mon Aug 26 13:05:33 2013
Setting recovery target incarnation to 2
Mon Aug 26 13:05:33 2013
Using SCN growth rate of 16384 per second
Mon Aug 26 13:05:33 2013
Assigning activation ID 4220644150 (0xfb91eb36)
LGWR: STARTING ARCH PROCESSES
Mon Aug 26 13:05:33 2013
ARC0 started with pid=20, OS id=12679
ARC0: Archival started
LGWR: STARTING ARCH PROCESSES COMPLETE
ARC0: STARTING ARCH PROCESSES
Mon Aug 26 13:05:35 2013
ARC1 started with pid=21, OS id=12683
Mon Aug 26 13:05:35 2013
ARC2 started with pid=22, OS id=12687
Mon Aug 26 13:05:36 2013
ARC3 started with pid=24, OS id=12691
ARC1: Archival started
ARC2: Archival started
ARC1: Becoming the 'no FAL' ARCH
ARC1: Becoming the 'no SRL' ARCH
ARC2: Becoming the heartbeat ARCH
Thread 1 opened at log sequence 1
  Current log# 1 seq# 1 mem# 0: /u01/oracle/oradata/ora11g/redo01.log
Successful open of redo thread 1
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Mon Aug 26 13:05:36 2013
SMON: enabling cache recovery
ARC3: Archival started
ARC0: STARTING ARCH PROCESSES COMPLETE
Errors in file /u01/oracle/diag/rdbms/ora11g/ora11g/trace/ora11g_ora_12603.trc  (incident=146705):
ORA-00600: internal error code, arguments: [25025], [3], [], [], [], [], [], [], [], [], [], []
Incident details in: /u01/oracle/diag/rdbms/ora11g/ora11g/incident/incdir_146705/ora11g_ora_12603_i146705.trc
Mon Aug 26 13:05:45 2013
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Undo initialization errored: err:600 serial:0 start:57601994 end:57610584 diff:8590 (85 seconds)
Errors in file /u01/oracle/diag/rdbms/ora11g/ora11g/trace/ora11g_ora_12603.trc:
ORA-00600: internal error code, arguments: [25025], [3], [], [], [], [], [], [], [], [], [], []
Errors in file /u01/oracle/diag/rdbms/ora11g/ora11g/trace/ora11g_ora_12603.trc:
ORA-00600: internal error code, arguments: [25025], [3], [], [], [], [], [], [], [], [], [], []
Error 600 happened during db open, shutting down database
USER (ospid: 12603): terminating the instance due to error 600
Instance terminated by USER, pid = 12603
ORA-1092 signalled during: alter database open resetlogs...
opiodr aborting process unknown ospid (12603) as a result of ORA-1092
Mon Aug 26 13:05:47 2013
ORA-1092 : opitsk aborting process

trace文件

*** 2013-08-26 13:05:38.945
dbkedDefDump(): Starting incident default dumps (flags=0x2, level=3, mask=0x0)
----- Current SQL Statement for this session (sql_id=7j16t46cacjt9) -----
alter database open resetlogs
----- Call Stack Trace -----
calling              call     entry                argument values in hex
location             type     point                (? means dubious value)
-------------------- -------- -------------------- ----------------------------
skdstdst()+41        call     kgdsdst()            BF8A34E4 ? 2 ?
ksedst1()+77         call     skdstdst()           BF8A34E4 ? 0 ? 1 ? 8592C38 ?
                                                   8592AB6 ? 8592C38 ?
ksedst()+33          call     ksedst1()            0 ? 1 ?
dbkedDefDump()+2704  call     ksedst()             0 ? BF8A40E7 ? 47EF1FF4 ?
                                                   BF8A3D66 ? 0 ? BF8A3BF4 ?
ksedmp()+47          call     dbkedDefDump()       3 ? 2 ?
ksfdmp()+59          call     ksedmp()             3EB ? BF8A5458 ? F1DADED ?
                                                   116056E0 ? 3EB ? 116056E0 ?
dbgexPhaseII()+1725  call     00000000             116056E0 ? 3EB ?
dbgexProcessError()  call     dbgexPhaseII()       B6C515AC ? B6A0C890 ?
+2550                                              BF8A8D30 ?
dbgeExecuteForError  call     dbgexProcessError()  B6C515AC ? B6A0C890 ? 1 ? 0 ?
()+65                                              B6C515AC ? B6A0C890 ?
dbgePostErrorKGE()+  call     dbgeExecuteForError  B6C515AC ? B6A0C890 ? 0 ? 1 ?
1794                          ()                   0 ?
dbkePostKGE_kgsf()+  call     dbgePostErrorKGE()   116056E0 ? B6C330D4 ? 258 ?
50
kgeade()+324         call     00000000             116056E0 ? B6C330D4 ? 258 ?
kgeriv_int()+107     call     kgeade()             116056E0 ? 11605808 ?
                                                   B6C330D4 ? 258 ? 0 ? 61C1 ?
kgeriv()+25          call     kgeriv_int()         116056E0 ? B6C330D4 ? 61C1 ?
                                                   0 ? 1 ? BF8A96B8 ?
kgesiv()+98          call     kgeriv()             116056E0 ? B6C330D4 ? 61C1 ?
                                                   1 ? BF8A96B8 ?
ksesic1()+48         call     kgesiv()             116056E0 ? B6C330D4 ? 61C1 ?
                                                   1 ? BF8A96B8 ? 61C1 ? 1 ?krta2rfn()+78
                                                   call     ksesic1()            61C1 ? 0 ? 3 ? 0 ?
kttsinfo()+496       call     krta2rfn()           3 ? 0 ? 0 ? 0 ? 0 ? 0 ?
ktusmout_online_ut(  call     kttsinfo()           9 ? 0 ? 0 ? BF8A9948 ?
)+810
ktusmiut_init_ut()+  call     ktusmout_online_ut(  B000 ? 89E57F8 ? 9 ?
                                                   BF8A96B8 ?
348                           )
ktuini()+518         call     ktusmiut_init_ut()   0 ? 0 ? 0 ? 0 ? 1 ? 0 ?
adbdrv()+46345       call     ktuini()             0 ? BF8A9DE0 ? 1004BF42 ?
                                                   BF8AA258 ? B6A0BFFC ?
                                                   10EA1F20 ?
opiexe()+16835       call     adbdrv()             25C8F1F8 ? 0 ? 0 ? 2A34F528 ?
                                                   2A0400C8 ? BF8AF07C ?
opiosq0()+3437       call     opiexe()             4 ? 0 ? BF8AFE8C ?
kpooprx()+239        call     opiosq0()            3 ? E ? BF8B0184 ? A4 ? 0 ?
kpoal8()+607         call     kpooprx()            BF8B2D6C ? BF8B10AC ? 1D ?
                                                   1 ? 0 ? A4 ?
opiodr()+962         call     00000000             5E ? 1C ? BF8B2D68 ?
ttcpip()+1930        call     00000000             5E ? 1C ? BF8B2D68 ? 0 ?
opitsk()+1355        call     ttcpip()             11616580 ? 5E ? BF8B2D68 ?
                                                   0 ? BF8B29F8 ? BF8B2E90 ?
                                                   FDEBA80 ? 0 ? BF8B2EBC ?
opiino()+827         call     opitsk()             0 ? 0 ?
opiodr()+962         call     00000000             3C ? 4 ? BF8B3E2C ?
opidrv()+479         call     opiodr()             3C ? 4 ? BF8B3E2C ? 0 ?
sou2o()+80           call     opidrv()             3C ? 4 ? BF8B3E2C ?
opimai_real()+109    call     sou2o()              BF8B3E10 ? 3C ? 4 ?
                                                   BF8B3E2C ?
ssthrdmain()+212     call     00000000             2 ? BF8B3F58 ? 0 ? 47DA6F14 ?
                                                   BF8B3F14 ? 47D9A670 ?
main()+147           call     ssthrdmain()         2 ? BF8B3F58 ?
__libc_start_main()  call     00000000             2 ? BF8B4054 ? BF8B4060 ?
+220                                               47D9A828 ? 0 ? 1 ?
_start()+33          call     __libc_start_main()  856F3B4 ? 2 ? BF8B4054 ?
                                                   BCC1EA0 ? BCC1E90 ?
                                                   47D8C790 ?
--------------------- Binary Stack Dump ---------------------

MOS中有类似描述ORA-600 [25025] [25] While Opening the Clone Database in Resetlog Mode (Doc ID 603100.1),该解决方案是重建控制文件增加所有数据文件,在本次测试中,我就是人为除掉了undo,模拟undo丢失[其实数据库已经resetlogs过了,就算加入undo重建控制文件也不会成功(人工修改undo文件头除外)],又做了不正确的重建控制文件操作的故障,我提供解决方案如下

解决办法

--参数文件修改
undo_management='manual'
--尝试open数据库
recover database;
alter database open;
--新建undo
create undo tablespace undo_new datafile '' size 100m autoextend on next 10m maxsize 30G;
--屏蔽需要恢复回滚段
select tablespace_name,segment_name,status from dba_rollback_segs;
_corrupted_rollback_segments
--重启数据库使得_corrupted_rollback_segments生效
shutdown immediate;
startup
--删除老undo
drop tablespace old_undo
--修改参数
shutdonw immediate
undo_management='auto'
undo_tablespace='unod_new'
--启动数据库
startup
--导出数据,导入新库

姊妹篇:重建控制文件丢失undo异常恢复—ORA-01173模拟与恢复

ORA-600[2037]与ORA-07445[kcbs_dump_adv_state]错误

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:ORA-600[2037]与ORA-07445[kcbs_dump_adv_state]错误

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

一台win oracle 数据库,重启后发现数据库无法访问,检查发现是Bug 4899479,但是oracle未提供完整的解决方法,这里根据自己对于数据库启动过程的理解,通过屏蔽前滚和回滚,拉起来数据库
数据库版本平台信息

ORACLE:11.1.0.7
OS:WIN 2008 R2 X64

数据库启动报错

Tue Apr 16 12:36:31 2013
alter database open
Beginning crash recovery of 1 threads
 parallel recovery started with 7 processes
Started redo scan
Completed redo scan
 28878 redo blocks read, 7353 data blocks need recovery
Started redo application at
 Thread 1: logseq 7960, block 14132
Recovery of Online Redo Log: Thread 1 Group 1 Seq 7960 Reading mem 0
  Mem# 0: D:\APP\SDWLJG-DB101\ORADATA\WLJG\REDO01.LOG
Tue Apr 16 12:36:32 2013
RECOVERY OF THREAD 1 STUCK AT BLOCK 915068 OF FILE 9
Hex dump of (file 9, block 1698691) in trace file c:\app\sdwljg-db101\diag\rdbms\wljg\wljg\trace\wljg_p001_1500.trc
Corrupt block relative dba: 0x0259eb83 (file 9, block 1698691)
Bad header found during crash/instance recovery
Data in bad block:
 type: 0 format: 0 rdba: 0x0000a206
 last change scn: 0x2359.0259eb83 seq: 0xf7 flg: 0x0b
 spare1: 0x0 spare2: 0x0 spare3: 0x601
 consistency value in tail: 0x02c10243
 check value in block header: 0x0
 block checksum disabled
Reread of rdba: 0x0259eb83 (file 9, block 1698691) found valid data
Slave exiting with ORA-1172 exception
Errors in file c:\app\sdwljg-db101\diag\rdbms\wljg\wljg\trace\wljg_p001_1500.trc:
ORA-01172: recovery of thread 1 stuck at block 915068 of file 9
ORA-01151: use media recovery to recover block, restore backup if needed
Tue Apr 16 12:36:32 2013
Errors in file c:\app\sdwljg-db101\diag\rdbms\wljg\wljg\trace\wljg_p003_4088.trc  (incident=187558):
ORA-00600: internal error code, arguments: [2037], [12619645], [41474], [6], [1], [247], [12619645], [0], [], [], [], []
Incident details in: c:\app\sdwljg-db101\diag\rdbms\wljg\wljg\incident\incdir_187558\wljg_p003_4088_i187558.trc
ORA-07445: exception encountered: core dump [kcbs_dump_adv_state()+1352] [ACCESS_VIOLATION]
 [ADDR:0xFFFFFFFFFFFFFFFF] [PC:0x16BFD20] [UNABLE_TO_READ] []
ORA-00600: internal error code, arguments: [2037], [12619645], [41474], [6], [1], [247], [12619645], [0], [], [], [], []
Incident details in: c:\app\sdwljg-db101\diag\rdbms\wljg\wljg\incident\incdir_187559\wljg_p003_4088_i187559.trc
Errors in file c:\app\sdwljg-db101\diag\rdbms\wljg\wljg\trace\wljg_p006_1216.trc  (incident=187567):

这里提示file 9 block 915068异常,但是通过dbv检查发现file 9无任何坏块.

trace文件内容

Dump continued from file: c:\app\sdwljg-db101\diag\rdbms\wljg\wljg\trace\wljg_p003_4088.trc
ORA-00600: internal error code, arguments: [2037], [12620930], [41474], [2], [1], [247], [12619645], [0], [], [], [], []
** DBGRL Error: ARB Alert Log
** DBGRL Error: <msg time='2013-04-16T11:05:58.522+08:00' org_id='oracle' comp_id='rdbms'
 msg_id='dbgexProcessError:1097:3370026720' type='TRACE' level='16'
 host_id='SDWLSCJG-DB' host_addr='172.18.1.15'>
 <txt>Incident details in: c:\app\sdwljg-db101\diag\rdbms\wljg\wlj
========= Dump for incident 129879 (ORA 600 [2037]) ========
*** 2013-04-16 11:05:58.522
----- SQL Statement (None) -----
Current SQL information unavailable - no cursor.
----- Call Stack Trace -----
calling              call     entry                argument values in hex
location             type     point                (? means dubious value)
-------------------- -------- -------------------- ----------------------------
ksedst1()+111        CALL???  skdstdst()+0         000000000 000000000 01CFC9B80
                                                   000000200
ksedst()+63          CALL???  ksedst1()+0          000000005 021B00600 005D30C80
                                                   000002004
dbkedDefDump()+1012  CALL???  ksedst()+0           000000000 000000000 000000000
                                                   000000000
ksedmp()+51          CALL???  dbkedDefDump()+0     000000003 000000002 021AF92C0
                                                   000405038
__PGOSF184_ksfdmp()  CALL???  ksedmp()+0           000000000 000000000 000000000
+27                                                27F00000000
dbgexPhaseII()+266   CALL???  __PGOSF184_ksfdmp()  00000000D 0082FAE50 000000000
                              +0                   000000004
dbgexProcessError()  CALL???  dbgexPhaseII()+0     021B00600 021AFCA50 000000201
+1313                                              000000000
dbgeExecuteForError  CALL???  dbgexProcessError()  021B00600 021B07590 000000001
()+55                         +0                   000000000
dbgePostErrorKGE()+  CALL???  dbgeExecuteForError  021AFCA30 021AFCA80 00000002E
1608                          ()+0                 000000005
dbkePostKGE_kgsf()+  CALL???  dbgePostErrorKGE()+  01CFC99D0 021B0E080 000000258
65                            0                    021B0E080
kgeade()+556         CALL???  dbkePostKGE_kgsf()+  000002000 000000000 000000009
                              0                    000000004
kgeriv_int()+105     CALL???  kgeade()+0           3A4F00000003 000C09482
                                                   0FFFFFFFF 000000000
kgeriv()+27          CALL???  kgeriv_int()+0       3A9A024E0 000000000 01CFC9410
                                                   000000000
kgesiv()+102         CALL???  kgeriv()+0           0000008D5 0000008C3 021AFD9A0
                                                   000AFDC73
ksesic7()+125        CALL???  kgesiv()+0           006371F20 000000007 27F912000
                                                   200000004
kcoexam()+248        CALL???  ksesic7()+0          2000007F5 000000000 000C09482
                                                   000000000
kcbtema()+2154       CALL???  kcoexam()+0          27FFC22C8 39E113470 3A940BBB8
                                                   000000000
kcrpap()+355         CALL???  kcbtema()+0          27FFC22C8 28BFC2628 000000000
                                                   021B10200
kcrpdv()+1655        CALL???  kcrpap()+0           021B101A0 000000002 000000004
                                                   000000512
kxfprdp()+1384       CALL???  kcrpdv()+0           3A7AD3098 000000000 00000000C
                                                   00757CF00
opirip()+1396        CALL???  kxfprdp()+0          00000001E 005CDB518 021AFF9E0
                                                   000000000
opidrv()+855         CALL???  opirip()+0           000000032 000000004 021AFFD30
                                                   000000000
sou2o()+52           CALL???  opidrv()+213         000000032 000000004 021AFFD30
                                                   021AFFDB0
opimai_real()+295    CALL???  sou2o()+0            000000000 7FEFD9819B5
                                                   000000000 000000000
opimai()+96          CALL???  opimai_real()+0      000000000 000000000 000000000
                                                   000000000
BackgroundThreadSta  CALL???  opimai()+0           021AFFE98 000000001 000000000
rt()+695                                           000000000
00000000775AF56D     CALL???  BackgroundThreadSta  00A26B7A0 000000000 000000000
                              rt()+0               000000000
0000000077923281     CALL???  00000000775AF560     000000000 000000000 000000000
                                                   000000000
--------------------- Binary Stack Dump ---------------------

查询mos发现During Startup (Open Database) Alert Log Shows ORA-600[2037] and ORA-7445[kcbs_dump_adv_state] [ID 551993.1]和我们这里展示的错误相符,引起该问题的原因主要是因为:The database may crash and fail to open due to undo/redo corruption if you are using distributed transactions.因为使用分布式事务的时候,数据库crash导致undo/redo corruption,从而使得数据库无法正常启动.

故障处理思路
因为通过数据库alert日志可以知道,数据库是在做前滚的时候并发进程失败,设置fast_start_parallel_rollback=false,禁止数据库实例恢复并发,可以恢复依然失败.因为前滚过不去,那就通过设置隐含参数禁止数据库前滚,在open数据库的过程中发现ora-600[2662]错误,推进scn,继续open数据库发现ora-600[4194],通过设置undo管理模式,屏蔽事务,屏蔽回滚段等方法,终于重新open库并重建undo,然后重建库算是完成恢复任务

rman制造坏块,bbed修复坏块

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:rman制造坏块,bbed修复坏块

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

在rman中有隐藏的命令clear,可以标记数据块为corrupt,从而实现数据库坏块试验,本篇blog通过dbv,bbed,table select来验证坏块的出现和修复
创建测试表

SQL>  create table t_xifenfei as
  2   select object_id,object_name from dba_objects where rownum<10;
Table created.
SQL>  select rowid,
  2   dbms_rowid.rowid_relative_fno(rowid) rel_fno,
  3   dbms_rowid.rowid_block_number(rowid) block
  4   from chf.t_xifenfei;
ROWID                 REL_FNO      BLOCK
------------------ ---------- ----------
AAAStAAAEAAAACrAAA          4        171
AAAStAAAEAAAACrAAB          4        171
AAAStAAAEAAAACrAAC          4        171
AAAStAAAEAAAACrAAD          4        171
AAAStAAAEAAAACrAAE          4        171
AAAStAAAEAAAACrAAF          4        171
AAAStAAAEAAAACrAAG          4        171
AAAStAAAEAAAACrAAH          4        171
AAAStAAAEAAAACrAAI          4        171
9 rows selected.
SQL> alter system checkpoint;
System altered.

dbv验证坏块

[oracle@xifenfei ~]$ dbv file=/u01/oracle/oradata/ora11g/users01.dbf
DBVERIFY: Release 11.2.0.3.0 - Production on Sun Jan 20 09:12:16 2013
Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
DBVERIFY - Verification starting : FILE = /u01/oracle/oradata/ora11g/users01.dbf
DBVERIFY - Verification complete
Total Pages Examined         : 7680
Total Pages Processed (Data) : 3776
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 167
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 744
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 2993
Total Pages Marked Corrupt   : 0
Total Pages Influx           : 0
Total Pages Encrypted        : 0
Highest block SCN            : 1006717955 (2956.1006717955)

这里创建了t_xifenfei表,数据存储在file 4 block 171中,现在该block一切正常,本试验就是要通过rman来使得该block corrupt,然后通过bbed来修复

bbed查看kcbh

BBED> set block 171
        BLOCK#          171
BBED> p kcbh
struct kcbh, 20 bytes                       @0
   ub1 type_kcbh                            @0        0x06
   ub1 frmt_kcbh                            @1        0xa2
   ub1 spare1_kcbh                          @2        0x00
   ub1 spare2_kcbh                          @3        0x00
   ub4 rdba_kcbh                            @4        0x010000ab
   ub4 bas_kcbh                             @8        0x3c014bfe
   ub2 wrp_kcbh                             @12       0x0b8c
   ub1 seq_kcbh                             @14       0x02
   ub1 flg_kcbh                             @15       0x04 (KCBHFCKV)
   ub2 chkval_kcbh                          @16       0x6e0c  <--重点关注
   ub2 spare3_kcbh                          @18       0x0000

rman标记坏块

RMAN> BLOCKRECOVER DATAFILE 4 block 171 clear;
Starting recover at 20-JAN-13
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=18 device type=DISK
Finished recover at 20-JAN-13

dbv再次检查坏块

[oracle@xifenfei ~]$ dbv file=/u01/oracle/oradata/ora11g/users01.dbf
DBVERIFY: Release 11.2.0.3.0 - Production on Sun Jan 20 09:53:16 2013
Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
DBVERIFY - Verification starting : FILE = /u01/oracle/oradata/ora11g/users01.dbf
Page 171 is marked corrupt
Corrupt block relative dba: 0x010000ab (file 4, block 171)
Bad check value found during dbv:
Data in bad block:
 type: 6 format: 2 rdba: 0x010000ab
 last change scn: 0x0b8c.3c014bfe seq: 0x2 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x4bfe0602
 check value in block header: 0x6e0c
 computed block checksum: 0xb5bc
DBVERIFY - Verification complete
Total Pages Examined         : 7680
Total Pages Processed (Data) : 3775
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 167
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 744
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 2993
Total Pages Marked Corrupt   : 1
Total Pages Influx           : 1
Total Pages Encrypted        : 0
Highest block SCN            : 1006717955 (2956.1006717955)

尝试查询数据

SQL> select count(*) from t_xifenfei;
select count(*) from t_xifenfei
                     *
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 4, block # 171)
ORA-01110: data file 4: '/u01/oracle/oradata/ora11g/users01.dbf'

bbed验证坏块

BBED> set block 171
        BLOCK#          171
BBED> p kcbh
struct kcbh, 20 bytes                       @0
   ub1 type_kcbh                            @0        0x06
   ub1 frmt_kcbh                            @1        0xa2
   ub1 spare1_kcbh                          @2        0x00
   ub1 spare2_kcbh                          @3        0x00
   ub4 rdba_kcbh                            @4        0x010000ab
   ub4 bas_kcbh                             @8        0x3c014bfe
   ub2 wrp_kcbh                             @12       0x0b8c
   ub1 seq_kcbh                             @14       0x02
   ub1 flg_kcbh                             @15       0x04 (KCBHFCKV)
   ub2 chkval_kcbh                          @16       0x6e0c  <--注意该值未变化
   ub2 spare3_kcbh                          @18       0x0000
BBED> verify
DBVERIFY - Verification starting
FILE = /u01/oracle/oradata/ora11g/users01.dbf
BLOCK = 171
Block 171 is corrupt
Corrupt block relative dba: 0x010000ab (file 0, block 171)
Bad check value found during verification
Data in bad block:
 type: 6 format: 2 rdba: 0x010000ab
 last change scn: 0x0b8c.3c014bfe seq: 0x2 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x4bfe0602
 check value in block header: 0x6e0c
 computed block checksum: 0xb5bc
DBVERIFY - Verification complete
Total Blocks Examined         : 1
Total Blocks Processed (Data) : 0
Total Blocks Failing   (Data) : 0
Total Blocks Processed (Index): 0
Total Blocks Failing   (Index): 0
Total Blocks Empty            : 0
Total Blocks Marked Corrupt   : 1
Total Blocks Influx           : 0
Message 531 not found;  product=RDBMS; facility=BBED

这里通过dbv,bbed,select table都证明rman能够标记block为Corrupt.
这里需要分析:block已经被标记,那证明该块肯定有修改,也就是说chkval_kcbh一定要变化,但是这里没有变化,证明该处异常

bbed修复rman生成坏块

BBED> sum apply
Warning: contents of previous BIFILE will be lost. Proceed? (Y/N) y
Check value for File 0, Block 171:
current = 0xdbb0, required = 0xdbb0
BBED> verify
DBVERIFY - Verification starting
FILE = /u01/oracle/oradata/ora11g/users01.dbf
BLOCK = 171
DBVERIFY - Verification complete
Total Blocks Examined         : 1
Total Blocks Processed (Data) : 1
Total Blocks Failing   (Data) : 0
Total Blocks Processed (Index): 0
Total Blocks Failing   (Index): 0
Total Blocks Empty            : 0
Total Blocks Marked Corrupt   : 0
Total Blocks Influx           : 0
Message 531 not found;  product=RDBMS; facility=BBED

bbed测试坏块已经修复

dbv验证坏块

[oracle@xifenfei ~]$ dbv file=/u01/oracle/oradata/ora11g/users01.dbf
DBVERIFY: Release 11.2.0.3.0 - Production on Sun Jan 20 10:01:46 2013
Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
DBVERIFY - Verification starting : FILE = /u01/oracle/oradata/ora11g/users01.dbf
DBVERIFY - Verification complete
Total Pages Examined         : 7680
Total Pages Processed (Data) : 3776
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 167
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 744
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 2993
Total Pages Marked Corrupt   : 0
Total Pages Influx           : 0
Total Pages Encrypted        : 0
Highest block SCN            : 1006717955 (2956.1006717955)

dbv测试坏块也被修复

查询表验证

SQL> select count(*) from t_xifenfei;
  COUNT(*)
----------
         9

数据也可以正常查询,证明rman的坏块问题通过bbed解决

总结说明
1.我们可以通过rman的clear命令来标记坏块(BLOCKRECOVER DATAFILE file# BLOCK block1#, block2#, block3#… CLEAR 😉
2.我们可以通过bbed的sum apply命令来修复该类型坏块

ORA-600 kghstack_free2异常恢复

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:ORA-600 kghstack_free2异常恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

今天接触一案例ORA-00600[kghstack_free2],恢复起来很简单,顺便记录下,供其他人遇到类似情况参考.

Wed Feb 20 15:05:04 2013
SMON: enabling cache recovery
Wed Feb 20 15:05:05 2013
Successfully onlined Undo Tablespace 1.
Wed Feb 20 15:05:05 2013
SMON: enabling tx recovery
Wed Feb 20 15:05:06 2013
Errors in file f:\oracle\product\10.2.0\admin\cksoltp\bdump\cksoltp_smon_3556.trc:
ORA-00600: internal error code, arguments: [kghstack_free2], [], [], [], [], [], [], []
Wed Feb 20 15:05:06 2013
Database Characterset is AL32UTF8
replication_dependency_tracking turned off (no async multimaster replication found)
Wed Feb 20 15:05:08 2013
Errors in file f:\oracle\product\10.2.0\admin\cksoltp\bdump\cksoltp_smon_3556.trc:
ORA-07445: exception encountered: core dump [ACCESS_VIOLATION] [_kghstack_err+101]
[PC:0x603CC77F] [ADDR:0x59004E04] [UNABLE_TO_READ] []
ORA-00600: internal error code, arguments: [kghstack_free2], [], [], [], [], [], [], []
Starting background process QMNC
QMNC started with pid=18, OS id=3504
Wed Feb 20 15:05:13 2013
Errors in file f:\oracle\product\10.2.0\admin\cksoltp\bdump\cksoltp_pmon_4064.trc:
ORA-00474: SMON process terminated with error

这个错误很明显是因为smon在回滚事务的时候,遇到异常从而出现ORA-00600[kghstack_free2]错误,使得数据库不能被正常open,通过分析trace文件发现回滚段36中有事务未提交,需要回滚,针对这样的数据库恢复方法很简单:1.屏蔽事务回滚,2.屏蔽回滚段强制offline等
1) EVENT=”10513 trace name context forever”
2)_offline_rollback_segments= _SYSSMU36$
数据库启动后,通过分析trace文件,找出来异常对象,然后重建该对象,除掉event/_offline_rollback_segments即可

通过rowid获取segment header坏块数据

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:通过rowid获取segment header坏块数据

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

在上篇(table中各种坏块对select/dml操作影响)中说到如果segment header异常了,不能通过ctas来获得相关数据,在群的讨论中,EZIO说到可以通过rowid方式来获得相关数据,通过测试证明,确实可以通过该方法获得数据,以后遇到此类错误,大家也不必惊慌.
创建测试表

SQL> create table t_xifenfei
  2  as
  3  select * from dba_objects where object_id is not null;
SQL> alter table t_xifenfei
  2  add constraint PK_t_xifenfei primary key (object_id)
  3  ;
Table altered.
SQL> alter system checkpoint;
System altered.
SQL> select count(*) from chf.t_xifenfei;
  COUNT(*)
----------
     74762
SQL> select header_file,header_block from
  2  DBA_SEGments where segment_name='T_XIFENFEI' AND OWNER='CHF';
HEADER_FILE HEADER_BLOCK
----------- ------------
          4          170

dump block

alter system dump datafile 4 block 170;
Dump of buffer cache at level 4 for tsn=4 rdba=16777386
Block dump from disk:
buffer tsn: 4 rdba: 0x010000aa (4/170)
scn: 0x0b8c.3c0092e4 seq: 0x01 flg: 0x04 tail: 0x91e42301
frmt: 0x02 chkval: 0xa531 type: 0x23=PAGETABLE SEGMENT HEADER

通过header_block和dump block确定block 170即为PAGETABLE SEGMENT HEADER

bbed制造SEGMENT HEADER坏块

BBED> set block 170
        BLOCK#          170
BBED> set offset 8188
        OFFSET          8188
BBED> d
 File: /u01/oracle/oradata/ora11g/users01.dbf (0)
 Block: 170              Offsets: 8188 to 8191           Dba:0x00000000
------------------------------------------------------------------------
 0123e492
 <32 bytes per line>
BBED> m /x 0123e491
Warning: contents of previous BIFILE will be lost. Proceed? (Y/N) y
 File: /u01/oracle/oradata/ora11g/users01.dbf (0)
 Block: 170              Offsets: 8188 to 8191           Dba:0x00000000
------------------------------------------------------------------------
 0123e491
 <32 bytes per line>
BBED> sum apply
Check value for File 0, Block 170:
current = 0xa531, required = 0xa531
BBED> verify
DBVERIFY - Verification starting
FILE = /u01/oracle/oradata/ora11g/users01.dbf
BLOCK = 170
Block 170 is corrupt
Corrupt block relative dba: 0x010000aa (file 0, block 170)
Fractured block found during verification
Data in bad block:
 type: 35 format: 2 rdba: 0x010000aa
 last change scn: 0x0b8c.3c0092e4 seq: 0x1 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x91e42301
 check value in block header: 0xa531
 computed block checksum: 0x0
DBVERIFY - Verification complete
Total Blocks Examined         : 1
Total Blocks Processed (Data) : 0
Total Blocks Failing   (Data) : 0
Total Blocks Processed (Index): 0
Total Blocks Failing   (Index): 0
Total Blocks Empty            : 0
Total Blocks Marked Corrupt   : 1
Total Blocks Influx           : 2
Message 531 not found;  product=RDBMS; facility=BBED

测试segment header坏块后select操作

SQL> select * from chf.t_xifenfei;
select * from chf.t_xifenfei
                         *
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 4, block # 170)
ORA-01110: data file 4: '/u01/oracle/oradata/ora11g/users01.dbf'
SQL> select /*+index(t PK_T_XIFENFEI)*/  count(rowid) from chf.t_xifenfei t;
COUNT(ROWID)
------------
       74762

基于rowid获取segment header 坏块对象数据

SQL> create table chf.bad_rows (table_name varchar2(60),
   2 row_id rowid, oracle_error_code number);
Table created.
SQL> DECLARE
  2   TYPE RowIDTab IS TABLE OF ROWID INDEX BY BINARY_INTEGER;
  3   CURSOR c1 IS  select /*+index(t PK_T_XIFENFEI)*/ rowid from chf.t_xifenfei t;
  4   r RowIDTab;
  5   rows  NATURAL := 20000;
  6   bad_rows number := 0 ;
  7   errors number;
  8   error_code number;
  9   myrowid rowid;
 10  BEGIN
 11   OPEN c1;
 12   LOOP
 13     FETCH  c1 BULK COLLECT INTO r LIMIT rows;
 14     EXIT WHEN r.count=0;
 15     BEGIN
 16      FORALL i IN r.FIRST..r.LAST SAVE EXCEPTIONS
 17       insert into chf.t_xifenfei_new
 18       select /*+ ROWID(A) */ *
 19       from chf.t_xifenfei A where rowid = r(i);
 20     EXCEPTION
 21     when OTHERS then
 22      BEGIN
 23       errors := SQL%BULK_EXCEPTIONS.COUNT;
 24       FOR err1 IN 1..errors LOOP
 25           error_code := SQL%BULK_EXCEPTIONS(err1).ERROR_CODE;
 26           myrowid := r(SQL%BULK_EXCEPTIONS(err1).ERROR_INDEX);
 27           bad_rows := bad_rows + 1;
 28           insert into chf.bad_rows values('chf.t_xifenfei',myrowid, error_code);
 29       END LOOP;
 30       END;
 31     END;
 32    commit;
 33   END LOOP;
 34   commit;
 35   CLOSE c1;
 36   dbms_output.put_line('Total Bad Rows: '||bad_rows);
 37  END;
 38  /
Total Bad Rows: 0
PL/SQL procedure successfully completed.
SQL> select count(*) from chf.t_xifenfei_new;
  COUNT(*)
----------
     74762

通过上面pl/sql,基于rowid成功获得segment header 异常对象中的所有数据记录.如果没有主键的表出现该问题,可以参考:使用plsql抢救数据

table中各种类型block坏块是否能被跳过

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:table中各种类型block坏块是否能被跳过

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

在table遇到的各种坏块中,大部分情况,我们都可以通过设置event 10231或者dbms_repair来跳过坏块,抢救其他数据;但是在部分情况下,我们设置了他们依然不能跳过坏块,数据库依然报ORA-01578,本文测试了table中各种类型的block,证明在哪些blog出现异常之后不能被跳过.
如果是事务数据块出现坏块可以通过dbms_repair.skip_corrupt_blocks来标记坏块(也可以使用event 10231)来实现;对于BITMAP BLOCK如果出现坏块不会对于读取数据无影响(至于其他操作,请见后续blog);SEGMENT HEADER如果出现坏块,无法通过跳过坏块来实现获取其他数据(正常block)
创建测试表

SQL> select * from v$version;
BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - Production
PL/SQL Release 11.2.0.3.0 - Production
CORE    11.2.0.3.0      Production
TNS for Linux: Version 11.2.0.3.0 - Production
NLSRTL Version 11.2.0.3.0 - Production
SQL> create table t_xifenfei
  2  tablespace users
  3  as
  4  select * from dba_objects;
Table created.
SQL>  select count(*) from chf.t_xifenfei;
  COUNT(*)
----------
     74663

查询相关block信息

SQL> select SEGMENT_NAME,HEADER_FILE,HEADER_BLOCK,blocks,extents from DBA_SEGMENTS
  2  WHERE OWNER='CHF' AND SEGMENT_NAME='T_XIFENFEI';
SEGMENT_NAME    HEADER_FILE HEADER_BLOCK     BLOCKS    EXTENTS
--------------- ----------- ------------ ---------- ----------
T_XIFENFEI                4          378       1152         24
SQL>   select
  2    dbms_rowid.rowid_relative_fno(rowid)rel_fno,
  3    max(dbms_rowid.rowid_block_number(rowid)) max_block,
  4    min(dbms_rowid.rowid_block_number(rowid)) min_block
  5    from chf.t_xifenfei
  6    group by dbms_rowid.rowid_relative_fno(rowid);
   REL_FNO  MAX_BLOCK  MIN_BLOCK
---------- ---------- ----------
         4       1728        379
SQL> select EXTENT_ID,FILE_ID,BLOCK_ID,blocks from dba_extents where owner='CHF'
   2  AND SEGMENT_NAME='T_XIFENFEI';
 EXTENT_ID    FILE_ID   BLOCK_ID     BLOCKS
---------- ---------- ---------- ----------
         0          4        376          8
         1          4        640          8
         2          4        648          8
         3          4        656          8
         4          4        664          8
         5          4        672          8
         6          4        680          8
         7          4        688          8
         8          4        696          8
         9          4        704          8
        10          4        712          8
        11          4        720          8
        12          4        728          8
        13          4        736          8
        14          4        744          8
        15          4        752          8
        16          4        768        128
        17          4        896        128
        18          4       1024        128
        19          4       1152        128
        20          4       1280        128
        21          4       1408        128
        22          4       1536        128
        23          4       1664        128

通过这里可以知道:真正的存储数据是从block 379开始,至于block 376、377、378是什么,使用dump block分析

验证block类型

SQL> alter system dump datafile 4 block 376;
System altered.
SQL> alter system dump datafile 4 block 377;
System altered.
SQL> alter system dump datafile 4 block 378;
System altered.
SQL> alter system dump datafile 4 block 379;
System altered.
--该block是另外extent的开始,所以也尝试分析是否有特殊之处
SQL> alter system dump datafile 4 block 640;
System altered.
--dump 文件header信息
Start dump data blocks tsn: 4 file#:4 minblk 376 maxblk 376
Block dump from cache:
Dump of buffer cache at level 4 for tsn=4 rdba=16777592
Block dump from disk:
buffer tsn: 4 rdba: 0x01000178 (4/376)
scn: 0x0b8c.3bfc6517 seq: 0x04 flg: 0x04 tail: 0x65172004
frmt: 0x02 chkval: 0xc8b3 type: 0x20=FIRST LEVEL BITMAP BLOCK
Start dump data blocks tsn: 4 file#:4 minblk 377 maxblk 377
Block dump from cache:
Dump of buffer cache at level 4 for tsn=4 rdba=16777593
Block dump from disk:
buffer tsn: 4 rdba: 0x01000179 (4/377)
scn: 0x0b8c.3bfc6517 seq: 0x18 flg: 0x04 tail: 0x65172118
frmt: 0x02 chkval: 0x9e8c type: 0x21=SECOND LEVEL BITMAP BLOCK
Start dump data blocks tsn: 4 file#:4 minblk 378 maxblk 378
Block dump from cache:
Dump of buffer cache at level 4 for tsn=4 rdba=16777594
BH (0x2a7f7f0c) file#: 4 rdba: 0x0100017a (4/378) class: 4 ba: 0x2a742000
  set: 5 pool: 3 bsz: 8192 bsi: 0 sflg: 1 pwc: 0,0
  dbwrid: 0 obj: 76372 objn: 76372 tsn: 4 afn: 4 hint: f
  hash: [0x3150a748,0x3150a748] lru: [0x2a7f8094,0x2a7f7ee4]
  lru-flags: hot_buffer
  ckptq: [NULL] fileq: [NULL] objq: [0x2f72dc34,0x2f72dc34] objaq: [0x2f72dc2c,0x2f72dc2c]
  st: XCURRENT md: NULL fpin: 'ktewh25: kteinicnt' tch: 1
  flags:
  LRBA: [0x0.0.0] LSCN: [0x0.0] HSCN: [0xffff.ffffffff] HSUB: [65535]
Block dump from disk:
buffer tsn: 4 rdba: 0x0100017a (4/378)
scn: 0x0b8c.3bfc651b seq: 0x01 flg: 0x04 tail: 0x651b2301
frmt: 0x02 chkval: 0xb2ae type: 0x23=PAGETABLE SEGMENT HEADER
Start dump data blocks tsn: 4 file#:4 minblk 379 maxblk 379
Block dump from cache:
Dump of buffer cache at level 4 for tsn=4 rdba=16777595
Block dump from disk:
buffer tsn: 4 rdba: 0x0100017b (4/379)
scn: 0x0b8c.3bfc6494 seq: 0x01 flg: 0x04 tail: 0x64940601
frmt: 0x02 chkval: 0x0567 type: 0x06=trans data
Start dump data blocks tsn: 4 file#:4 minblk 640 maxblk 640
Block dump from cache:
Dump of buffer cache at level 4 for tsn=4 rdba=16777856
Block dump from disk:
buffer tsn: 4 rdba: 0x01000280 (4/640)
scn: 0x0b8c.3bfc6496 seq: 0x01 flg: 0x04 tail: 0x64960601
frmt: 0x02 chkval: 0x0efe type: 0x06=trans data

这里可以知道:
1.block 376、377为BITMAP BLOCK
2.block 378为SEGMENT HEADER(和dba_segments视图中一致)
3.除extent 0中有特殊(含BITMAP BLOCK和SEGMENT HEADER)block,其他extent只包含事务数据

测试block 640

--block 640包含条数
SQL> select   count(rowid)
  2    from chf.t_xifenfei
  3     where dbms_rowid.rowid_block_number(rowid)=640
  4  and dbms_rowid.rowid_relative_fno(rowid)=4;
COUNT(ROWID)
------------
          79
--bbed修改tailchk
BBED> set filename '/u01/oracle/oradata/ora11g/users01.dbf'
        FILENAME        /u01/oracle/oradata/ora11g/users01.dbf
BBED> set block 640
        BLOCK#          640
BBED> set mode edit
        MODE            Edit
BBED> p tailchk
ub4 tailchk                                 @8188     0x64960601
BBED> m /x 64960602
Warning: contents of previous BIFILE will be lost. Proceed? (Y/N) y
 File: /u01/oracle/oradata/ora11g/users01.dbf (0)
 Block: 640              Offsets: 8188 to 8191           Dba:0x00000000
------------------------------------------------------------------------
 64960602
 <32 bytes per line>
BBED> sum apply
Check value for File 0, Block 640:
current = 0xf80b, required = 0xf80b
BBED> verify
DBVERIFY - Verification starting
FILE = /u01/oracle/oradata/ora11g/users01.dbf
BLOCK = 640
Block 640 is corrupt
Corrupt block relative dba: 0x01000280 (file 0, block 640)
Fractured block found during verification
Data in bad block:
 type: 6 format: 2 rdba: 0x01000280
 last change scn: 0x0b8c.3bfc6496 seq: 0x1 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x02069664
 check value in block header: 0xf80b
 computed block checksum: 0x0
DBVERIFY - Verification complete
Total Blocks Examined         : 1
Total Blocks Processed (Data) : 0
Total Blocks Failing   (Data) : 0
Total Blocks Processed (Index): 0
Total Blocks Failing   (Index): 0
Total Blocks Empty            : 0
Total Blocks Marked Corrupt   : 1
Total Blocks Influx           : 2
Message 531 not found;  product=RDBMS; facility=BBED
--查询坏块
SQL> select count(*) from chf.t_xifenfei;
select count(*) from chf.t_xifenfei
                         *
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 4, block # 640)
ORA-01110: data file 4: '/u01/oracle/oradata/ora11g/users01.dbf'
--跳过坏块
SQL> exec dbms_repair.skip_corrupt_blocks('CHF','T_XIFENFEI');
PL/SQL procedure successfully completed.
SQL>  select skip_corrupt from dba_tables where table_name='T_XIFENFEI' AND OWNER='CHF';
SKIP_COR
--------
ENABLED
SQL>  select count(*) from chf.t_xifenfei;
  COUNT(*)
----------
     74584
--修复坏块
BBED> m /x 01069664
Warning: contents of previous BIFILE will be lost. Proceed? (Y/N) y
 File: /u01/oracle/oradata/ora11g/users01.dbf (0)
 Block: 640              Offsets: 8188 to 8191           Dba:0x00000000
------------------------------------------------------------------------
 01069664
 <32 bytes per line>
BBED> p tailchk
ub4 tailchk                                 @8188     0x64960601
BBED> sum apply
Check value for File 0, Block 640:
current = 0x0efe, required = 0x0efe
BBED> verify
DBVERIFY - Verification starting
FILE = /u01/oracle/oradata/ora11g/users01.dbf
BLOCK = 640
DBVERIFY - Verification complete
Total Blocks Examined         : 1
Total Blocks Processed (Data) : 1
Total Blocks Failing   (Data) : 0
Total Blocks Processed (Index): 0
Total Blocks Failing   (Index): 0
Total Blocks Empty            : 0
Total Blocks Marked Corrupt   : 0
Total Blocks Influx           : 0
Message 531 not found;  product=RDBMS; facility=BBED
--除掉标记表坏块
SQL> BEGIN
  2  DBMS_REPAIR.SKIP_CORRUPT_BLOCKS (
  3  SCHEMA_NAME => 'CHF',
  4  OBJECT_NAME => 'T_XIFENFEI',
  5  OBJECT_TYPE => dbms_repair.table_object,
  6  FLAGS => dbms_repair.NOSKIP_FLAG);
  7  END;
  8  /
PL/SQL procedure successfully completed.
SQL> select skip_corrupt from dba_tables where table_name='T_XIFENFEI' AND OWNER='CHF';
SKIP_COR
--------
DISABLED
--查询表记录正常
SQL>  select count(*) from chf.t_xifenfei;
  COUNT(*)
----------
     74663

在后续的操作中,也是按照类似步骤操作,考虑到篇幅有限,部分过程不再贴出来

测试block 379

SQL>  select count(*) from chf.t_xifenfei;
 select count(*) from chf.t_xifenfei
                          *
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 4, block # 379)
ORA-01110: data file 4: '/u01/oracle/oradata/ora11g/users01.dbf'
SQL> exec dbms_repair.skip_corrupt_blocks('CHF','T_XIFENFEI');
PL/SQL procedure successfully completed.
SQL> select count(*) from chf.t_xifenfei;
  COUNT(*)
----------
     74575

测试block 378

BBED> set block 378
        BLOCK#          378
--segment header 不支持bbed查看结构
BBED> p tailchk
BBED-00400: invalid blocktype (35)
BBED> map
 File: /u01/oracle/oradata/ora11g/users01.dbf (0)
 Block: 378                                   Dba:0x00000000
------------------------------------------------------------
BBED-00400: invalid blocktype (35)
BBED> set offset 8188
        OFFSET          8188
BBED> d
 File: /u01/oracle/oradata/ora11g/users01.dbf (0)
 Block: 378              Offsets: 8188 to 8191           Dba:0x00000000
------------------------------------------------------------------------
 01231b65
 <32 bytes per line>
BBED> m /x 651b2302
Warning: contents of previous BIFILE will be lost. Proceed? (Y/N) y
 File: /u01/oracle/oradata/ora11g/users01.dbf (0)
 Block: 378              Offsets: 8188 to 8191           Dba:0x00000000
------------------------------------------------------------------------
 651b2302
 <32 bytes per line>
BBED> sum apply
Check value for File 0, Block 378:
current = 0xedf2, required = 0xedf2
--验证坏块
BBED> verify
DBVERIFY - Verification starting
FILE = /u01/oracle/oradata/ora11g/users01.dbf
BLOCK = 378
Block 378 is corrupt
Corrupt block relative dba: 0x0100017a (file 0, block 378)
Fractured block found during verification
Data in bad block:
 type: 35 format: 2 rdba: 0x0100017a
 last change scn: 0x0b8c.3bfc651b seq: 0x1 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x02231b65
 check value in block header: 0xedf2
 computed block checksum: 0x0
DBVERIFY - Verification complete
Total Blocks Examined         : 1
Total Blocks Processed (Data) : 0
Total Blocks Failing   (Data) : 0
Total Blocks Processed (Index): 0
Total Blocks Failing   (Index): 0
Total Blocks Empty            : 0
Total Blocks Marked Corrupt   : 1
Total Blocks Influx           : 2
Message 531 not found;  product=RDBMS; facility=BBED
SQL>  select count(*) from chf.t_xifenfei;
 select count(*) from chf.t_xifenfei
                          *
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 4, block # 378)
ORA-01110: data file 4: '/u01/oracle/oradata/ora11g/users01.dbf'
--标记跳过坏块
SQL> exec dbms_repair.skip_corrupt_blocks('CHF','T_XIFENFEI');
PL/SQL procedure successfully completed.
--查询依然失败
SQL> select count(*) from chf.t_xifenfei;
select count(*) from chf.t_xifenfei
                         *
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 4, block # 378)
ORA-01110: data file 4: '/u01/oracle/oradata/ora11g/users01.dbf'

测试block 377

BBED> m /x 18211766
Warning: contents of previous BIFILE will be lost. Proceed? (Y/N) y
 File: /u01/oracle/oradata/ora11g/users01.dbf (0)
 Block: 377              Offsets: 8188 to 8191           Dba:0x00000000
------------------------------------------------------------------------
 18211766
 <32 bytes per line>
BBED> sum apply
Check value for File 0, Block 377:
current = 0x9d8c, required = 0x9d8c
--bbed验证为坏块
BBED> verify
DBVERIFY - Verification starting
FILE = /u01/oracle/oradata/ora11g/users01.dbf
BLOCK = 377
Block 377 is corrupt
Corrupt block relative dba: 0x01000179 (file 0, block 377)
Fractured block found during verification
Data in bad block:
 type: 33 format: 2 rdba: 0x01000179
 last change scn: 0x0b8c.3bfc6517 seq: 0x18 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x66172118
 check value in block header: 0x9d8c
 computed block checksum: 0x0
DBVERIFY - Verification complete
Total Blocks Examined         : 1
Total Blocks Processed (Data) : 0
Total Blocks Failing   (Data) : 0
Total Blocks Processed (Index): 0
Total Blocks Failing   (Index): 0
Total Blocks Empty            : 0
Total Blocks Marked Corrupt   : 1
Total Blocks Influx           : 2
Message 531 not found;  product=RDBMS; facility=BBED
--dbv验证为坏块
[oracle@xifenfei ~]$ dbv file=/u01/oracle/oradata/ora11g/users01.dbf
DBVERIFY: Release 11.2.0.3.0 - Production on Fri Jan 18 03:32:18 2013
Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
DBVERIFY - Verification starting : FILE = /u01/oracle/oradata/ora11g/users01.dbf
Page 377 is influx - most likely media corrupt
Corrupt block relative dba: 0x01000179 (file 4, block 377)
Fractured block found during dbv:
Data in bad block:
 type: 33 format: 2 rdba: 0x01000179
 last change scn: 0x0b8c.3bfc6517 seq: 0x18 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x66172118
 check value in block header: 0x9d8c
 computed block checksum: 0x0
DBVERIFY - Verification complete
Total Pages Examined         : 2560
Total Pages Processed (Data) : 1434
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 10
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 213
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 902
Total Pages Marked Corrupt   : 1
Total Pages Influx           : 1
Total Pages Encrypted        : 0
Highest block SCN            : 1006486374 (2956.1006486374)
--查询表记录
SQL> select skip_corrupt from dba_tables where table_name='T_XIFENFEI' AND OWNER='CHF';
SKIP_COR
--------
DISABLED
SQL> select count(*) from chf.t_xifenfei;
  COUNT(*)
----------
     74663

测试block 376

BBED> m /x 04201766
 File: /u01/oracle/oradata/ora11g/users01.dbf (0)
 Block: 376              Offsets: 8188 to 8191           Dba:0x00000000
------------------------------------------------------------------------
 04201766
 <32 bytes per line>
BBED> sum apply
Check value for File 0, Block 376:
current = 0xcbb3, required = 0xcbb3
BBED> verify
DBVERIFY - Verification starting
FILE = /u01/oracle/oradata/ora11g/users01.dbf
BLOCK = 376
Block 376 is corrupt
Corrupt block relative dba: 0x01000178 (file 0, block 376)
Fractured block found during verification
Data in bad block:
 type: 32 format: 2 rdba: 0x01000178
 last change scn: 0x0b8c.3bfc6517 seq: 0x4 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x66172004
 check value in block header: 0xcbb3
 computed block checksum: 0x0
DBVERIFY - Verification complete
Total Blocks Examined         : 1
Total Blocks Processed (Data) : 0
Total Blocks Failing   (Data) : 0
Total Blocks Processed (Index): 0
Total Blocks Failing   (Index): 0
Total Blocks Empty            : 0
Total Blocks Marked Corrupt   : 1
Total Blocks Influx           : 2
Message 531 not found;  product=RDBMS; facility=BBED
SQL>  select count(*) from chf.t_xifenfei;
  COUNT(*)
----------
     74663

通过测试证明,如果是事务数据块出现坏块可以通过dbms_repair.skip_corrupt_blocks来标记坏块(也可以使用event 10231)来实现;对于BITMAP BLOCK如果出现坏块不会对于读取数据无影响(至于其他操作,请见table中各种坏块对select/dml操作影响);SEGMENT HEADER如果出现坏块,无法通过跳过坏块来实现获取其他数据(正常block)

recover遇到坏块处理本质探讨

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:recover遇到坏块处理本质探讨

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

如果在还原出来的数据文件中有坏块,而归档日志和联机日志是正常的,那么在应用日志恢复过程中,会出现什么情况,这里通过一个简单的测试给予其中一种情况的说明
创建测试表

SQL> select * from v$version;
BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - Prod
PL/SQL Release 10.2.0.4.0 - Production
CORE    10.2.0.4.0      Production
TNS for Linux: Version 10.2.0.4.0 - Production
NLSRTL Version 10.2.0.4.0 - Production
SQL> create table t_xifenfei(object_id,object_name) tablespace xifenfei
  2  as
  3  select object_id,object_name from dba_objects
  4  where rownum<11;
Table created.
SQL> col object_name for a30
SQL> select   object_id,object_name,
  2   dbms_rowid.rowid_relative_fno(rowid)rel_fno,
  3  dbms_rowid.rowid_block_number(rowid)blockno
  4   from chf.t_xifenfei;
 OBJECT_ID OBJECT_NAME                       REL_FNO    BLOCKNO
---------- ------------------------------ ---------- ----------
        20 ICOL$                                   5         12
        44 I_USER1                                 5         12
        28 CON$                                    5         12
        15 UNDO$                                   5         12
        29 C_COBJ#                                 5         12
         3 I_OBJ#                                  5         12
        25 PROXY_ROLE_DATA$                        5         12
        39 I_IND1                                  5         12
        51 I_CDEF2                                 5         12
        26 I_PROXY_ROLE_DATA$_1                    5         12
10 rows selected.
SQL> select name from v$datafile where file#=5;
NAME
--------------------------------------------------------------
/u01/oracle/oradata/XFF/xifenfei01.dbf
SQL> update t_xifenfei set object_name='WWW.XIFENFEI.COM';
10 rows updated.
SQL> commit;
Commit complete.
SQL> create table t_xifenfei_new(object_id,object_name) tablespace xifenfei
  2  as
  3  select object_id,object_name from dba_objects
  4  where rownum<11;
Table created.
SQL> conn / as sysdba
Connected.
SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.

备份数据文件

[oracle@xifenfei XFF]$ cp xifenfei01.dbf  ../tmp/
[oracle@xifenfei XFF]$ ll ../tmp/xifenfei01.dbf
-rw-r----- 1 oracle oinstall 10493952 Sep 28 19:05 ../tmp/xifenfei01.dbf
[oracle@xifenfei XFF]$ date
Fri Sep 28 19:05:42 CST 2012

bbed破坏备份文件

[oracle@xifenfei XFF]$ bbed password=blockedit
BBED: Release 2.0.0.0.0 - Limited Production on Fri Sep 28 19:05:59 2012
Copyright (c) 1982, 2007, Oracle.  All rights reserved.
************* !!! For Oracle Internal Use only !!! ***************
BBED> set filename '/u01/oracle/oradata/tmp/xifenfei01.dbf'
        FILENAME        /u01/oracle/oradata/tmp/xifenfei01.dbf
BBED> set block 12
        BLOCK#          12
BBED> set mode edit
        MODE            Edit
BBED> map
 File: /u01/oracle/oradata/tmp/xifenfei01.dbf (0)
 Block: 12                                    Dba:0x00000000
------------------------------------------------------------
 KTB Data Block (Table/Cluster)
 struct kcbh, 20 bytes                      @0
 struct ktbbh, 96 bytes                     @20
 struct kdbh, 14 bytes                      @124
 struct kdbt[1], 4 bytes                    @138
 sb2 kdbr[10]                               @142
 ub1 freespace[7666]                        @162
 ub1 rowdata[360]                           @7828
 ub4 tailchk                                @8188
BBED> p kcbh
struct kcbh, 20 bytes                       @0
   ub1 type_kcbh                            @0        0x06
   ub1 frmt_kcbh                            @1        0xa2
   ub1 spare1_kcbh                          @2        0x00
   ub1 spare2_kcbh                          @3        0x00
   ub4 rdba_kcbh                            @4        0x0140000c
   ub4 bas_kcbh                             @8        0x0004d7b0
   ub2 wrp_kcbh                             @12       0x000a
   ub1 seq_kcbh                             @14       0x01
   ub1 flg_kcbh                             @15       0x06 (KCBHFDLC, KCBHFCKV)
   ub2 chkval_kcbh                          @16       0xe573
   ub2 spare3_kcbh                          @18       0x0000
BBED> d offset 8188
 File: /u01/oracle/oradata/tmp/xifenfei01.dbf (0)
 Block: 12               Offsets: 8188 to 8191           Dba:0x00000000
------------------------------------------------------------------------
 0106b0d7
 <32 bytes per line>
BBED> m /x 11
Warning: contents of previous BIFILE will be lost. Proceed? (Y/N) y
 File: /u01/oracle/oradata/tmp/xifenfei01.dbf (0)
 Block: 12               Offsets: 8188 to 8191           Dba:0x00000000
------------------------------------------------------------------------
 1106b0d7
 <32 bytes per line>
BBED> sum apply
Check value for File 0, Block 12:
current = 0xe563, required = 0xe563
BBED> verify
DBVERIFY - Verification starting
FILE = /u01/oracle/oradata/tmp/xifenfei01.dbf
BLOCK = 12
Block 12 is corrupt
Corrupt block relative dba: 0x0140000c (file 0, block 12)
Fractured block found during verification
Data in bad block:
 type: 6 format: 2 rdba: 0x0140000c
 last change scn: 0x000a.0004d7b0 seq: 0x1 flg: 0x06
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0xd7b00611
 check value in block header: 0xe563
 computed block checksum: 0x0
DBVERIFY - Verification complete
Total Blocks Examined         : 1
Total Blocks Processed (Data) : 0
Total Blocks Failing   (Data) : 0
Total Blocks Processed (Index): 0
Total Blocks Failing   (Index): 0
Total Blocks Empty            : 0
Total Blocks Marked Corrupt   : 1
Total Blocks Influx           : 2

修改数据库记录

SQL> startup
ORACLE instance started.
Total System Global Area  318767104 bytes
Fixed Size                  1267236 bytes
Variable Size             109054428 bytes
Database Buffers          201326592 bytes
Redo Buffers                7118848 bytes
Database mounted.
Database opened.
SQL> conn chf/xifenfei
Connected.
SQL> update t_xifenfei set object_name='惜分飞';
10 rows updated.
SQL> update t_xifenfei_new set object_name='惜分飞';
10 rows updated.
SQL> commit;
Commit complete.
SQL> alter system switch logfile;
System altered.
SQL> alter system switch logfile;
System altered.
SQL> alter system switch logfile;
System altered.
SQL> conn / as sysdba
Connected.
SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.

利用备份数据文件恢复数据库

[oracle@xifenfei XFF]$ cp xifenfei01.dbf xifenfei01.dbf_bak
[oracle@xifenfei XFF]$ cp ../tmp/xifenfei01.dbf  xifenfei01.dbf
[oracle@xifenfei XFF]$ sqlplus / as sysdba
SQL*Plus: Release 10.2.0.4.0 - Production on Fri Sep 28 19:13:59 2012
Copyright (c) 1982, 2007, Oracle.  All Rights Reserved.
Connected to an idle instance.
SQL> startup
ORACLE instance started.
Total System Global Area  318767104 bytes
Fixed Size                  1267236 bytes
Variable Size             109054428 bytes
Database Buffers          201326592 bytes
Redo Buffers                7118848 bytes
Database mounted.
ORA-01113: file 5 needs media recovery
ORA-01110: data file 5: '/u01/oracle/oradata/XFF/xifenfei01.dbf'
--提示数据需要恢复
SQL> recover datafile 5;
ORA-00279: change 42949990720 generated at 09/28/2012 19:04:10 needed for
thread 1
ORA-00289: suggestion : /u01/oracle/oradata/XFF/archivelog/1_24_792679299.dbf
ORA-00280: change 42949990720 for thread 1 is in sequence #24
Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
auto
Log applied.
Media recovery complete.
SQL> alter database open;
Database altered.
--利用被破坏的数据文件+归档日志恢复数据库正常
SQL> col object_name for a30
SQL> select   object_id,object_name from t_xifenfei;
select   object_id,object_name from t_xifenfei
                                    *
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 5, block # 12)
ORA-01110: data file 5: '/u01/oracle/oradata/XFF/xifenfei01.dbf'
--提示被破坏的数据块,查询不能完成
--证明坏块之外的数据块还是被正常应用日志
SQL>  select   object_id,object_name from t_xifenfei_new;
 OBJECT_ID OBJECT_NAME
---------- ------------------------------
        20 惜分飞
        44 惜分飞
        28 惜分飞
        15 惜分飞
        29 惜分飞
         3 惜分飞
        25 惜分飞
        39 惜分飞
        51 惜分飞
        26 惜分飞
10 rows selected.
SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.

dbv检查坏块

[oracle@xifenfei XFF]$ dbv file=xifenfei01.dbf
DBVERIFY: Release 10.2.0.4.0 - Production on Fri Sep 28 19:14:52 2012
Copyright (c) 1982, 2007, Oracle.  All rights reserved.
DBVERIFY - Verification starting : FILE = xifenfei01.dbf
DBV-00200: Block, DBA 20971532, already marked corrupt
--这里可以看出来,该数据块已经被标志为坏块
DBVERIFY - Verification complete
Total Pages Examined         : 1280
Total Pages Processed (Data) : 2
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 0
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 14
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 1264
Total Pages Marked Corrupt   : 1
Total Pages Influx           : 0
Highest block SCN            : 318700 (10.318700)

查看恢复过程alert日志

Fri Sep 28 19:14:06 2012
Database mounted in Exclusive Mode
Completed: ALTER DATABASE   MOUNT
Fri Sep 28 19:14:06 2012
ALTER DATABASE OPEN
ORA-1113 signalled during: ALTER DATABASE OPEN...
Fri Sep 28 19:14:11 2012
ALTER DATABASE RECOVER  datafile 5
Media Recovery Start
 parallel recovery started with 2 processes
ORA-279 signalled during: ALTER DATABASE RECOVER  datafile 5  ...
Fri Sep 28 19:14:16 2012
ALTER DATABASE RECOVER    CONTINUE DEFAULT
Fri Sep 28 19:14:16 2012
--恢复数据库的时候,发现坏块
Media Recovery Log /u01/oracle/oradata/XFF/archivelog/1_24_792679299.dbf
Fri Sep 28 19:14:16 2012
Hex dump of (file 5, block 12) in trace file /u01/oracle/admin/XFF/bdump/xff_p001_23011.trc
Corrupt block relative dba: 0x0140000c (file 5, block 12)
Fractured block found during media recovery
Data in bad block:
 type: 6 format: 2 rdba: 0x0140000c
 last change scn: 0x000a.0004d7b0 seq: 0x1 flg: 0x06
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0xd7b00611
 check value in block header: 0xe563
 computed block checksum: 0x0
Reread of rdba: 0x0140000c (file 5, block 12) found same corrupted data
--继续恢复
Fri Sep 28 19:14:16 2012
Recovery of Online Redo Log: Thread 1 Group 1 Seq 25 Reading mem 0
  Mem# 0: /u01/oracle/oradata/XFF/redo01.log
Fri Sep 28 19:14:16 2012
Recovery of Online Redo Log: Thread 1 Group 2 Seq 26 Reading mem 0
  Mem# 0: /u01/oracle/oradata/XFF/redo02.log
Fri Sep 28 19:14:16 2012
Recovery of Online Redo Log: Thread 1 Group 3 Seq 27 Reading mem 0
  Mem# 0: /u01/oracle/oradata/XFF/redo03.log
Fri Sep 28 19:14:16 2012
Media Recovery Complete (XFF)
Completed: ALTER DATABASE RECOVER    CONTINUE DEFAULT
Fri Sep 28 19:14:31 2012
alter database open

bbed查看修改相关信息

BBED> set filename '/u01/oracle/oradata/XFF/xifenfei01.dbf'
        FILENAME        /u01/oracle/oradata/XFF/xifenfei01.dbf
BBED> set block 12
        BLOCK#          12
BBED> map
 File: /u01/oracle/oradata/XFF/xifenfei01.dbf (0)
 Block: 12                                    Dba:0x00000000
------------------------------------------------------------
 KTB Data Block (Table/Cluster)
 struct kcbh, 20 bytes                      @0
 struct ktbbh, 96 bytes                     @20
 struct kdbh, 14 bytes                      @124
 struct kdbt[1], 4 bytes                    @138
 sb2 kdbr[10]                               @142
 ub1 freespace[7666]                        @162
 ub1 rowdata[360]                           @7828
 ub4 tailchk                                @8188
BBED> p kcbh
struct kcbh, 20 bytes                       @0
   ub1 type_kcbh                            @0        0x06
   ub1 frmt_kcbh                            @1        0xa2
   ub1 spare1_kcbh                          @2        0x00
   ub1 spare2_kcbh                          @3        0x00
   ub4 rdba_kcbh                            @4        0x0140000c
   ub4 bas_kcbh                             @8        0x00000000
   ub2 wrp_kcbh                             @12       0x0000
   ub1 seq_kcbh                             @14       0xff   <--因为被标记为坏块,所以为ff
   ub1 flg_kcbh                             @15       0x04 (KCBHFCKV)
   ub2 chkval_kcbh                          @16       0xe77d
   ub2 spare3_kcbh                          @18       0x0000
--查看数据块中记录
BBED> p *kdbr[5]
rowdata[69]
-----------
ub1 rowdata[69]                             @7897     0x2c
BBED> x /rnc
rowdata[69]                                 @7897
-----------
flag@7897: 0x2c (KDRHFL, KDRHFF, KDRHFH)
lock@7898: 0x02
cols@7899:    2
col    0[2] @7900: 3
col   1[16] @7903: WWW.XIFENFEI.COM  <--确实没有被恢复,而是直接被跳过
BBED> set mode edit
        MODE            Edit
BBED> m /x 01 offset 14
Warning: contents of previous BIFILE will be lost. Proceed? (Y/N) y
 File: /u01/oracle/oradata/XFF/xifenfei01.dbf (0)
 Block: 12               Offsets:   14 to  525           Dba:0x00000000
------------------------------------------------------------------------
 01047de7 00000100 00000dcc 000098d7 …………
 <32 bytes per line>
BBED> d offset 8188
 File: /u01/oracle/oradata/XFF/xifenfei01.dbf (0)
 Block: 12               Offsets: 8188 to 8191           Dba:0x00000000
------------------------------------------------------------------------
 ff060000    <--这么说明:数据块被标志为坏块的时候,同时会修改tailchk值
 <32 bytes per line>
BBED> m /x 01 offset 8188
 File: /u01/oracle/oradata/XFF/xifenfei01.dbf (0)
 Block: 12               Offsets: 8188 to 8191           Dba:0x00000000
------------------------------------------------------------------------
 01060000
 <32 bytes per line>
BBED> sum apply
Check value for File 0, Block 12:
current = 0xe77d, required = 0xe77d
--验证块已经标记为正常块
BBED> verify
DBVERIFY - Verification starting
FILE = /u01/oracle/oradata/XFF/xifenfei01.dbf
BLOCK = 12
DBVERIFY - Verification complete
Total Blocks Examined         : 1
Total Blocks Processed (Data) : 1
Total Blocks Failing   (Data) : 0
Total Blocks Processed (Index): 0
Total Blocks Failing   (Index): 0
Total Blocks Empty            : 0
Total Blocks Marked Corrupt   : 0
Total Blocks Influx           : 0

启动数据库测试

SQL> startup
ORACLE instance started.
Total System Global Area  318767104 bytes
Fixed Size                  1267236 bytes
Variable Size             109054428 bytes
Database Buffers          201326592 bytes
Redo Buffers                7118848 bytes
Database mounted.
Database opened.
SQL> conn chf/xifenfei
Connected.
SQL> col object_name for a30
SQL> select   object_id,object_name from t_xifenfei;
 OBJECT_ID OBJECT_NAME
---------- ------------------------------
        20 WWW.XIFENFEI.COM
        44 WWW.XIFENFEI.COM
        28 WWW.XIFENFEI.COM
        15 WWW.XIFENFEI.COM
        29 WWW.XIFENFEI.COM
         3 WWW.XIFENFEI.COM
        25 WWW.XIFENFEI.COM
        39 WWW.XIFENFEI.COM
        51 WWW.XIFENFEI.COM
        26 WWW.XIFENFEI.COM
10 rows selected.
--通过修改数据块的seq_kcbh和tailchk,让这个块恢复正常,但是记录依然丢失,
--因为应用日志恢复之时标记为坏块跳过该块的日志应用

通过实验证明:
1.如果只有数据块异常,应用日志恢复,不一定会出现ORA-600[3020],而是直接把该块标记为坏块,继续应用日志
2.标记坏块其实就是修改seq_kcbh为ff,同时也修改tailchk值
3.经验值:如果在数据库应用日志恢复的时候,如果出现ORA-600[3020]错误,可以使用allow 2 corruption来跳过坏块处理,其实也是修改seq_kcbh为ff,然后让数据库跳过该块的恢复.

使用plsql抢救数据

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:使用plsql抢救数据

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

“在oracle出现ORA-8103/ORA-1578/ORA-376″等情况下抢救数据的争论没有停止过,很多人想到的是使用bbed,dul等工具来抢救,其实在很多时候我们使用pl/sql也可以完美的抢救数据.在这里我们通过模拟ORA-8103错误,然后使用plsql来找回数据.这中处理方法相对于bbed风险小,但是缺点是如果数据量大处理时间可能比较长,可能比dul有的一比,但是dul的工具不是任何人都有的.所以整体来说,在大部分情况下,这种方法处理某个数据块错误,抢救某个对象数据,还是很好的方法.
1.有非空列index情况

--创建测试表
SQL> create table xifenfei
  2  as
  3  select * from dba_objects;
Table created.
--修改某个项为非空值
SQL> alter table xifenfei modify object_id not null;
Table altered.
--创建一个唯一index
SQL> create unique index ind_xifenfei  on xifenfei(object_id);
Index created.
--表总记录
SQL> select count(*) from xifenfei;
  COUNT(*)
----------
     50088
--extent的分布情况
SQL> set pages 100
SQL>  select file_id,block_id,block_id+blocks-1
  2    from dba_extents
  3   where segment_name ='XIFENFEI' AND owner='CHF';
   FILE_ID   BLOCK_ID BLOCK_ID+BLOCKS-1
---------- ---------- -----------------
         9       1545              1552
         9       1553              1560
         9       1561              1568
         9       1569              1576
         9       1577              1584
         9       1585              1592
         9       1593              1600
         9       1601              1608
         9       1609              1616
         9       1617              1624
         9       1625              1632
         9       1633              1640
         9       1641              1648
         9       1649              1656
         9       1657              1664
         9       1665              1672
         9       1673              1800
         9       1801              1928
         9       1929              2056
         9       2057              2184
         9       2185              2312
21 rows selected.
--2200数据块包含记录
SQL> select   count(*)
  2  from chf.xifenfei where dbms_rowid.rowid_block_number(rowid)=2200;
  COUNT(*)
----------
        69
--关闭数据库
SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.
--破坏数据块
[oracle@xifenfei ~]$ dd if=/dev/zero of=/u01/oracle/oradata/XFF/users03.dbf  bs=8192  count=1 seek=2200 conv=notrunc
1+0 records in
1+0 records out
8192 bytes (8.2 kB) copied, 0.000151554 seconds, 54.1 MB/s
--启动数据库
SQL> startup
ORACLE instance started.
Total System Global Area  318767104 bytes
Fixed Size                  1267236 bytes
Variable Size             104860124 bytes
Database Buffers          205520896 bytes
Redo Buffers                7118848 bytes
Database mounted.
Database opened.
--查询结果
SQL>  select /*+ full(xifenfei) */ count(*) from chf.xifenfei;
 select /*+ full(xifenfei) */ count(*) from chf.xifenfei
                                                *
ERROR at line 1:
ORA-08103: object no longer exists
SQL> create table chf.xifenfei_new
  2  as
  3  select * from chf.xifenfei;
select * from chf.xifenfei
                  *
ERROR at line 3:
ORA-08103: object no longer exists
--创建备份表
SQL> create table chf.xifenfei_new
  2  as
  3  select * from chf.xifenfei where 1=0;
Table created.
--创建坏块相关rowid记录表
SQL> create table chf.bad_rows (row_id rowid, oracle_error_code number);
Table created.
--执行plsql脚本
DECLARE
 TYPE RowIDTab IS TABLE OF ROWID INDEX BY BINARY_INTEGER;
 CURSOR c1 IS  select /*+ index(xifenfei ind_xifenfei) */ rowid
 from chf.xifenfei
 where object_id is NOT NULL;
 r RowIDTab;
 rows  NATURAL := 20000;
 bad_rows number := 0 ;
 errors number;
 error_code number;
 myrowid rowid;
BEGIN
 OPEN c1;
 LOOP
   FETCH  c1 BULK COLLECT INTO r LIMIT rows;
   EXIT WHEN r.count=0;
   BEGIN
    FORALL i IN r.FIRST..r.LAST SAVE EXCEPTIONS
     insert into chf.xifenfei_new
     select /*+ ROWID(A) */ *
     from chf.xifenfei A where rowid = r(i);
   EXCEPTION
   when OTHERS then
    BEGIN
     errors := SQL%BULK_EXCEPTIONS.COUNT;
     FOR err1 IN 1..errors LOOP
       error_code := SQL%BULK_EXCEPTIONS(err1).ERROR_CODE;
       if error_code in (1410, 8103) then
         myrowid := r(SQL%BULK_EXCEPTIONS(err1).ERROR_INDEX);
         bad_rows := bad_rows + 1;
         insert into chf.bad_rows values(myrowid, error_code);
       else
         raise;
       end if;
     END LOOP;
     END;
   END;
  commit;
 END LOOP;
 commit;
 CLOSE c1;
 dbms_output.put_line('Total Bad Rows: '||bad_rows);
END;
/
--查询错误记录
SQL> select count(*) from chf.bad_rows ;
  COUNT(*)
----------
        69
SQL> select * from chf.bad_rows where rownum<10;
ROW_ID             ORACLE_ERROR_CODE
------------------ -----------------
AAAMugAAJAAAAiYAAA              8103
AAAMugAAJAAAAiYAAB              8103
AAAMugAAJAAAAiYAAC              8103
AAAMugAAJAAAAiYAAD              8103
AAAMugAAJAAAAiYAAE              8103
AAAMugAAJAAAAiYAAF              8103
AAAMugAAJAAAAiYAAG              8103
AAAMugAAJAAAAiYAAH              8103
AAAMugAAJAAAAiYAAI              8103
9 rows selected.
--查询备份表记录
SQL> select count(*) from chf.xifenfei_new;
  COUNT(*)
----------
     50019
50088-50019=69和被破坏块中记录一致,证明所有好块中记录全部被找回来

2.无非空列index情况

--创建表
SQL> CONN CHF/XIFENFEI
Connected.
SQL> create table t_xifenfei
  2  as
  3  select * from dba_objects;
Table created.
--表中记录总数
SQL> select count(*) from t_xifenfei;
  COUNT(*)
----------
     50086
--extent分布
SQL> SET PAGES 100
SQL>  select file_id,block_id,block_id+blocks-1
  2    from dba_extents
  3   where segment_name ='T_XIFENFEI' AND owner='CHF';
   FILE_ID   BLOCK_ID BLOCK_ID+BLOCKS-1
---------- ---------- -----------------
         9          9                16
         9         17                24
         9         25                32
         9         33                40
         9         41                48
         9         49                56
         9         57                64
         9         65                72
         9         73                80
         9         81                88
         9         89                96
         9         97               104
         9        105               112
         9        113               120
         9        121               128
         9        129               136
         9        137               264
         9        265               392
         9        393               520
         9        521               648
         9        649               776
21 rows selected.
--700数据块中记录数
SQL> select   count(*)
  2  from chf.t_xifenfei where dbms_rowid.rowid_block_number(rowid)=700;
  COUNT(*)
----------
        73
--关闭数据库
SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.
--破坏block 700的数据块
[oracle@xifenfei ~]$ dd if=/dev/zero of=/u01/oracle/oradata/XFF/users03.dbf  bs=8192  count=1 seek=700 conv=notrunc
1+0 records in
1+0 records out
8192 bytes (8.2 kB) copied, 0.000156576 seconds, 52.3 MB/s
--启动数据库
SQL> startup
ORACLE instance started.
Total System Global Area  318767104 bytes
Fixed Size                  1267236 bytes
Variable Size             104860124 bytes
Database Buffers          205520896 bytes
Redo Buffers                7118848 bytes
Database mounted.
Database opened.
--查询报错
SQL> select count(*) from chf.t_xifenfei;
select count(*) from chf.t_xifenfei
                         *
ERROR at line 1:
ORA-08103: object no longer exists
--创建备份表
SQL> CREATE TABLE T_XIFENFEI_NEW
  2  AS
  3  SELECT * FROM T_XIFENFEI WHERE 1=0;
--找回记录
set serveroutput on
set concat off
DECLARE
 nrows number;
 rid rowid;
 dobj number;
 ROWSPERBLOCK number;
BEGIN
 ROWSPERBLOCK:=1000;  --估算最大的一个块中记录条数
 nrows:=0;
 select data_object_id  into dobj
 from dba_objects
 where owner = 'CHF'
 and object_name = 'T_XIFENFEI'
-- and subobject_name = '<table partition>'  Add this condition if table is partitioned
 ;
 for i in (select relative_fno, block_id, block_id+blocks-1 totblocks
           from dba_extents
           where owner = 'CHF'
             and segment_name = 'T_XIFENFEI'
-- and partition_name = '<table partition>' Add this condition if table is partitioned
-- and file_id != <OFFLINED DATAFILE> This condition is only used if a datafile needs to be skipped due to ORA-376 (A)
          order by extent_id)
 loop
   for br in i.block_id..i.totblocks loop
    for j in 1..ROWSPERBLOCK loop
    begin
      rid := dbms_rowid.ROWID_CREATE(1,dobj,i.relative_fno, br , j-1);
      insert into CHF.T_XIFENFEI_NEW
      select /*+ ROWID(A) */ *
      from CHF.T_XIFENFEI A
      where rowid = rid;
      if sql%rowcount = 1 then nrows:=nrows+1; end if;
      if (mod(nrows,10000)=0) then commit; end if;
    exception when others then null;
    end;
    end loop;
  end loop;
 end loop;
 COMMIT;
 dbms_output.put_line('Total rows: '||to_char(nrows));
END;
/
--找回记录数
SQL> SELECT COUNT(*) FROM CHF.T_XIFENFEI_NEW;
  COUNT(*)
----------
     50013
50086-50013=73  证明非坏块中的数据都被完全寻找回来

参考:
ORA-8103 Troubleshooting, Diagnostic and Solution [ID 268302.1]
Extract rows from a CORRUPT table creating ROWID from DBA_EXTENTS [ID 422547.1]

TXChecker初试

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:TXChecker初试

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

当我们对数据库进行异常恢复,很多时候要选择屏蔽回滚段,但是我们有没有办法来评估屏蔽回滚段到底会对我们的数据库的数据产生多大影响呢?其实我们可以通过TXChecker工具来评估数据库undo异常时候受到影响的数据对象有哪些,从而进一步确定是否真的需要对其undo下手处理.
模拟数据异常事务为提交情况

[oracle@xifenfei ~]$ sqlplus / as sysdba
SQL*Plus: Release 10.2.0.4.0 - Production on Thu Aug 30 20:32:27 2012
Copyright (c) 1982, 2007, Oracle.  All Rights Reserved.
Connected to an idle instance.
SQL> startup;
ORACLE instance started.
Total System Global Area  318767104 bytes
Fixed Size                  1267236 bytes
Variable Size             100665820 bytes
Database Buffers          209715200 bytes
Redo Buffers                7118848 bytes
Database mounted.
Database opened.
SQL> create table t_xifenfei tablespace system
  2  as
  3  select * from dba_objects;
Table created.
SQL> create table chf.t_xifenfei tablespace users
  2  as
  3  select * from dba_objects;
Table created.
SQL> delete from t_xifenfei where rownum<10;
9 rows deleted.
SQL> delete from chf.t_xifenfei where rownum<100;
99 rows deleted.
SQL> alter system checkpoint;
System altered.
SQL> shutdow abort;
ORACLE instance shut down.

在system和users表空间分别模拟了sys和chf的t_xifenfei表含事务未提交情况

TXChecker用法说明

[oracle@xifenfei txchecker]$ ./TXChecker
TXChecker - v1.4 by Center Of Expertise (COE), Oracle Corporation (build 10/16/07)
Usage is: TXChecker [options]
Options:
  -a                     When scanning datafiles (with -d/-f/-l/-t options) report objects using any of the undo
                         segments (not just those with errors) (OPTIONAL)
  -b                     For objects found, print the datablock addresses. See readme for further details (OPTIONAL)
  -c<controlfile_name>   Fully qualified controlfile name to read (MANDATORY)
  -d                     Scan database for active TXs (use when undo not available) (OPTIONAL)
  -f<filename>           Scan the named datafile for active TXs (OPTIONAL)
  -g                     Indicates you want to find all blocks taking part in transactions with
                         a USN > than the USN supplied in -x<XID> parameter (same constraints as -w) (OPTIONAL)
  -l<listfile>           Scan all the datafiles listed in the listfile for active TXs (OPTIONAL)
  -m<minutes>            Number of minutes used to consider a TX as active (1-120) (DEFAULTS TO 60 MINUTES)
  -p                     Show the names and last known status of the UNDO segments (OPTIONAL)
  -s                     Skip read-only or offline normal datafiles (OPTIONAL)
  -t<tablespace>         Scan all the datafiles for this tablespace (OPTIONAL)
  -u                     Report ITL entries active if marked with an upper bound ('U' flag) fast commit SCN
                         instead if active transactions (OPTIONAL)
  -w<wrap#>              Wrap# for XID in ITL entry to report blocks where wrap# > this one (OPTIONAL)
                         Must use -x with this option. See the readme for details
  -x<XID>                XID for transaction wanting to search for (OPTIONAL)
                         Use format rrrr.ssss.wwwwwwww using Hexadecimal numbers
                         See the readme for full instructions on using -x, -w and -g options
NOTE: Options -d/-f/-l/-t are exclusive, and only one should be specified.
[/sql]
<strong>TXChecker初始</strong>

[oracle@xifenfei txchecker]$ ./TXChecker -c/u01/oracle/oradata/XFF/control01.ctl -d -a
TXChecker - v1.4 by Center Of Expertise (COE), Oracle Corporation (build 10/16/07)
Database Name: XFF   Version: 10.2.0
*** Database last checkpointed at 08/30/2012 20:34:43 (SCN: 0xa.0x1e142)
*** Using 60 minutes to find most active transactions (-m60)
*** Undo Segments:
USN:   0  Name: SYSTEM          TBS#:  0 File:   1  Block:      9  Instance:  0  SMU: N  Status:  3 - Online
                                SCN: 0000.00000000   XactSQN: 0x00000000   UndoSQN: 0x00000000
USN:   1  Name: _SYSSMU1$       TBS#:  1 File:   2  Block:      9  Instance:  0  SMU: Y  Status:  3 - Online
                                SCN: 000a.000191c3   XactSQN: 0x000000d2   UndoSQN: 0x0000013c
…………省略
USN:   9  Name: _SYSSMU9$       TBS#:  1 File:   2  Block:    137  Instance:  0  SMU: Y  Status:  3 - Online
                                SCN: 000a.000191d8   XactSQN: 0x0000011a   UndoSQN: 0x000000dc
USN:  10  Name: _SYSSMU10$      TBS#:  1 File:   2  Block:    153  Instance:  0  SMU: Y  Status:  3 - Online
                                SCN: 000a.000191d7   XactSQN: 0x000000c1   UndoSQN: 0x000000d9
…………省略
USN:  20  Name: _SYSSMU20$      TBS#:  5 File:   5  Block:    153  Instance:  0  SMU: Y  Status:  1 - Invalid / Dropped
                                SCN: 0000.00070ec6   XactSQN: 0x00000002   UndoSQN: 0x00000001
*** Active Transactions:
USN: 6  Name: _SYSSMU6$  File: 2  Block: 89  Instance: 0  Status: 3 - Online
* Active TX at slot 6  #undo blocks: 3  Last bk: 2.90
Obj#:  51938  Name: CHF.T_XIFENFEI                                    Type: TABLE               Undo recs: 99
              Used undo segment IDs: 6
Obj#:  51937  Name: SYS.T_XIFENFEI                                    Type: TABLE               Undo recs: 9
              Used undo segment IDs: 6
         File (validate_objects.sql) being created for object validattion already exists
         and will be overwritten!!!
         Do you want to continue overwriting [Y/N]?y
*** Undo segments (headers) that encountered errors preventing Active TX scan:
USN:  11  Name: _SYSSMU11$      File:   5  Block:       9  Instance:  0  Error: 27 - Undo segment was dropped
…………省略
USN:  20  Name: _SYSSMU20$      File:   5  Block:     153  Instance:  0  Error: 27 - Undo segment was dropped
WARNING: Analyzing the full database for active transactions will take some time!
         Are you sure you want to analyze the full database [Y/N]?
         Are you sure you want to analyze the full database [Y/N]?y
*** Scanning database for datablocks that may require undo (PLEASE WAIT...):
*** Asterisk ('*') denotes blocks being updated since 08/08/2012 03:30:32 (SCN: 0x0.0x79607)
Scanning datafile:  3 - /u01/oracle/oradata/XFF/sysaux01.dbf (SYSAUX)                           - Active TX blocks: 147 *
Scanning datafile:  1 - /u01/oracle/oradata/XFF/system01.dbf (SYSTEM)                           - Active TX blocks: 11597 *
--undo表空间跳过
Undo datafile (/u01/oracle/oradata/XFF/undotbs01.dbf) - SKIPPING
Scanning datafile:  4 - /u01/oracle/oradata/XFF/users01.dbf (USERS)                             - Active TX blocks: 2 *
--因为数据文件丢失控制文件中offline的跳过(其实只要数据文件丢失就会跳过)
Cannot access datafile (/u01/oracle/oradata/XFF/xifenfei01.dbf) (error 2 - No such file or directory) - SKIPPING
Cannot access datafile (/u01/oracle/oradata/XFF/xifenfei02.dbf) (error 2 - No such file or directory) - SKIPPING
Scanning datafile:  7 - /u01/oracle/oradata/XFF/xifenfei03.dbf (XIFENFEI2)                      - Active TX blocks: 0
*** Objects that may require undo data:
*** Asterisk ('*') denotes blocks being updated since 08/08/2012 03:30:32 (SCN: 0x0.0x79607)
DataObj#:  51938  Name: CHF.T_XIFENFEI                                    Type: TABLE               Datablocks: 2 *
Used undo segment IDs: 6
DataObj#:  46434  Name: MDSYS.SYS_IL0000046432C00006$$                    Type: INDEX               Datablocks: 4
Used undo segment IDs: 2
DataObj#:    124  Name: SYS.I_ACCESS1                                     Type: INDEX               Datablocks: 27
Used undo segment IDs: 1 2 3 4 5 6 7 8 9 10
…………省略
DataObj#:   8824  Name: SYS.SYS_IL0000008822C00008$$                      Type: INDEX               Datablocks: 1 *
Used undo segment IDs: 4
DataObj#:  51937  Name: SYS.T_XIFENFEI                                    Type: TABLE               Datablocks: 1 *
Used undo segment IDs: 6
DataObj#:  51557  Name: SYS.UTL_RECOMP_COMPILED                           Type: TABLE               Datablocks: 1
Used undo segment IDs: 8
…………省略
DataObj#:  42131  Name: XDB.XDB$ELEMENT_PROPNUMBER                        Type: INDEX               Datablocks: 1
Used undo segment IDs: 2
*** Use validate_objects.sql script file to validate the structure of possibly corrupt objects
    if the undo required is not available.
Undo Segment Usage Summary
**************************
*** Undo segments identified in use by active transaction datablocks AFTER 08/08/2012 03:30:32 (SCN: 0x0.0x79607):
USN:   1  Name: _SYSSMU1$
USN:   2  Name: _SYSSMU2$
USN:   3  Name: _SYSSMU3$
USN:   4  Name: _SYSSMU4$
USN:   5  Name: _SYSSMU5$
USN:   6  Name: _SYSSMU6$
USN:   9  Name: _SYSSMU9$
*** Undo segments identified in use by active transacation datablocks BEFORE 08/08/2012 03:30:32 (SCN: 0x0.0x79607):
USN:   1  Name: _SYSSMU1$
USN:   2  Name: _SYSSMU2$
USN:   3  Name: _SYSSMU3$
USN:   4  Name: _SYSSMU4$
USN:   5  Name: _SYSSMU5$
USN:   7  Name: _SYSSMU7$
USN:   8  Name: _SYSSMU8$
USN:   9  Name: _SYSSMU9$
USN:  10  Name: _SYSSMU10$
NOTE!NOTE!NOTE!NOTE!NOTE!NOTE!NOTE!NOTE!NOTE!NOTE!NOTE!NOTE!NOTE!NOTE!NOTE!NOTE!NOTE!
*************************************************************************************
Upload the logfile (TXChecker_083012_2042_XFF.log) to Oracle Support Services for analysis.
Do NOT attempt to force the database open until the logfile has been analyzed.

验证对象脚本

[oracle@xifenfei txchecker]$ more validate_objects.sql
rem validate_objects.sql - checks strcuture of objects needing unavailable undo data
rem                        Created by findtxns program, Oracle Corporation
set echo on
ANALYZE TABLE CHF.T_XIFENFEI VALIDATE STRUCTURE CASCADE;
ANALYZE INDEX MDSYS.SYS_IL0000046432C00006$$ VALIDATE STRUCTURE;
ANALYZE INDEX SYS.I_ACCESS1 VALIDATE STRUCTURE;
…………省略
ANALYZE INDEX XDB.SYS_C003167 VALIDATE STRUCTURE;
ANALYZE INDEX XDB.XDB$ELEMENT_PROPNAME VALIDATE STRUCTURE;
ANALYZE INDEX XDB.XDB$ELEMENT_PROPNUMBER VALIDATE STRUCTURE;