Database SOS – 第118页 – Professional Database Recovery Support@+8617813235971

ORA-30013导致RAC 节点down掉

今天一朋友让我帮忙分析他们的9.2.0.2 rac 节点2异常down掉原因,相关信息如下:
前提信息

OS:HP-UX B.11.31
DB:9.2.0.2.0 RAC

节点2alert日志信息

Fri Sep  7 13:13:49 2012
ARC0: Completed archiving  log 11 thread 2 sequence 11651
Fri Sep  7 13:31:56 2012
Errors in file /oracle/admin/agent/udump/agent2_ora_797.trc:
ORA-00600: internal error code, arguments: [kgavsd_3], [0], [], [], [], [], [], []
ORA-00028: your session has been killed
Fri Sep  7 13:31:58 2012
Errors in file /oracle/admin/agent/bdump/agent2_pmon_5938.trc:
ORA-30013: undo tablespace 'UNDOTBS1' is currently in use
Fri Sep  7 13:31:58 2012
PMON: terminating instance due to error 30013
Fri Sep  7 13:31:58 2012
Errors in file /oracle/admin/agent/bdump/agent2_lms7_6033.trc:
ORA-30013: undo tablespace '' is currently in use
Fri Sep  7 13:31:58 2012
…………
Errors in file /oracle/admin/agent/bdump/agent2_lms0_6027.trc:
ORA-30013: undo tablespace '' is currently in use
Fri Sep  7 13:31:58 2012
System state dump is made for local instance
Fri Sep  7 13:32:03 2012
Instance terminated by PMON, pid = 5938
Fri Sep  7 14:34:35 2012

这里可以看到因为ORA-30013的错误使得pmon进程异常,从而使得该rac的节点2 down掉.同时这里还发现了ORA-00600[kgavsd_3]错误,是否是因为该ORA-600导致了数据库异常down还是一个偶然机会,我们继续分析

查看ORA-600[kgavsd_3]相关trace文件

*********START PLSQL RUNTIME DUMP************
***Got ORA-28 while running PLSQL***
***********END PLSQL RUNTIME DUMP************
*** 2012-09-07 13:31:56.740
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [kgavsd_3], [0], [], [], [], [], [], []
ORA-00028: your session has been killed
Current SQL statement for this session:
--用户补档
DECLARE
  OUT_ERR_CODE NUMBER;
  OUT_ERR_MSG  VARCHAR2(1000);
  V_COUNT NUMBER;
BEGIN
    WHILE TRUE LOOP
      SELECT COUNT(*) INTO V_COUNT FROM amc_stat_log  where proc_name in('pRunOdsChannelWareData')
       and run_param=201208 AND STATE='A';
      IF V_COUNT>0 THEN
         dbms_output.put_line('exit loop '|| sysdate);
         EXIT;
      END IF;
      sys.Dbms_Lock.sleep(600);
      dbms_output.put_line('wake up '|| sysdate);
    END LOOP;
   PKG_AME_ODS_DATA.P_Add_TO_AgentServ(201208,OUT_ERR_CODE,OUT_ERR_MSG);
   PKG_AME_ODS_DATA.P_Update_Serv_Ware_ID(201208, OUT_ERR_CODE, OUT_ERR_MSG);
   PKG_AMS_SETTLE.P_COMMISION_51PRE_FLAG(201208,OUT_ERR_CODE, OUT_ERR_MSG);
END;
----- PL/SQL Call Stack -----
  object      line  object
  handle    number  name
c000000e84ec59d0        14  anonymous block
----- Call Stack Trace -----
calling              call     entry                argument values in hex
location             type     point                (? means dubious value)
-------------------- -------- -------------------- ----------------------------
Cannot find symbol in .
Cannot find symbol in .
Cannot find symbol in .
ksedmp()+512         call     9fffffffffff3940     000000000 ?
                                                   C000000000000A17 ?
                                                   40000000025B6540 ?
ksfdmp()+64          call     9fffffffffff3940     000000003 ?
kgerinv()+352        call     9fffffffffff3940     60000000000466B0 ?
                                                   000000003 ?
                                                   C000000000000714 ?
                                                   4000000004EBA6A0 ?
                                                   00001821B ?
                                                   6000000000468EA8 ?
kgesinv()+48         call     9fffffffffff3940     60000000000466B0 ?
                                                   600000000059BD98 ?
                                                   600000000046B070 ?
                                                   60000000000179C0 ?
                                                   6000000000017950 ?
kgesin()+112         call     9fffffffffff3940     60000000000466B0 ?
                                                   600000000059BD98 ?
                                                   4000000000B44C10 ?
                                                   000000001 ?
                                                   9FFFFFFFFFFF4310 ?
$cold_kgavsd_stackl  call     9fffffffffff3940     60000000000466B0 ?
et_done()+1184                                     600000000059BD98 ?
                                                   4000000000B44C10 ?
                                                   000000001 ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   00001FE19 ?
pbesd_stacklet_done  call     9fffffffffff3940     60000000000466B0 ?
()+96                                              000000002 ? 000000000 ?
                                                   9FFFFFFFBEC6AE68 ?
pfrrun()+3328        call     9fffffffffff3940     9FFFFFFFBEC6AE68 ?
                                                   C000000000001D42 ?
                                                   4000000001ABBDA0 ?
                                                   9FFFFFFFBEC6B06C ?
                                                   9FFFFFFFFFFF64F0 ?
                                                   6000000000468EA8 ?
                                                   000000000 ? 000000000 ?
peicnt()+480         call     0000000000000000     9FFFFFFFBEC6AEE8 ?
                                                   C000000000000996 ?
                                                   40000000019F3680 ?
kkxexe()+816         call     0000000000000000     9FFFFFFFFFFF64F0 ?
                                                   9FFFFFFFBEC6AE68 ?
                                                   C00000000000099B ?
                                                   4000000001DD90F0 ?
                                                   00000FE4F ?
                                                   9FFFFFFFFFFF5F00 ?
                                                   60000000000467F0 ?
                                                   4000000000B603F0 ?
opiexe()+11168       call     0000000000000000     000000000 ?
                                                   C000000000002C60 ?
                                                   4000000001BEF980 ?
                                                   00000800F ?
                                                   9FFFFFFFFFFF6470 ?
                                                   9FFFFFFFBEC6AEB2 ?
                                                   6000000000040900 ?
                                                   9FFFFFFFBEC6B534 ?
opiall0()+3184       call     _etext_f()+23058430  000000004 ? 0000000C4 ?
                              09110686928          9FFFFFFFFFFF7B40 ?
                                                   C000000000002BDF ?
                                                   4000000001B26CD0 ?
                                                   000000000 ? 00000C893 ?
                                                   9FFFFFFFFFFF6690 ?
Cannot find symbol in .
kpoal8()+2064        call     9fffffffffff7ad0     000000001 ?
                                                   9FFFFFFFFFFF8304 ?
                                                   FFFFFFFFBFFFFFFF ?
                                                   9FFFFFFFFFFF83E4 ?
                                                   FFFFFFFFFFE7FBDF ?
                                                   9FFFFFFFFFFF7B88 ?
                                                   000000000 ?
                                                   6000000000474528 ?
opiodr()+3584        call     9fffffffffff81fc     6000000000040950 ?
                                                   000000000 ? 000000000 ?
                                                   C000000000002C60 ?
                                                   4000000001C09FE0 ?
                                                   00000C50B ?
                                                   9FFFFFFFFFFF81F0 ?
                                                   9FFFFFFFFFFF81D8 ?
ttcpip()+3776        call     _etext_f()+23058430  00000005E ? 000000014 ?
                              09114957288          9FFFFFFFFFFFA5F0 ?
                                                   6000000000040918 ?
                                                   C000000000001ABD ?
                                                   4000000001AB3BA0 ?
                                                   000000000 ? 00000C59B ?
opitsk()+1872        call     9fffffffffffa200     6000000000049FC0 ?
                                                   000000001 ?
                                                   9FFFFFFFFFFFA5F0 ?
                                                   000000001 ?
                                                   9FFFFFFFFFFFA740 ?
                                                   9FFFFFFFFFFFA564 ?
                                                   9FFFFFFFBF780058 ?
                                                   000000000 ?
opiino()+3184        call     000000000000057b     000000000 ? 000000000 ?
                                                   C00000000000132B ?
                                                   4000000001F78730 ?
                                                   000008001 ?
opiodr()+3584        call     0000000000000000     6000000000548A38 ?
                                                   4000000000B606F0 ?
                                                   6000000000548A38 ?
                                                   C000000000002C60 ?
                                                   4000000001C09FE0 ?
                                                   00000A201 ?
                                                   9FFFFFFFFFFFBC90 ?
                                                   4000000000B606F0 ?
opidrv()+976         call     _etext_f()+23058430  00000003C ? 000000004 ?
                              09114957288          9FFFFFFFFFFFEFB0 ?
                                                   6000000000040918 ?
sou2o()+80           call     _etext_f()+23058430  000000004 ? 000000004 ?
                              09114957288          9FFFFFFFFFFFEFB0 ?
main()+352           call     _etext_f()+23058430  9FFFFFFFFFFFEFD0 ?
                              09114957288          9FFFFFFFFFFFEFD4 ?
                                                   60000000004744F0 ?
                                                   9FFFFFFFFFFFEFB0 ?
main_opd_entry()+80  call     _etext_f()+23058430  000000000 ?
                              09114957288          9FFFFFFFFFFFF498 ?
                                                   C000000000000004 ?
                                                   C00000000002BE30 ?
--------------------- Binary Stack Dump ---------------------
Process global information:
     process: c000000d6428c0c0, call: c000000e46e772a8, xact: 0000000000000000,
     curses: c000000d6437d020, usrses: c000000d6437d020
  ----------------------------------------
  SO: c000000d6428c0c0, type: 2, owner: 0000000000000000, flag: INIT/-/-/0x00
  (process) Oracle pid=282, calls cur/top: c000000e46e772a8/c000000e46e772a8, flag: (0) -
            int error: 28, call error: 0, sess error: 0, txn error 0
  (post info) last post received: 0 0 0
              last post received-location: No post
              last process to post me: c000000d64234f18 1 6
              last post sent: 0 0 104
              last post sent-location: kglpsl: in loop
              last process posted by me: c000000d6428e900 23 0
    (latch info) wait_event=0 bits=0
    Process Group: DEFAULT, pseudo proc: c000000d62234ee0
    O/S info: user: oracle, term: UNKNOWN, ospid: 797
    OSD pid info: Unix process pid: 797, image: oracle@gzagent2 (TNS V1-V3)
    ----------------------------------------
    SO: c000000d6437d020, type: 4, owner: c000000d6428c0c0, flag: INIT/-/-/0x00
    (session) trans: 0000000000000000, creator: c000000d6428c0c0, flag: (41) USR/- BSY/-/-/-/KIL/-
              DID: 0000-0000-00000000, short-term DID: 0000-0000-00000000
              txn branch: 0000000000000000
              oct: 0, prv: 0, sql: c000000e76b03f10, psql: 0000000000000000, user: 31/CUSTOM
    O/S info: user: huangqianhai_lc, term: SVCTAG-D1MLV2X, ospid: 11124:11796, machine: WORKGROUP\SVCTAG-D1MLV2X
              program: plsqldev.exe
    application name: PL/SQL Developer, hash value=1190136663
    action name: 测试窗口 - 新建, hash value=3604520210
    last wait for 'null event' blocking sess=0x0 seq=142 wait_time=567341620
                =ea60, =0, =0
    temporary object counter: 0
      ----------------------------------------

通过这里可以看出来是因为pl/sql dev进行一个plsql的操作导致该错误发生,查询MOS[ID 403575.1]发现

Applies to:
Oracle Server - Enterprise Edition - Version: 9.2.0.7 and later   [Release: 9.2 and later ]
Information in this document applies to any platform.
***Checked for relevance on 27-Oct-2010***
Symptoms
The following errors appears in the alert log file :
Probe:read_pipe: receive failed, status 3
Probe:S:debug_loop: timeout. Action 1
*********START PLSQL RUNTIME DUMP************
***Got ORA-604 while running PLSQL***
***********END PLSQL RUNTIME DUMP************
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [kgavsd_3], [0], [], [], [], [], [], []
ORA-00604: error occurred at recursive SQL level 2
Current SQL statement for this session:
begin :id := sys.dbms_transaction.local_transaction_id; end;
.
Cause
The ora-600 kgavsd_3 appears when calling kgavsd_stacklet_done function which is related to PLSQL DEBUG.
From traces, dbms_debug package is being used during trace generation.
The return code of 3 further indicates that the dbms_pipe message was interrupted.
Probably user cancelled a plsql program, so the appeared while trying to dump the stack
Solution
There is no data corruption over here.
The error appears to be due to abnormal termination of aPL/SQL Developer application while executing a PL/SQL block.
Changing the PL/SQL and/or the procedure code could help in avoiding this error message.
Hence, this error can be safely ignored.

查找trace文件确实发现有name=SYS.DBMS_DEBUG,进一步表明该错误是由于plsql dev工具使用debug模式运行上面的plsql而引起该错误的发生,但是因为mos中记录和错误不是完全的一致,所以不能十分确定是该错误导致数据库down掉

继续分析ORA-30013

Error:      ORA-30013  (ORA-30013)
Text:      undo tablespace '%s' is currently in use
---------------------------------------------------------------------------
Cause:    the specified undo tablespace is currently used by another instance.
Action:    Wait for the undo tablespace to become available or change to another name and reissue the statement.

这个说明是没有疑问的:因为2节点配置的当前undo是UNDOTBS2,而UNDOTBS1是1节点使用的,证明这里的undo确实发生了错误,继续查询mos发现Bug 3368552

Hdr: 3368552 9.2.0.3 RDBMS 9.2.0.3 RAC PRODID-5 PORTID-23
Abstract: RAC:  ORA-30013 WHEN INSTANCE 2 ATTEMPTS TO ACCESS UNDO TABLESPACE OF INSTANCE 1
*** 01/12/04 06:21 am ***
TAR:
----
3554549.995
PROBLEM:
--------
The RAC database has been stable, but experienced an instance termination due
to ORA-30031 error in the alert log (instance 2):
...
Tue Dec 23 03:01:46 2003
ARC1: Evaluating archive  log 4 thread 2 sequence 1116
ARC1: Beginning to archive log 4 thread 2 sequence 1116
Creating archive destination LOG_ARCHIVE_DEST_1:
'/oracle/oradata/VLDB/logs/archives/VLDBN2/VLDB_0000001116_0002.arc'
ARC1: Completed archiving  log 4 thread 2 sequence 1116
Tue Dec 23 08:14:09 2003
Errors in file /oracle/admin/VLDB/bdump/vldbn2_pmon_22860.trc:
ORA-30013: undo tablespace 'UNDOTBS1' is currently in use
Tue Dec 23 08:14:09 2003
PMON: terminating instance due to error 30013
Tue Dec 23 08:14:10 2003
System state dump is made for local instance
Tue Dec 23 08:14:12 2003
Trace dumping is performing id=[cdmp_20031223081410]
Tue Dec 23 08:14:14 2003
Instance terminated by PMON, pid = 22860
<eof>
Instance 1 alert log shows only the reconfiguration and the cdump info:
..
Tue Dec 23 03:54:13 2003
Errors in file /oracle/admin/VLDB/udump/vldbn1_ora_13564.trc:
ORA-27037: unable to obtain file status
SVR4 Error: 2: No such file or directory
Additional information: 4
Tue Dec 23 08:14:10 2003
Trace dumping is performing id=[cdmp_20031223081410]
Tue Dec 23 08:14:12 2003
Reconfiguration started
List of nodes: 0,
Global Resource Directory frozen
one node partition
Communication channels reestablished
...

因为在9.2.0.3的RAC中有着该bug,那么我们可以大胆猜测在9.2.0.2中应该存在该bug,那么结合上面的ORA-00600[kgavsd_3]错误,我们大概还原该事故的全部:
1.节点1 dml操作了程序中报错的plsql中要范围的部分表对象,但是未提交(或者正在执行)
2.节点2 有用户使用pl/sql dev去执行程序中的plsql,因为是debug模式执行,需要UNDOTBS1的块来构建cr,从而使得节点2去访问UNDOTBS1,引发了Bug 3368552 从而使得数据库直接kill掉该plsql dev会话,进而出现ORA-00600[kgavsd_3]错误和pmon进程异常使得节点2 down掉

密码保护：dul实现expdp dump文件转换sqlldr格式

TXChecker初试

当我们对数据库进行异常恢复,很多时候要选择屏蔽回滚段,但是我们有没有办法来评估屏蔽回滚段到底会对我们的数据库的数据产生多大影响呢？其实我们可以通过TXChecker工具来评估数据库undo异常时候受到影响的数据对象有哪些,从而进一步确定是否真的需要对其undo下手处理.
模拟数据异常事务为提交情况

[oracle@xifenfei ~]$ sqlplus / as sysdba
SQL*Plus: Release 10.2.0.4.0 - Production on Thu Aug 30 20:32:27 2012
Copyright (c) 1982, 2007, Oracle.  All Rights Reserved.
Connected to an idle instance.
SQL> startup;
ORACLE instance started.
Total System Global Area  318767104 bytes
Fixed Size                  1267236 bytes
Variable Size             100665820 bytes
Database Buffers          209715200 bytes
Redo Buffers                7118848 bytes
Database mounted.
Database opened.
SQL> create table t_xifenfei tablespace system
  2  as
  3  select * from dba_objects;
Table created.
SQL> create table chf.t_xifenfei tablespace users
  2  as
  3  select * from dba_objects;
Table created.
SQL> delete from t_xifenfei where rownum<10;
9 rows deleted.
SQL> delete from chf.t_xifenfei where rownum<100;
99 rows deleted.
SQL> alter system checkpoint;
System altered.
SQL> shutdow abort;
ORACLE instance shut down.

在system和users表空间分别模拟了sys和chf的t_xifenfei表含事务未提交情况

TXChecker用法说明

[oracle@xifenfei txchecker]$ ./TXChecker
TXChecker - v1.4 by Center Of Expertise (COE), Oracle Corporation (build 10/16/07)
Usage is: TXChecker [options]
Options:
  -a                     When scanning datafiles (with -d/-f/-l/-t options) report objects using any of the undo
                         segments (not just those with errors) (OPTIONAL)
  -b                     For objects found, print the datablock addresses. See readme for further details (OPTIONAL)
  -c<controlfile_name>   Fully qualified controlfile name to read (MANDATORY)
  -d                     Scan database for active TXs (use when undo not available) (OPTIONAL)
  -f<filename>           Scan the named datafile for active TXs (OPTIONAL)
  -g                     Indicates you want to find all blocks taking part in transactions with
                         a USN > than the USN supplied in -x<XID> parameter (same constraints as -w) (OPTIONAL)
  -l<listfile>           Scan all the datafiles listed in the listfile for active TXs (OPTIONAL)
  -m<minutes>            Number of minutes used to consider a TX as active (1-120) (DEFAULTS TO 60 MINUTES)
  -p                     Show the names and last known status of the UNDO segments (OPTIONAL)
  -s                     Skip read-only or offline normal datafiles (OPTIONAL)
  -t<tablespace>         Scan all the datafiles for this tablespace (OPTIONAL)
  -u                     Report ITL entries active if marked with an upper bound ('U' flag) fast commit SCN
                         instead if active transactions (OPTIONAL)
  -w<wrap#>              Wrap# for XID in ITL entry to report blocks where wrap# > this one (OPTIONAL)
                         Must use -x with this option. See the readme for details
  -x<XID>                XID for transaction wanting to search for (OPTIONAL)
                         Use format rrrr.ssss.wwwwwwww using Hexadecimal numbers
                         See the readme for full instructions on using -x, -w and -g options
NOTE: Options -d/-f/-l/-t are exclusive, and only one should be specified.
[/sql]
<strong>TXChecker初始</strong>

[oracle@xifenfei txchecker]$ ./TXChecker -c/u01/oracle/oradata/XFF/control01.ctl -d -a
TXChecker - v1.4 by Center Of Expertise (COE), Oracle Corporation (build 10/16/07)
Database Name: XFF   Version: 10.2.0
*** Database last checkpointed at 08/30/2012 20:34:43 (SCN: 0xa.0x1e142)
*** Using 60 minutes to find most active transactions (-m60)
*** Undo Segments:
USN:   0  Name: SYSTEM          TBS#:  0 File:   1  Block:      9  Instance:  0  SMU: N  Status:  3 - Online
                                SCN: 0000.00000000   XactSQN: 0x00000000   UndoSQN: 0x00000000
USN:   1  Name: _SYSSMU1$       TBS#:  1 File:   2  Block:      9  Instance:  0  SMU: Y  Status:  3 - Online
                                SCN: 000a.000191c3   XactSQN: 0x000000d2   UndoSQN: 0x0000013c
…………省略
USN:   9  Name: _SYSSMU9$       TBS#:  1 File:   2  Block:    137  Instance:  0  SMU: Y  Status:  3 - Online
                                SCN: 000a.000191d8   XactSQN: 0x0000011a   UndoSQN: 0x000000dc
USN:  10  Name: _SYSSMU10$      TBS#:  1 File:   2  Block:    153  Instance:  0  SMU: Y  Status:  3 - Online
                                SCN: 000a.000191d7   XactSQN: 0x000000c1   UndoSQN: 0x000000d9
…………省略
USN:  20  Name: _SYSSMU20$      TBS#:  5 File:   5  Block:    153  Instance:  0  SMU: Y  Status:  1 - Invalid / Dropped
                                SCN: 0000.00070ec6   XactSQN: 0x00000002   UndoSQN: 0x00000001
*** Active Transactions:
USN: 6  Name: _SYSSMU6$  File: 2  Block: 89  Instance: 0  Status: 3 - Online
* Active TX at slot 6  #undo blocks: 3  Last bk: 2.90
Obj#:  51938  Name: CHF.T_XIFENFEI                                    Type: TABLE               Undo recs: 99
              Used undo segment IDs: 6
Obj#:  51937  Name: SYS.T_XIFENFEI                                    Type: TABLE               Undo recs: 9
              Used undo segment IDs: 6
         File (validate_objects.sql) being created for object validattion already exists
         and will be overwritten!!!
         Do you want to continue overwriting [Y/N]?y
*** Undo segments (headers) that encountered errors preventing Active TX scan:
USN:  11  Name: _SYSSMU11$      File:   5  Block:       9  Instance:  0  Error: 27 - Undo segment was dropped
…………省略
USN:  20  Name: _SYSSMU20$      File:   5  Block:     153  Instance:  0  Error: 27 - Undo segment was dropped
WARNING: Analyzing the full database for active transactions will take some time!
         Are you sure you want to analyze the full database [Y/N]?
         Are you sure you want to analyze the full database [Y/N]?y
*** Scanning database for datablocks that may require undo (PLEASE WAIT...):
*** Asterisk ('*') denotes blocks being updated since 08/08/2012 03:30:32 (SCN: 0x0.0x79607)
Scanning datafile:  3 - /u01/oracle/oradata/XFF/sysaux01.dbf (SYSAUX)                           - Active TX blocks: 147 *
Scanning datafile:  1 - /u01/oracle/oradata/XFF/system01.dbf (SYSTEM)                           - Active TX blocks: 11597 *
--undo表空间跳过
Undo datafile (/u01/oracle/oradata/XFF/undotbs01.dbf) - SKIPPING
Scanning datafile:  4 - /u01/oracle/oradata/XFF/users01.dbf (USERS)                             - Active TX blocks: 2 *
--因为数据文件丢失控制文件中offline的跳过(其实只要数据文件丢失就会跳过)
Cannot access datafile (/u01/oracle/oradata/XFF/xifenfei01.dbf) (error 2 - No such file or directory) - SKIPPING
Cannot access datafile (/u01/oracle/oradata/XFF/xifenfei02.dbf) (error 2 - No such file or directory) - SKIPPING
Scanning datafile:  7 - /u01/oracle/oradata/XFF/xifenfei03.dbf (XIFENFEI2)                      - Active TX blocks: 0
*** Objects that may require undo data:
*** Asterisk ('*') denotes blocks being updated since 08/08/2012 03:30:32 (SCN: 0x0.0x79607)
DataObj#:  51938  Name: CHF.T_XIFENFEI                                    Type: TABLE               Datablocks: 2 *
Used undo segment IDs: 6
DataObj#:  46434  Name: MDSYS.SYS_IL0000046432C00006$$                    Type: INDEX               Datablocks: 4
Used undo segment IDs: 2
DataObj#:    124  Name: SYS.I_ACCESS1                                     Type: INDEX               Datablocks: 27
Used undo segment IDs: 1 2 3 4 5 6 7 8 9 10
…………省略
DataObj#:   8824  Name: SYS.SYS_IL0000008822C00008$$                      Type: INDEX               Datablocks: 1 *
Used undo segment IDs: 4
DataObj#:  51937  Name: SYS.T_XIFENFEI                                    Type: TABLE               Datablocks: 1 *
Used undo segment IDs: 6
DataObj#:  51557  Name: SYS.UTL_RECOMP_COMPILED                           Type: TABLE               Datablocks: 1
Used undo segment IDs: 8
…………省略
DataObj#:  42131  Name: XDB.XDB$ELEMENT_PROPNUMBER                        Type: INDEX               Datablocks: 1
Used undo segment IDs: 2
*** Use validate_objects.sql script file to validate the structure of possibly corrupt objects
    if the undo required is not available.
Undo Segment Usage Summary
**************************
*** Undo segments identified in use by active transaction datablocks AFTER 08/08/2012 03:30:32 (SCN: 0x0.0x79607):
USN:   1  Name: _SYSSMU1$
USN:   2  Name: _SYSSMU2$
USN:   3  Name: _SYSSMU3$
USN:   4  Name: _SYSSMU4$
USN:   5  Name: _SYSSMU5$
USN:   6  Name: _SYSSMU6$
USN:   9  Name: _SYSSMU9$
*** Undo segments identified in use by active transacation datablocks BEFORE 08/08/2012 03:30:32 (SCN: 0x0.0x79607):
USN:   1  Name: _SYSSMU1$
USN:   2  Name: _SYSSMU2$
USN:   3  Name: _SYSSMU3$
USN:   4  Name: _SYSSMU4$
USN:   5  Name: _SYSSMU5$
USN:   7  Name: _SYSSMU7$
USN:   8  Name: _SYSSMU8$
USN:   9  Name: _SYSSMU9$
USN:  10  Name: _SYSSMU10$
NOTE!NOTE!NOTE!NOTE!NOTE!NOTE!NOTE!NOTE!NOTE!NOTE!NOTE!NOTE!NOTE!NOTE!NOTE!NOTE!NOTE!
*************************************************************************************
Upload the logfile (TXChecker_083012_2042_XFF.log) to Oracle Support Services for analysis.
Do NOT attempt to force the database open until the logfile has been analyzed.

验证对象脚本

[oracle@xifenfei txchecker]$ more validate_objects.sql
rem validate_objects.sql - checks strcuture of objects needing unavailable undo data
rem                        Created by findtxns program, Oracle Corporation
set echo on
ANALYZE TABLE CHF.T_XIFENFEI VALIDATE STRUCTURE CASCADE;
ANALYZE INDEX MDSYS.SYS_IL0000046432C00006$$ VALIDATE STRUCTURE;
ANALYZE INDEX SYS.I_ACCESS1 VALIDATE STRUCTURE;
…………省略
ANALYZE INDEX XDB.SYS_C003167 VALIDATE STRUCTURE;
ANALYZE INDEX XDB.XDB$ELEMENT_PROPNAME VALIDATE STRUCTURE;
ANALYZE INDEX XDB.XDB$ELEMENT_PROPNUMBER VALIDATE STRUCTURE;

dbca创建数据库报ORA-00443

今天早上一个朋友和我说他们RAC dbca创建库不成功提示ORA-00443错误,让我帮他们分析下是什么原因导致
提示错误如图

环境状况

OS:LINUX REDHAT x86_64  5.7
kernel:2.6.18-194.el5
memory:100G
CPU:ntel(R) Xeon(R) CPU E7- 8837  @ 2.67GHz   *  64
ORACLE:10.2.0.4

查看alert日志错误

Wed Sep  5 01:32:33 2012
cluster interconnect IPC version:Oracle UDP/IP (generic)
IPC Vendor 1 proto 2
PMON started with pid=2, OS id=17859
DIAG started with pid=7, OS id=17861
PSP0 started with pid=12, OS id=17863
LMON started with pid=17, OS id=17865
LMD0 started with pid=22, OS id=17867
MMAN started with pid=27, OS id=17869
DBW0 started with pid=32, OS id=17871
Wed Sep  5 01:32:33 2012
Errors in file /u01/app/oracle/admin/dtjcdb/bdump/dtjcdb1_ora_17873.trc:
ORA-00600: internal error code, arguments: [ksbmoveme4], [], [], [], [], [], [], []
ORA-27300: OS system dependent operation:run on node failed with status: 2
ORA-27301: OS failure message: No such file or directory
ORA-27302: failure occurred at: skgpmoveme:1
Wed Sep  5 01:32:34 2012
Trace dumping is performing id=[cdmp_20120905013234]
Wed Sep  5 01:32:34 2012
Process DBW1 died, see its trace file
USER: terminating instance due to error 443
Instance terminated by USER, pid = 17857

trace文件中内容

*** 2012-09-05 01:32:33.996
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [ksbmoveme4], [], [], [], [], [], [], []
ORA-27300: OS system dependent operation:run on node failed with status: 2
ORA-27301: OS failure message: No such file or directory
ORA-27302: failure occurred at: skgpmoveme:1
Current SQL information unavailable - no session.
----- Call Stack Trace -----
calling              call     entry                argument values in hex
location             type     point                (? means dubious value)
-------------------- -------- -------------------- ----------------------------
ksedst()+31          call     ksedst1()            000000000 ? 000000001 ?
                                                   7FFFC4ABA4C0 ? 7FFFC4ABA520 ?
                                                   7FFFC4ABA460 ? 000000000 ?
ksedmp()+610         call     ksedst()             000000000 ? 000000001 ?
                                                   7FFFC4ABA4C0 ? 7FFFC4ABA520 ?
                                                   7FFFC4ABA460 ? 000000000 ?
ksfdmp()+21          call     ksedmp()             000000003 ? 000000001 ?
                                                   7FFFC4ABA4C0 ? 7FFFC4ABA520 ?
                                                   7FFFC4ABA460 ? 000000000 ?
kgerinv()+161        call     ksfdmp()             000000003 ? 000000001 ?
                                                   7FFFC4ABA4C0 ? 7FFFC4ABA520 ?
                                                   7FFFC4ABA460 ? 000000000 ?
kgesinv()+33         call     kgerinv()            0068966E0 ? 000000000 ?
                                                   7FFFC4ABA520 ? 7FFFC4ABA460 ?
                                                   000000000 ? 000000000 ?
ksesin()+211         call     kgesinv()            0068966E0 ? 000000000 ?
                                                   7FFFC4ABA520 ? 7FFFC4ABA460 ?
                                                   000000000 ? 000000000 ?
ksbmoveme()+350      call     ksesin()             00533D5C8 ? 000000000 ?
                                                   006896FA3 ? 000000001 ?
                                                   000000001 ? 000000000 ?
ksosp_set_current()  call     ksbmoveme()          000000001 ? 000000000 ?
+117                                               006896FA3 ? 000000001 ?
                                                   000000001 ? 000000000 ?
kso_init()+161       call     ksosp_set_current()  151056D10 ? 000000000 ?
                                                   006896FA3 ? 000000001 ?
                                                   000000001 ? 000000000 ?
opirip()+523         call     kso_init()           151056D10 ? 000000000 ?
                                                   006896FA3 ? 000000001 ?
                                                   000000001 ? 000000000 ?
opidrv()+582         call     opirip()             000000032 ? 000000004 ?
                                                   7FFFC4ABC128 ? 000000001 ?
                                                   000000001 ? 000000000 ?
sou2o()+114          call     opidrv()             000000032 ? 000000004 ?
                                                   7FFFC4ABC128 ? 000000001 ?
                                                   000000001 ? 000000000 ?
opimai_real()+317    call     sou2o()              7FFFC4ABC100 ? 000000032 ?
                                                   000000004 ? 7FFFC4ABC128 ?
                                                   000000001 ? 000000000 ?
main()+116           call     opimai_real()        000000003 ? 7FFFC4ABC190 ?
                                                   000000004 ? 7FFFC4ABC128 ?
                                                   000000001 ? 000000000 ?
__libc_start_main()  call     main()               000000003 ? 7FFFC4ABC190 ?
+244                                               000000004 ? 7FFFC4ABC128 ?
                                                   000000001 ? 000000000 ?
_start()+41          call     __libc_start_main()  000723088 ? 000000001 ?
                                                   7FFFC4ABC2E8 ? 000000000 ?
                                                   000000001 ? 000000003 ?
--------------------- Binary Stack Dump ---------------------

通过查询MOS发现[ID 422908.1]有类此的错误提示,但是该提示说是因为系统重新增加了过多CPU导致数据库crashed掉并且出现 ORA-27300 ORA-27301 ORA-27302 错误.在该案例中,起始就是64c,根据经验在win的10.2.0.4中如果cpu超过32c也是在dbcd创建数据库2%的地方hang住,所以怀疑该错误也是由于cpu太多导致.

处理方法

To solve the problem:
1) apply patch:6471079
- or -
2) apply the 10.2.0.5 (when available)
- or -
3) upgrade to 11g

朋友打上patch:6471079,dbca正常建库

_no_recovery_through_resetlogs参数功能探讨

_no_recovery_through_resetlogs参数默认值和描述

SQL> select a.ksppinm name,b.ksppstvl value,a.ksppdesc description
  2    from x$ksppi a,x$ksppcv b
  3   where a.inst_id = USERENV ('Instance')
  4     and b.inst_id = USERENV ('Instance')
  5     and a.indx = b.indx
  6     and upper(a.ksppinm) LIKE upper('%&param%')
  7  order by name
  8  /
Enter value for param: _no_recovery_through_resetlogs
old   6:    and upper(a.ksppinm) LIKE upper('%&param%')
new   6:    and upper(a.ksppinm) LIKE upper('%_no_recovery_through_resetlogs%')
NAME                             VALUE                    DESCRIPTION
-------------------------------- ------------------------ --------------------------------------------
_no_recovery_through_resetlogs   FALSE                    no recovery through this resetlogs operation

大家知道在10gr2版本及其以后版本,大家知道默认情况下,可以实现跨resetlogs恢复数据库.通过该参数的描述可以看出,该参数的用途是使得resetlogs之后不能继续进行恢复(我的理解是以前的备份不能应用resetlogs后的归档日志)
在实际中该函数的作用是否和该参数的描述相符,我们通过试验验证

rman备份数据库

[oracle@xifenfei tmp]$ rman target /
Recovery Manager: Release 10.2.0.4.0 - Production on Thu Aug 30 11:51:49 2012
Copyright (c) 1982, 2007, Oracle.  All rights reserved.
connected to target database: XFF (DBID=3440302261)
RMAN> backup database format '/u01/oracle/oradata/tmp/10g_db_%U';
Starting backup at 30-AUG-12
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=143 devtype=DISK
channel ORA_DISK_1: starting full datafile backupset
channel ORA_DISK_1: specifying datafile(s) in backupset
input datafile fno=00001 name=/u01/oracle/oradata/XFF/system01.dbf
input datafile fno=00003 name=/u01/oracle/oradata/XFF/sysaux01.dbf
input datafile fno=00002 name=/u01/oracle/oradata/XFF/undotbs01.dbf
input datafile fno=00004 name=/u01/oracle/oradata/XFF/users01.dbf
channel ORA_DISK_1: starting piece 1 at 30-AUG-12
channel ORA_DISK_1: finished piece 1 at 30-AUG-12
piece handle=/u01/oracle/oradata/tmp/10g_db_0anjuhve_1_1 tag=TAG20120830T115214 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:01:35
channel ORA_DISK_1: starting full datafile backupset
channel ORA_DISK_1: specifying datafile(s) in backupset
including current control file in backupset
including current SPFILE in backupset
channel ORA_DISK_1: starting piece 1 at 30-AUG-12
channel ORA_DISK_1: finished piece 1 at 30-AUG-12
piece handle=/u01/oracle/oradata/tmp/10g_db_0bnjui2d_1_1 tag=TAG20120830T115214 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:03
Finished backup at 30-AUG-12

resetlogs打开数据库

SQL> shutdown immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup mount;
ORACLE instance started.
Total System Global Area  318767104 bytes
Fixed Size                  1267236 bytes
Variable Size             100665820 bytes
Database Buffers          209715200 bytes
Redo Buffers                7118848 bytes
Database mounted.
SQL> recover database until cancel;
Media recovery complete.
SQL> alter database open resetlogs;
Database altered.

创建测试表

SQL> create table t_xifenfei01
  2  as
  3  select * from dba_tables;
Table created.
SQL> create table t_xifenfei02
  2  as
  3  select * from dba_objects;
Table created.
SQL>  alter system switch logfile;
System altered.
SQL> /
System altered.
SQL> create table t_xifenfei03
  2  as
  3  select * from dba_objects;
Table created.
SQL> alter system switch logfile;
System altered.
SQL> /
System altered.

恢复数据库

SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.
[oracle@xifenfei XFF]$ rm *.dbf
[oracle@xifenfei XFF]$ rman target /
Recovery Manager: Release 10.2.0.4.0 - Production on Thu Aug 30 12:00:47 2012
Copyright (c) 1982, 2007, Oracle.  All rights reserved.
connected to target database (not started)
RMAN> startup mount
Oracle instance started
database mounted
Total System Global Area     318767104 bytes
Fixed Size                     1267236 bytes
Variable Size                100665820 bytes
Database Buffers             209715200 bytes
Redo Buffers                   7118848 bytes
RMAN> restore database;
Starting restore at 30-AUG-12
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=157 devtype=DISK
channel ORA_DISK_1: starting datafile backupset restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
restoring datafile 00001 to /u01/oracle/oradata/XFF/system01.dbf
restoring datafile 00002 to /u01/oracle/oradata/XFF/undotbs01.dbf
restoring datafile 00003 to /u01/oracle/oradata/XFF/sysaux01.dbf
restoring datafile 00004 to /u01/oracle/oradata/XFF/users01.dbf
channel ORA_DISK_1: reading from backup piece /u01/oracle/oradata/tmp/10g_db_0anjuhve_1_1
channel ORA_DISK_1: restored backup piece 1
piece handle=/u01/oracle/oradata/tmp/10g_db_0anjuhve_1_1 tag=TAG20120830T115214
channel ORA_DISK_1: restore complete, elapsed time: 00:00:55
Finished restore at 30-AUG-12
RMAN> recover database;
Starting recover at 30-AUG-12
using channel ORA_DISK_1
starting media recovery
archive log thread 1 sequence 7 is already on disk as file /u01/oracle/oradata/XFF/archivelog/1_7_790743352.dbf
archive log thread 1 sequence 8 is already on disk as file /u01/oracle/oradata/XFF/archivelog/1_8_790743352.dbf
archive log thread 1 sequence 9 is already on disk as file /u01/oracle/oradata/XFF/archivelog/1_9_790743352.dbf
archive log thread 1 sequence 10 is already on disk as file /u01/oracle/oradata/XFF/archivelog/1_10_790743352.dbf
archive log thread 1 sequence 1 is already on disk as file /u01/oracle/oradata/XFF/archivelog/1_1_792676624.dbf
archive log thread 1 sequence 2 is already on disk as file /u01/oracle/oradata/XFF/archivelog/1_2_792676624.dbf
archive log thread 1 sequence 3 is already on disk as file /u01/oracle/oradata/XFF/archivelog/1_3_792676624.dbf
archive log thread 1 sequence 4 is already on disk as file /u01/oracle/oradata/XFF/archivelog/1_4_792676624.dbf
archive log filename=/u01/oracle/oradata/XFF/archivelog/1_7_790743352.dbf thread=1 sequence=7
archive log filename=/u01/oracle/oradata/XFF/archivelog/1_8_790743352.dbf thread=1 sequence=8
archive log filename=/u01/oracle/oradata/XFF/archivelog/1_9_790743352.dbf thread=1 sequence=9
archive log filename=/u01/oracle/oradata/XFF/archivelog/1_10_790743352.dbf thread=1 sequence=10
archive log filename=/u01/oracle/oradata/XFF/archivelog/1_1_792676624.dbf thread=1 sequence=1
archive log filename=/u01/oracle/oradata/XFF/archivelog/1_2_792676624.dbf thread=1 sequence=2
media recovery complete, elapsed time: 00:00:06
Finished recover at 30-AUG-12
SQL> alter database open;
Database altered.
SQL> select table_name from user_tables where table_name like 'T_XIFENFEI0_';
TABLE_NAME
------------------------------
T_XIFENFEI01
T_XIFENFEI02
T_XIFENFEI03

证明10gr2确实可以跨resetlogs recover 日志恢复数据库

测试_no_recovery_through_resetlogs参数

SQL> create table t_xifenfei04 as
  2  select * from dba_objects;
Table created.
SQL> alter system set "_no_recovery_through_resetlogs"=true scope=spfile;
System altered.
SQL> shutdown immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup mount;
ORACLE instance started.
Total System Global Area  318767104 bytes
Fixed Size                  1267236 bytes
Variable Size             100665820 bytes
Database Buffers          209715200 bytes
Redo Buffers                7118848 bytes
Database mounted.
SQL> recover database until cancel;
Media recovery complete.
SQL> alter database open resetlogs;
Database altered.
SQL> create table t_xifenfei05
  2  as
  3  select * from dba_objects;
Table created.
SQL> create table t_xifenfei06
  2  as
  3  select * from dba_objects;
Table created.
SQL> alter system switch logfile;
System altered.
SQL> /
System altered.
SQL> /
System altered.
SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.
[oracle@xifenfei XFF]$ rm *.dbf
[oracle@xifenfei XFF]$ rman target /
Recovery Manager: Release 10.2.0.4.0 - Production on Thu Aug 30 12:47:40 2012
Copyright (c) 1982, 2007, Oracle.  All rights reserved.
connected to target database (not started)
RMAN> startup mount;
Oracle instance started
database mounted
Total System Global Area     318767104 bytes
Fixed Size                     1267236 bytes
Variable Size                100665820 bytes
Database Buffers             209715200 bytes
Redo Buffers                   7118848 bytes
RMAN> restore database;
Starting restore at 30-AUG-12
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=157 devtype=DISK
channel ORA_DISK_1: starting datafile backupset restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
restoring datafile 00001 to /u01/oracle/oradata/XFF/system01.dbf
restoring datafile 00002 to /u01/oracle/oradata/XFF/undotbs01.dbf
restoring datafile 00003 to /u01/oracle/oradata/XFF/sysaux01.dbf
restoring datafile 00004 to /u01/oracle/oradata/XFF/users01.dbf
channel ORA_DISK_1: reading from backup piece /u01/oracle/oradata/tmp/10g_db_0anjuhve_1_1
channel ORA_DISK_1: restored backup piece 1
piece handle=/u01/oracle/oradata/tmp/10g_db_0anjuhve_1_1 tag=TAG20120830T115214
channel ORA_DISK_1: restore complete, elapsed time: 00:00:45
Finished restore at 30-AUG-12
RMAN> recover database;
Starting recover at 30-AUG-12
using channel ORA_DISK_1
starting media recovery
archive log thread 1 sequence 7 is already on disk as file /u01/oracle/oradata/XFF/archivelog/1_7_790743352.dbf
archive log thread 1 sequence 8 is already on disk as file /u01/oracle/oradata/XFF/archivelog/1_8_790743352.dbf
archive log thread 1 sequence 9 is already on disk as file /u01/oracle/oradata/XFF/archivelog/1_9_790743352.dbf
archive log thread 1 sequence 10 is already on disk as file /u01/oracle/oradata/XFF/archivelog/1_10_790743352.dbf
archive log thread 1 sequence 1 is already on disk as file /u01/oracle/oradata/XFF/archivelog/1_1_792676624.dbf
archive log thread 1 sequence 2 is already on disk as file /u01/oracle/oradata/XFF/archivelog/1_2_792676624.dbf
archive log thread 1 sequence 3 is already on disk as file /u01/oracle/oradata/XFF/archivelog/1_3_792676624.dbf
archive log thread 1 sequence 4 is already on disk as file /u01/oracle/oradata/XFF/archivelog/1_4_792676624.dbf
archive log thread 1 sequence 5 is already on disk as file /u01/oracle/oradata/XFF/archivelog/1_5_792676624.dbf
archive log thread 1 sequence 6 is already on disk as file /u01/oracle/oradata/XFF/archivelog/1_6_792676624.dbf
archive log thread 1 sequence 1 is already on disk as file /u01/oracle/oradata/XFF/archivelog/1_1_792679299.dbf
archive log thread 1 sequence 2 is already on disk as file /u01/oracle/oradata/XFF/archivelog/1_2_792679299.dbf
archive log thread 1 sequence 3 is already on disk as file /u01/oracle/oradata/XFF/archivelog/1_3_792679299.dbf
archive log filename=/u01/oracle/oradata/XFF/archivelog/1_7_790743352.dbf thread=1 sequence=7
archive log filename=/u01/oracle/oradata/XFF/archivelog/1_8_790743352.dbf thread=1 sequence=8
archive log filename=/u01/oracle/oradata/XFF/archivelog/1_9_790743352.dbf thread=1 sequence=9
archive log filename=/u01/oracle/oradata/XFF/archivelog/1_10_790743352.dbf thread=1 sequence=10
--第一次 resetlogs后的归档
archive log filename=/u01/oracle/oradata/XFF/archivelog/1_1_792676624.dbf thread=1 sequence=1
archive log filename=/u01/oracle/oradata/XFF/archivelog/1_2_792676624.dbf thread=1 sequence=2
archive log filename=/u01/oracle/oradata/XFF/archivelog/1_3_792676624.dbf thread=1 sequence=3
archive log filename=/u01/oracle/oradata/XFF/archivelog/1_4_792676624.dbf thread=1 sequence=4
archive log filename=/u01/oracle/oradata/XFF/archivelog/1_5_792676624.dbf thread=1 sequence=5
archive log filename=/u01/oracle/oradata/XFF/archivelog/1_6_792676624.dbf thread=1 sequence=6
--第二次 resetlogs后的归档(设置了_no_recovery_through_resetlogs参数为true并resetlogs后的归档日志
archive log filename=/u01/oracle/oradata/XFF/archivelog/1_1_792679299.dbf thread=1 sequence=1
media recovery complete, elapsed time: 00:00:13
Finished recover at 30-AUG-12
RMAN> alter database open;
database opened
RMAN> exit
Recovery Manager complete.
[oracle@xifenfei XFF]$ sqlplus / as sysdba
SQL*Plus: Release 10.2.0.4.0 - Production on Thu Aug 30 12:49:46 2012
Copyright (c) 1982, 2007, Oracle.  All Rights Reserved.
Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
SQL> select table_name from user_tables where table_name like 'T_XIFENFEI0_';
TABLE_NAME
------------------------------
T_XIFENFEI01
T_XIFENFEI02
T_XIFENFEI03
T_XIFENFEI04
T_XIFENFEI05
T_XIFENFEI06
6 rows selected.

通过这里的测试证明使用_no_recovery_through_resetlogs=true后,resetlogs之后还是可以正常可以recover相关日志,证明_no_recovery_through_resetlogs参数不是限制这里的resetlogs后的归档日志应用,至于该参数的具体用途也希望知道的朋友告知下。但是这个参数在clone db和从rac db恢复到单实例db的时候,可能因为bug原因需要设置该参数为true,如:

--rac恢复到单实例RMAN Duplicate from RAC backup fails ORA-38856 [ID 334899.1]
sql>alter open database resetlogs;
ORA-38856: cannot mark instance UNNAMED_INSTANCE_2 (redo thread 2) as enabled

控制文件异常导致ORA-00600[kccsbck_first]

今天接到一个朋友求救他们的his系统数据库不能访问,情况比较紧急,让我帮忙处理.登录数据库得到信息如下:

操作系统:windows 2003
数据库:8.1.7
容灾方案：双机+emc存储镜像
备份:数据库无任何备份
启动到mount报错类此:
ORA-00600: internal error code, arguments: [kccsbck_first], [1], [4141358753], [], [], [], [], []

这个问题在上周的数据库恢复中遇到过一次,他们也是因为双机的案例,当时的情况见:双机mount数据库出现ORA-00600[kccsbck_first],有了上次的思维,我开始也怀疑是客户的双机的问题,但是客户说双机在半年前就关闭了,没有启动过;因为我对win的双机不太熟悉,怕他们双机软自动启动系统然后接管oracle从而导致这个问题,然后让客户检查另一台机器，确定没有启动和接管oracle 服务.
然后查询MOS发现win上面的特殊之处:是控制文件corruption导致故障(不过dbv检查不出来),而且三个控制文件有同样的问题
MOS记录如下(不过8.1.7也存在同样问题) [ID 291684.1]

Applies to:
Oracle Server - Enterprise Edition - Version: 9.0.1.5 and later   [Release: 9.0.1 and later ]
Information in this document applies to any platform.
***Checked for relevance on 09-APR-2012***
Symptoms
Alter database mount exclusive results in
ORA-00600: internal error code, arguments: [kccsbck_first], [1], [2141358753], [], [], [], [], []
The description of the error is:
'We receive this error because we are attempting to be the first
thread/instance to mount the database and cannot because it appears that
at least one other thread has mounted the database already'.
However in this case the database was a standalone database on Windows.
It had only one oracle service running.
The operating system was rebooted, the oracle service was deleted and a new service created.
Even then the error persisted.
Cause
There was some corruption present in the controlfile.
Solution
In this case the problem was resolved by:
+ Taking a backup of the old control file
+ Recreating the control file using the following document
How to Recreate a Controlfile	 [Document 735106.1]

因为数据库不能mount,所以不能使用backup controlfile to trace;
因为是win系统,没有任何的控制文件备份,只能把控制文件拷贝到linux下面通过strings命令,自己编辑创建控制文件脚本(noresetlogs).执行脚本创建控制文件,recover database,应用redo文件恢复,然后resetlogs库,恢复成功(注意:8i中不需要另外增加临时文件)

bbed 文章汇总

bbed基本操作篇
bbed破坏数据文件
 bbed修改数据内容
 bbed恢复删除数据
 bbed修改ASM中数据
 bbed 找回被删除数据
 bbed 删除普通表记录
 通过bbed查看数据库结构
 bbed 删除 cluster table 记录
 bbed修改undo$(回滚段)状态
 利用bbed找回ORACLE更新前值

bbed晋级提升篇
bbed解决ORA-01190
bbed 修改datafile header
重现ORA-600[4000]异常
 bbed 使用实现 drop index 操作
 使用bbed修复损坏datafile header
使用bbed解决ORA-00600[2662]
bbed 恢复 GLOBAL_NAME 为空故障
 通过bbed解决ORA-00600[4000]案例
 ORA-00600 [ktbdchk1: bad dscn] 解决
 使用bbed让rac中的sysaux数据文件online
通过bbed修改回滚段状态解决ORA-00704故障
 通过bbed模拟ORA-00607/ORA-00600[4194]故障
 使用bbed解决ORA-00607/ORA-00600[4194]故障

bbed 删除普通表记录

有朋友和我说我的bbed系列缺少一篇普通表使用bbed删除记录的文章,月底现场保证回来没睡意,完善这篇文章,也算是对bbed系列的一个终结.
创建模拟记录

SQL> create table t_xifenfei tablespace users
  2  as
  3  select * from dba_tables where rownum<10;
Table created.
SQL> alter system checkpoint;
System altered.
SQL> select   table_name,owner,rowid,
  2  dbms_rowid.rowid_relative_fno(rowid)rel_fno,
  3  dbms_rowid.rowid_block_number(rowid)blockno,
  4  dbms_rowid.rowid_row_number(rowid) rowno
  5  from t_xifenfei;
TABLE_NAME                     OWNER                          ROWID                 REL_FNO    BLOCKNO      ROWNO
------------------------------ ------------------------------ ------------------ ---------- ---------- ----------
CON$                           SYS                            AAAM9UAAEAAACA0AAA          4       8244          0
UNDO$                          SYS                            AAAM9UAAEAAACA0AAB          4       8244          1
CDEF$                          SYS                            AAAM9UAAEAAACA0AAC          4       8244          2
CCOL$                          SYS                            AAAM9UAAEAAACA0AAD          4       8244          3
PROXY_ROLE_DATA$               SYS                            AAAM9UAAEAAACA0AAE          4       8244          4
FILE$                          SYS                            AAAM9UAAEAAACA0AAF          4       8244          5
FET$                           SYS                            AAAM9UAAEAAACA0AAG          4       8244          6
TS$                            SYS                            AAAM9UAAEAAACA0AAH          4       8244          7
PROXY_DATA$                    SYS                            AAAM9UAAEAAACA0AAI          4       8244          8
9 rows selected.
SQL> select dump('FILE$',16) from dual;
DUMP('FILE$',16)
----------------------------
Typ=96 Len=5: 46,49,4c,45,24
SQL> select dump('SYS',16) FROM DUAL;
DUMP('SYS',16)
----------------------
Typ=96 Len=3: 53,59,53
SQL> SHUTDOWN IMMEDIATE
Database closed.
Database dismounted.
ORACLE instance shut down.

这里创建一张测试表,有9条记录,计划使用bbed删除file$的记录

bbed删除表记录

[oracle@xifenfei ~]$ bbed listfile=bbedfile
Password:
BBED: Release 2.0.0.0.0 - Limited Production on Sat Sep 1 10:28:57 2012
Copyright (c) 1982, 2005, Oracle.  All rights reserved.
************* !!! For Oracle Internal Use only !!! ***************
BBED> info
 File#  Name                                                        Size(blks)
 -----  ----                                                        ----------
     1  /u01/oradata/orcl/system01.dbf                                       0
     3  /u01/oradata/orcl/sysaux01.dbf                                       0
     4  /u01/oradata/orcl/users01.dbf                                        0
     5  /u01/oradata/orcl/GS_ORADB_001.dbf                                   0
     6  /u01/oradata/orcl/GS_ORADB_IDX_001.dbf                               0
     7  /u01/oradata/orcl/undo01.dbf                                         0
BBED> set file 4 block 8244
        FILE#           4
        BLOCK#          8244
BBED> map
 File: /u01/oradata/orcl/users01.dbf (4)
 Block: 8244                                  Dba:0x01002034
------------------------------------------------------------
 KTB Data Block (Table/Cluster)
 struct kcbh, 20 bytes                      @0
 struct ktbbh, 96 bytes                     @20
 struct kdbh, 14 bytes                      @124
 struct kdbt[1], 4 bytes                    @138
 sb2 kdbr[9]                                @142
 ub1 freespace[6137]                        @160
 ub1 rowdata[1891]                          @6297
 ub4 tailchk                                @8188
BBED> set count 32
        COUNT           32
--查找对应值,估算起位置
BBED>  find /x 494c4524
 File: /u01/oradata/orcl/users01.dbf (4)
 Block: 8244             Offsets: 6929 to 6960           Dba:0x01002034
------------------------------------------------------------------------
 494c4524 06535953 54454dff ff055641 4c494402 c10b02c1 2902c102 03c20338
 <32 bytes per line>
BBED> p *kdbr[7]
rowdata[209]
------------
ub1 rowdata[209]                            @6506     0x2c
--6506肯定不是在这个位置
BBED> p *kdbr[5]
rowdata[623]
------------
ub1 rowdata[623]                            @6920     0x2c
--6920包含了6929,可以确定在该位置
--查看对应值
BBED>  x /rccc
rowdata[623]                                @6920
------------
flag@6920: 0x2c (KDRHFL, KDRHFF, KDRHFH)
lock@6921: 0x00
cols@6922:   49
col    0[3] @6923: SYS
col    1[5] @6927: FILE$
col    2[6] @6933: SYSTEM
col    3[0] @6940: *NULL*
col    4[0] @6941: *NULL*
col    5[5] @6942: VALID
col    6[2] @6948: ..
col    7[2] @6951: .)
col    8[2] @6954: ..
col    9[3] @6957: ..8
col   10[4] @6961: ..8%
col   11[0] @6966: *NULL*
col   12[2] @6967: ..
col   13[6] @6970: ..01%.
col   14[0] @6977: *NULL*
col   15[2] @6978: ..
col   16[2] @6981: ..
col   17[3] @6984: YES
col   18[1] @6988: N
col   19[2] @6990: ..
col   20[2] @6993: ..
col   21[1] @6996: .
col   22[1] @6998: .
col   23[1] @7000: .
col   24[2] @7002: .(
col   25[1] @7005: .
col   26[1] @7007: .
col  27[10] @7009:          1
col  28[10] @7020:          1
col   29[5] @7031:     N
col   30[7] @7037: ENABLED
col   31[2] @7045: ..
col   32[7] @7048: xp....!
col   33[2] @7056: NO
col   34[0] @7059: *NULL*
col   35[1] @7060: N
col   36[1] @7062: N
col   37[2] @7064: NO
col   38[7] @7067: DEFAULT
col   39[8] @7075: DISABLED
col   40[3] @7084: YES
col   41[2] @7088: NO
col   42[0] @7091: *NULL*
col   43[8] @7092: DISABLED
col   44[3] @7101: YES
col   45[0] @7105: *NULL*
col   46[8] @7106: DISABLED
col   47[8] @7115: DISABLED
col   48[2] @7124: NO
BBED> d
 File: /u01/oradata/orcl/users01.dbf (4)
 Block: 8244             Offsets: 6920 to 6951           Dba:0x01002034
------------------------------------------------------------------------
 2c003103 53595305 46494c45 24065359 5354454d ffff0556 414c4944 02c10b02
 <32 bytes per line>
BBED> set mode edit
        MODE            Edit
--修改为delete状态
BBED> m /x 3c
 File: /u01/oradata/orcl/users01.dbf (4)
 Block: 8244             Offsets: 6920 to 6951           Dba:0x01002034
------------------------------------------------------------------------
 3c003103 53595305 46494c45 24065359 5354454d ffff0556 414c4944 02c10b02
 <32 bytes per line>
BBED> sum apply
Check value for File 4, Block 8244:
current = 0xa274, required = 0xa274
--验证不通过,因为空闲空间不正确(删除了数据还是以前的值当然不正确)
BBED> verify
DBVERIFY - Verification starting
FILE = /u01/oradata/orcl/users01.dbf
BLOCK = 8244
Block Checking: DBA = 16785460, Block Type = KTB-managed data block
data header at 0x7f0a75d0327c
kdbchk: the amount of space used is not equal to block size
        used=1722 fsc=0 avsp=6137 dtl=8064
Block 8244 failed with check code 6110
DBVERIFY - Verification complete
Total Blocks Examined         : 1
Total Blocks Processed (Data) : 1
Total Blocks Failing   (Data) : 1
Total Blocks Processed (Index): 0
Total Blocks Failing   (Index): 0
Total Blocks Empty            : 0
Total Blocks Marked Corrupt   : 0
Total Blocks Influx           : 0
BBED> p kdbh
struct kdbh, 14 bytes                       @124
   ub1 kdbhflag                             @124      0x00 (NONE)
   b1 kdbhntab                              @125      1
   b2 kdbhnrow                              @126      9
   sb2 kdbhfrre                             @128     -1
   sb2 kdbhfsbo                             @130      36
   sb2 kdbhfseo                             @132      6173
   b2 kdbhavsp                              @134      6137
   b2 kdbhtosp                              @136      6137
BBED> m /x c618 offset 134
 File: /u01/oradata/orcl/users01.dbf (4)
 Block: 8244             Offsets:  134 to  165           Dba:0x01002034
------------------------------------------------------------------------
 c618f917 00000900 b01ee11d 0a1d311c 5b1b8c1a be19ee18 1d180000 00000000
 <32 bytes per line>
BBED> m /x c618 offset 136
 File: /u01/oradata/orcl/users01.dbf (4)
 Block: 8244             Offsets:  136 to  167           Dba:0x01002034
------------------------------------------------------------------------
 c6180000 0900b01e e11d0a1d 311c5b1b 8c1abe19 ee181d18 00000000 00000000
 <32 bytes per line>
BBED> sum apply
Check value for File 4, Block 8244:
current = 0xa274, required = 0xa274
BBED> verify
DBVERIFY - Verification starting
FILE = /u01/oradata/orcl/users01.dbf
BLOCK = 8244
Block Checking: DBA = 16785460, Block Type = KTB-managed data block
data header at 0x13ef07c
kdbchk: space available on commit is incorrect
        tosp=6342 fsc=0 stb=2 avsp=6342
Block 8244 failed with check code 6111
DBVERIFY - Verification complete
Total Blocks Examined         : 1
Total Blocks Processed (Data) : 1
Total Blocks Failing   (Data) : 1
Total Blocks Processed (Index): 0
Total Blocks Failing   (Index): 0
Total Blocks Empty            : 0
Total Blocks Marked Corrupt   : 0
Total Blocks Influx           : 0
BBED> m /x c8 offset 136
 File: /u01/oradata/orcl/users01.dbf (4)
 Block: 8244             Offsets:  136 to  167           Dba:0x01002034
------------------------------------------------------------------------
 c8180000 0900b01e e11d0a1d 311c5b1b 8c1abe19 ee181d18 00000000 00000000
 <32 bytes per line>
BBED> p kdbh
struct kdbh, 14 bytes                       @124
   ub1 kdbhflag                             @124      0x00 (NONE)
   b1 kdbhntab                              @125      1
   b2 kdbhnrow                              @126      9
   sb2 kdbhfrre                             @128     -1
   sb2 kdbhfsbo                             @130      36
   sb2 kdbhfseo                             @132      6173
   b2 kdbhavsp                              @134      6342
   b2 kdbhtosp                              @136      6344
BBED> sum apply
Check value for File 4, Block 8244:
current = 0xa27a, required = 0xa27a
BBED> verify
DBVERIFY - Verification starting
FILE = /u01/oradata/orcl/users01.dbf
BLOCK = 8244
DBVERIFY - Verification complete
Total Blocks Examined         : 1
Total Blocks Processed (Data) : 1
Total Blocks Failing   (Data) : 0
Total Blocks Processed (Index): 0
Total Blocks Failing   (Index): 0
Total Blocks Empty            : 0
Total Blocks Marked Corrupt   : 0
Total Blocks Influx           : 0

启动数据库测试

SQL> startup
ORACLE instance started.
Total System Global Area  167772160 bytes
Fixed Size                  2019320 bytes
Variable Size              75497480 bytes
Database Buffers           88080384 bytes
Redo Buffers                2174976 bytes
Database mounted.
Database opened.
SQL> set lines 150
SQL> select   table_name,owner,rowid,
  2   dbms_rowid.rowid_relative_fno(rowid)rel_fno,
  3   dbms_rowid.rowid_block_number(rowid)blockno,
  4   dbms_rowid.rowid_row_number(rowid) rowno
  5   from t_xifenfei;
TABLE_NAME                     OWNER                          ROWID                 REL_FNO    BLOCKNO      ROWNO
------------------------------ ------------------------------ ------------------ ---------- ---------- ----------
CON$                           SYS                            AAAM9UAAEAAACA0AAA          4       8244          0
UNDO$                          SYS                            AAAM9UAAEAAACA0AAB          4       8244          1
CDEF$                          SYS                            AAAM9UAAEAAACA0AAC          4       8244          2
CCOL$                          SYS                            AAAM9UAAEAAACA0AAD          4       8244          3
PROXY_ROLE_DATA$               SYS                            AAAM9UAAEAAACA0AAE          4       8244          4
FET$                           SYS                            AAAM9UAAEAAACA0AAG          4       8244          6
TS$                            SYS                            AAAM9UAAEAAACA0AAH          4       8244          7
PROXY_DATA$                    SYS                            AAAM9UAAEAAACA0AAI          4       8244          8
8 rows selected.

可以看到file$这条记录已经被删除,证明bbed操作普通表删除成功
相关文章:
1.bbed 删除 cluster table 记录
2.bbed 找回被删除数据
3.利用bbed找回ORACLE更新前值

ORA-607/ORA-600[4194]不一定是重大灾难

以前解决过ORA-607/ORA-600[4194]和模拟过ORA-607/ORA-600[4194]错误,所以固定思维任务ORA-607/ORA-600[4194]可能就是重大灾难,通过这个案例来说明ORA-607/ORA-600[4194]可能也就是一个常规的不能再常规的错误:有一网友数据库因意外关闭电源导致启动过程出现ORA-00607/ORA-00600[4194]/ORA-00600[4097]的错误,使得数据库启动失败.

SMON: enabling tx recovery
Fri Aug 31 23:14:08 2012
Database Characterset is ZHS16GBK
replication_dependency_tracking turned off (no async multimaster replication found)
Starting background process QMNC
QMNC started with pid=19, OS id=15619
Fri Aug 31 23:14:10 2012
Errors in file /u01/oradata/orcl/bdump/orcl_smon_15589.trc:
ORA-00600: internal error code, arguments: [4097], [], [], [], [], [], [], []
Fri Aug 31 23:14:12 2012
Non-fatal internal error happenned while SMON was doing logging scn->time mapping.
SMON encountered 1 out of maximum 100 non-fatal internal errors.
Fri Aug 31 23:14:12 2012
Completed: alter database open
Fri Aug 31 23:14:14 2012
Errors in file /u01/oradata/orcl/bdump/orcl_smon_15589.trc:
ORA-01595: error freeing extent (2) of rollback segment (4))
ORA-00607: Internal error occurred while making a change to a data block
ORA-00600: internal error code, arguments: [4194], [48], [34], [], [], [], [], []
Fri Aug 31 23:29:41 2012
Errors in file /u01/oradata/orcl/bdump/orcl_smon_15589.trc:
ORA-00607: Internal error occurred while making a change to a data block
ORA-00600: internal error code, arguments: [4194], [17], [10], [], [], [], [], []
Fri Aug 31 23:29:43 2012
Errors in file /u01/oradata/orcl/bdump/orcl_smon_15589.trc:
ORA-00600: internal error code, arguments: [4194], [48], [34], [], [], [], [], []
Fri Aug 31 23:29:44 2012
Errors in file /u01/oradata/orcl/bdump/orcl_pmon_15577.trc:
ORA-00474: SMON process terminated with error
Fri Aug 31 23:29:44 2012
PMON: terminating instance due to error 474
Instance terminated by PMON, pid = 15577

通过alert日志可以定位到SMON_SCN_TIME表或者其回滚操作可能异常,结合alert和trace分析,发现这次错误的操作主要sql语句为:

ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [4194], [48], [34], [], [], [], [], []
Current SQL statement for this session:
UPDATE SYS.COL_USAGE$
   SET EQUALITY_PREDS    = EQUALITY_PREDS +
                           DECODE(BITAND(:FLAG, 1), 0, 0, 1),
       EQUIJOIN_PREDS    = EQUIJOIN_PREDS +
                           DECODE(BITAND(:FLAG, 2), 0, 0, 1),
       NONEQUIJOIN_PREDS = NONEQUIJOIN_PREDS +
                           DECODE(BITAND(:FLAG, 4), 0, 0, 1),
       RANGE_PREDS       = RANGE_PREDS + DECODE(BITAND(:FLAG, 8), 0, 0, 1),
       LIKE_PREDS        = LIKE_PREDS + DECODE(BITAND(:FLAG, 16), 0, 0, 1),
       NULL_PREDS        = NULL_PREDS + DECODE(BITAND(:FLAG, 32), 0, 0, 1),
       TIMESTAMP         = :TIME
 WHERE OBJ# = :OBJN
   AND INTCOL# = :COLN
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [4194], [17], [10], [], [], [], [], []
Current SQL statement for this session:
UPDATE SYS.MON_MODS$
   SET INSERTS       = INSERTS + :INS,
       UPDATES       = UPDATES + :UPD,
       DELETES       = DELETES + :DEL,
       FLAGS        =
       (DECODE(BITAND(FLAGS, :FLAG), :FLAG, FLAGS, FLAGS + :FLAG)),
       DROP_SEGMENTS = DROP_SEGMENTS + :DROPSEG,
       TIMESTAMP     = :TIME
 WHERE OBJ# = :OBJN
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [4097], [], [], [], [], [], [], []
Current SQL statement for this session:
INSERT INTO SMON_SCN_TIME
  (THREAD,
   TIME_MP,
   TIME_DP,
   SCN,
   SCN_WRP,
   SCN_BAS,
   NUM_MAPPINGS,
   TIM_SCN_MAP)
VALUES
  (0, :1, :2, :3, :4, :5, :6, :7)

这里主要涉及到对oracle的三张表的操作
COL_USAGE$:主要是在收集统计信息的时候作为是否需要收集列直方图信息参考
MON_MODS$:Oracle主要利用该表来记录那些表的数据发生改变,方便收集统计信息
SMON_SCN_TIME:记录SCN和TIME的对应关系
通过这里的分析可以确定这三张表中的数据对于数据库来说不是致命的基表信息,在数据库运行过程中可以清理掉这些信息,最多就是因为数据库性能的下降或者SCN和TIME互转功能不完善.

解决思路
完整的undo异常处理顺序
1.从alert中可以看出来数据库是在open之后由于SMON回滚到上述几条sql异常导致数据库down,所以可以尝试使用system回滚段启动数据库,看看是否可以屏蔽相关问题
2.如果方法1不可行,那使用event屏蔽smon对回滚段的相关操作,使得数据库正常启动
3.如果由于存在特殊事务,event无法屏蔽,尝试使用隐含参数处理该问题
4.如果隐含参数尚无法解决给问题考虑使用bbed
5.如果bbed不能解决,那只能选择dul或者其类似工具处理
这个案例中我们明确的看到是因为上面的三条sql回滚异常出现问题导致,对于这样的问题,经过测试使用方法1和2都能够顺利解决问题(open库之后需要重建undo,删除有问题undo表空间,修改参数[可能包括event],切换undo表空间).因为遇到几次ORA-607/ORA-600[4194]是因为system rollback损坏导致,所以这次开始也认为是一次比较复杂的恢复,最后证明这次是一种非常常规的恢复.对于ORACLE的数据库恢复有经验可能会比较快的定位问题,但是如果按照固定的思路去想可能会让自己走进死胡同.

extundelete恢复Linux被删除文件

今天群中有个朋友的客户闲着没事rm掉了数据文件,然后讨论到使用extundelete进行了恢复,抢救了部分未被覆盖的数据文件。该软件官方地址:http://extundelete.sourceforge.net/
安装extundelete

[root@xifenfei tmp]# bunzip2 extundelete-0.2.0.tar.bz2
[root@xifenfei tmp]# tar xvf extundelete-0.2.0.tar
extundelete-0.2.0/
extundelete-0.2.0/README
extundelete-0.2.0/acinclude.m4
extundelete-0.2.0/configure.ac
extundelete-0.2.0/aclocal.m4
extundelete-0.2.0/Makefile.am
extundelete-0.2.0/Makefile.in
extundelete-0.2.0/config.h.in
extundelete-0.2.0/configure
extundelete-0.2.0/compile
extundelete-0.2.0/depcomp
extundelete-0.2.0/install-sh
extundelete-0.2.0/missing
extundelete-0.2.0/LICENSE
extundelete-0.2.0/autogen.sh
extundelete-0.2.0/src/
extundelete-0.2.0/src/Makefile.am
extundelete-0.2.0/src/Makefile.in
extundelete-0.2.0/src/extundelete.cc
extundelete-0.2.0/src/block.c
extundelete-0.2.0/src/insertionops.cc
extundelete-0.2.0/src/block.h
extundelete-0.2.0/src/extundelete.h
extundelete-0.2.0/src/extundelete-priv.h
extundelete-0.2.0/src/jfs_compat.h
extundelete-0.2.0/src/kernel-jbd.h
[root@xifenfei tmp]# cd extundelete-0.2.0
[root@xifenfei extundelete-0.2.0]#  ./configure
Configuring extundelete 0.2.0
Writing generated files to disk
[root@xifenfei extundelete-0.2.0]# make && make install
make -s all-recursive
Making all in src
Making install in src
  /usr/bin/install -c 'extundelete' '/usr/local/bin/extundelete'

恢复操作命令

--umount或者read only 分区(如果umount不成功,可以通过fuser -km /www类似方法找出来占用进程）
 umount /dev/partition
 mount -o remount,ro /dev/partition
--恢复文件
extundelete /dev/partition --restore-all
extundelete /dev/partition --restore-directory /backup/gnutool-delete
extundelete /dev/partition --restore-files /etc/passwd

测试恢复

--基于目录恢复
[root@xifenfei tmp]#extundelete /dev/sdb1  --restore-directory /u01/xifenfei
Loading filesystem metadata ... 160 groups loaded.
Loading journal descriptors ... 23 descriptors loaded.
Writing output to directory RECOVERED_FILES/
Failed to restore file /u01/xifenfei
Could not find correct inode number past inode 2.
[root@xifenfei tmp]# cd RECOVERED_FILES/
[root@xifenfei RECOVERED_FILES]# ll
total 0
--未成功
--基于磁盘恢复
[root@xifenfei RECOVERED_FILES]#extundelete /dev/sdb1  --restore-all
Loading filesystem metadata ... 160 groups loaded.
Loading journal descriptors ... 23 descriptors loaded.
Writing output to directory RECOVERED_FILES/
Searching for recoverable inodes in directory / ...
4 recoverable inodes found.
Looking through the directory structure for deleted files ...
Failed to restore inode 1966081 to file RECOVERED_FILES/xifenfei:Inode does not correspond to a regular file.
Restored inode 1966082 to file RECOVERED_FILES/xifenfei/xff.txt
Failed to restore inode 1966083 to file RECOVERED_FILES/xifenfei/xff:Inode does not correspond to a regular file.
Restored inode 1966084 to file RECOVERED_FILES/xifenfei/xff/xff.txt132
0 recoverable inodes still lost.
[root@xifenfei RECOVERED_FILES]# ll
total 4
drwxr-xr-x 3 root root 4096 Aug 31 21:36 RECOVERED_FILES
[root@xifenfei RECOVERED_FILES]# cd RECOVERED_FILES/
[root@xifenfei RECOVERED_FILES]# ll
total 4
drwxr-xr-x 3 root root 4096 Aug 31 21:36 xifenfei
[root@xifenfei RECOVERED_FILES]# cd xifenfei/
[root@xifenfei xifenfei]# ll
total 20
drwxr-xr-x 2 root root  4096 Aug 31 21:36 xff
-rw-r--r-- 1 root root 13231 Aug 31 21:36 xff.txt
[root@xifenfei xifenfei]# cd xff
[root@xifenfei xff]# ll
total 16
-rw-r--r-- 1 root root 13231 Aug 31 21:36 xff.txt132
--恢复成功
--基于文件恢复
[root@xifenfei xff]# extundelete /dev/sdb1  --restore-files /u01/xifenfei/xff.txt
Loading filesystem metadata ... 160 groups loaded.
Loading journal descriptors ... 23 descriptors loaded.
Writing output to directory RECOVERED_FILES/
[root@xifenfei xff]# cd RECOVERED_FILES/
[root@xifenfei RECOVERED_FILES]# ll
total 0
--未成功

通过上述实验证明extundelete还是有很大的不完整性,基于整个磁盘的恢复功能较为强大,基于目录和文件的恢复还不够强大.不过该软件在很多时候还是有救火的功能,特别是当蛋筒的人员删除了数据库的部分文件之时.