Oracle备份恢复 – 第27页

ORA-600 4193 错误说明和解决

ORA-600 4193 解释说明

ERROR:
  Format: ORA-600 [4193] [a] [b]
VERSIONS:
  versions 6.0 to 12.1
DESCRIPTION:
  A mismatch has been detected between Redo records and Rollback (Undo)
  records.
  We are validating the Undo block sequence number in the undo block against
  the Redo block sequence number relating to the change being applied.
  This error is reported when this validation fails.
ARGUMENTS:
  Arg [a] Undo record seq number
  Arg [b] Redo record seq number
FUNCTIONALITY:
  KERNEL TRANSACTION UNDO
ORA-600 [4193] [a] [b] [ ] [ ]  [ ]
Versions: 7.2.2  - 9.2.0                              Source: ktuc.c
===========================================================================
Meaning: seq# mismatch while adding an undo record to an undo block. This
         is done by the application of redo.
---------------------------------------------------------------------------
Argument Description:
    a. (ktubhseq): undo record seq# - this is the seq# of the block that
                                      this undo record WILL BE APPLIED TO.
                                      This is from the Undo Block. It is
                                      NOT the seq# of the undo block itself.
    b. (ktudbseq): redo RECORD seq# - this is the seq# number in the block
                                      that this redo WILL BE APPLIED TO.
                                      This is from the Redo Record.
---------------------------------------------------------------------------
Diagnosis:
    This error is raised in kturdb which handles the adding of undo records
    by the application of redo.
    When we try to apply redo to an undo block (forward changes are made by
    the application of redo to a block) we check that the seq# in the undo
    record matches the seq# in the redo record. These seq# should be the
    same because when we apply a redo record we must apply it to the
    correct version of the block. We can only apply a redo record to a
    block that contains the same seq# as in the redo record.
    If the seq# do not match then this error is raised. This implies some
    kind of block corruption in either the redo or the undo block.
7.3.x - 8.1.7.x
ASSERT2(ubh->ktubhseq == db->ktudbseq, OERI(4193), KSESVSGN,
            ubh->ktubhseq, db->ktudbseq);
9.2.x
ksesic2(OERI(4193), ksenrg(ubh->ktubhseq), ksenrg(db->ktudbseq));
struct ktubh
{
  kxid  ktubhxid;      /* txid of tx currently using or last used this block */
  ub2   ktubhseq;                              /* undo block sequence number */
  ub1   ktubhcnt;    /* high water mark record index, number of undo entries */
  ub1   ktubhirb;  /* rollback record index, rec index to start the rollback */
  ub1   ktubhicl;  /* collecting record index, rec index to start retrieving col info */
  ub1   ktubhflg;                                                 /* dummy */
  ub2   ktubhidx[1];     /* byte offset of record in block, grows at runtime */
};
struct ktudb   Kernel Transaction Undo Data operation Block (redo)
{
  ub2    ktudbsiz;                                          /* size of entry */
  ub2    ktudbspc;                 /* verification: space left in undo block */
  ub2    ktudbflg;            /* flag to indicate the kind of redo operation */
  kxid   ktudbxid;                                          /* current tx id */
  ub2    ktudbseq;                                  /* block sequence number */
  ub1    ktudbrec;                       /* new record index for this change */
};

ORA 600 4193 处理方法同How to resolve ORA-600 [4194] errors

How to resolve ORA-600 [4194] errors

联系：手机/微信(+86 17813235971) QQ(107644445)

标题：How to resolve ORA-600 [4194] errors

在oracle恢复中ORA-600 4194是一个非常常见的错误,该错误的主要原因是由于redo记录和undo(rollback)记录不匹配.
ORA 600 4194错误原因以及含义

ERROR:
  Format: ORA-600 [4194] [a] [b]
VERSIONS:
  versions 6.0 to 12.1
DESCRIPTION:
  A mismatch has been detected between Redo records and rollback (Undo)
  records.
  We are validating the Undo record number relating to the change being
  applied against the maximum undo record number recorded in the undo block.
  This error is reported when the validation fails.
ARGUMENTS:
  Arg [a] Maximum Undo record number in Undo block
  Arg [b] Undo record number from Redo block

ORA 600 4194 错误处理思路
第一步

Confirm whether the database is up and running or not.  If the database fails to start or crashes shortly
after startup due to this error occurring, then try setting event 10513 at level 2 in the init.ora/spfile
to disable transaction recovery and restart the instance, e.g.:
      event = "10513 trace name context forever, level 2"
This may allow the database to successfully open and stay up so that
the required diagnostics/actions can be performed.

第二步

In the trace file there should be an undo segment header dump, and so check
to see if the undo segment header shows an active transaction after recovery, e.g.:
TRN TBL    <---- Represents the Transaction table for the particular undo segment
index state cflags wrap# uel scn dba
---------------------------------------------------------------------------------------------
0x41 9 0x80 0x35ab6 0xffff 0x0695.38f6b959 0x1081e796
0x42 9 0x80 0x35bb1 0x000e 0x0695.38f6b028 0x1081e793
0x43 9 0x80 0x35b11 0x005d 0x0695.38f6b7ae 0x1081e795
0x44 9 0x80 0x359f0 0x0036 0x0695.38f69a91 0x1081e78e
0x45 10 0x80 0x35b1b 0x0000 0x0695.3a0aba4d 0x1081e796
0x46 9 0x80 0x35bb7 0x001c 0x0695.38f69bde 0x1081e78f
===================================
State ---> This column specifies the status of the transaction
                  9 -----> represents a commited transaction
                  10 ---> Represents a active transaction
Dba -----> Undo block containing the undo records
                  Strictly speaking this is the block at the end of the undo chain.
You can see from the transaction table that there is an active transaction
for this particular rollback/undo segment after recovery.
Therefore this rollback/undo segment and/or undo tablespace cannot be dropped without corrupting the database!
Therefore recreating the UNDO tablespace is not an option.

第三步

From the trace file determine the affected undo segment, e.g.:
Block image after block recovery:
UNDO BLK:
xid: 0x0015.02b.0001544b seq: 0x163e cnt: 0x12 irb: 0x12 icl: 0x0 flg: 0x0000
XID ==> Undo segment no + Slot no + Sequence no
Therefore, in this case the Undo Segment is:
USN# 0x15 (Hex) ==> 21 (Dec)  ==> _SYSSMU21$
So if and ONLY IF the transaction table shows no active transaction can the
 rollback/undo segment be offlined and dropped.Note however,
that before you can confirm if the entire UNDO tablespace can be dropped, you would need to check the
transaction tables of ALL active rollback/undo segments in the same wasy as the above.
The steps required to drop the rollback/undo segment are fully detailed in Note:179952.1,
but are briefly listed here for completeness:
If using Automatic Undo Management
Offline the undo segment using the _OFFLINE_ROLLBACK_SEGMENTS parameter and bounce the database as follows:
1.  Create  and edit the init.ora file for the instance to set the following parameters:
UNDO_MANAGEMENT=MANUAL
_OFFLINE_ROLLBACK_SEGMENTS=(_SYSSMU21$)
2.  Open the database in restricted mode to prevent user access, e.g.:
connect / as sysdba
startup restrict pfile = '<Full path to init.ora file>';
3.  Drop the rollback/undo segment, e.g.:
drop rollback segment "_SYSSMU21";
4.  Shutdown the instance, and remove the init.ora parameters added in point 1 and restart the instance, e.g.:
shutdown immediate
startup
If SMON was recovering the transaction then this may not work as we cannot open the database if it is determined
to be in an inconsistent state. I have reviewed a number of SRs where this approach was successful,
so it is important to try it first but understand that it may fail and you will have to resort to
a point in time recovery or forcing open the DB and recreating it.

第四步

Now we need to dump the undo block to see which object was affected.
We noted in Step 2 that this is the active transaction (from the trace file):
TRN TBL
index state cflags wrap# uel scn dba
0x45 10 0x80 0x35b1b 0x0000 0x0695.3a0aba4d 0x1081e796
Dba----------------> Undo block containing the undo records
dba--->0x1081e796 is the block containing the active transaction .
Use the WebIV tools to convert this RDBA to block number (block#) and file number (file#), e.g.:
V SPLIT ==> DBA (Hex) = File#,Block# (Hex File#,Block#)
= ===== === ===== ============
V8 10,10 ==> 276948886 (0x1081e796) = 66,124822 (0x42 0x1e796)
So the file# is 66 and the block# is 124822, so dump the block by issuing:
SQL> Alter system dump datafile 66 block 124822;
This will generate a trace file in the user_dump_dest.  The following is a sample of the information in the undo block:
UNDO BLK:
xid: 0x000c.045.00035b1b seq: 0x1e14 cnt: 0x17 irb: 0x17 icl: 0x0 flg: 0x0000
Rec Offset Rec Offset Rec Offset Rec Offset Rec Offset
---------------------------------------------------------------------------
0x01 0x1f8c 0x02 0x1f30 0x03 0x1ed4 0x04 0x1e78 0x05 0x1e1c
0x06 0x1dc0 0x07 0x1d64 0x08 0x1d08 0x09 0x1cac 0x0a 0x1c50
0x0b 0x1bf4 0x0c 0x1b98 0x0d 0x1b3c 0x0e 0x1ae0 0x0f 0x1a74
0x10 0x1a18 0x11 0x19bc 0x12 0x1960 0x13 0x1904 0x14 0x187c
0x15 0x181c 0x16 0x1798 0x17 0x173c
* Rec #0x16 slt: 0x45 objn: 1485619(0x0016ab33) objd: 1485619 tblspc: 71(0x00000047)
* Layer: 11 (Row) opc: 1 rci 0x00
Undo type: Regular undo Begin trans Last buffer split: No
Temp Object: No
Tablespace Undo: No
rdba: 0x00000000
*-----------------------------
uba: 0x1081e796.1e14.14 ctl max scn: 0x0695.38f69853 prv tx scn: 0x0695.38f698a1
KDO undo record:
KTB Redo
op: 0x04 ver: 0x01
op: L itl: scn: 0x0019.009.00034237 uba: 0x36c0cce4.1d2f.19
flg: C--- lkc: 0 scn: 0x0695.38f6b96b
KDO Op code: URP xtype: XA bdba: 0x35406893 hdba: 0x35406892
itli: 1 ispac: 0 maxfr: 4863
tabn: 0 slot: 0(0x0) flag: 0x2c lock: 0 ckix: 0
ncol: 1 nnew: 1 size: -1
col 0: [ 4] c3 0e 36 2e
*-----------------------------
* Rec #0x17 slt: 0x45 objn: 1485619(0x0016ab33) objd: 1485619 tblspc: 71(0x00000047)
* Layer: 11 (Row) opc: 1 rci 0x16
Undo type: Regular undo Last buffer split: No
Temp Object: No
Tablespace Undo: No
rdba: 0x00000000
*-----------------------------
From the trace file above:
UNDO BLK:
xid: 0x000c.045.00035b1b seq: 0x1e14 cnt: 0x17 irb: 0x17 icl: 0x0 flg: 0x0000
The undo segment with the active transaction is segment is 0x000c (Hex) which is 12 (Dec) as the XID is:
      Undo segment no + Slot no + Sequence no
This step is often skipped because it was performed earlier in step 3, however it is a good idea to do this
again now to make sure that the XID from the UNDO block matches the UNDO SEGMENT HEADER,
this way you have followed all the chain, from the UNDO SEGMENT to UNDO BLOCK, back and forth.
If there is a conflict here please check and make sure that the customer dumped the correct undo block.
Check for the value of irb which is an index which points you to the latest change done to the undo block.
This is the point from which a rollback would begin if one was issued.
From the trace file we see: 'irb: 0x17' so this points to record 0x17,
so search for this particular string i.e 0x17 and it will take you to undo record 'REC #0x17', e.g.:
* Rec #0x17 slt: 0x45 objn: 1485619(0x0016ab33) objd: 1485619 tblspc: 71(0x00000047)
* Layer: 11 (Row) opc: 1 rci 0x16
Undo type: Regular undo Last buffer split: No
Temp Object: No
Tablespace Undo: No
rdba: 0x00000000
*-----------------------------
Note the slot number (slt) is 0x45, the object number (objn) is the OBJECT_ID from dba_objects
and data object number (objd) is the DATA_OBJECT_ID from dba_objects.
These numbers may be the same but not necessarily, and so if the database is open then identify this object, e.g.:
        select object_name, owner, object_type, data_object_id from dba_objects where object_id = <objn>;
This is the object, which has an active transaction.  Note in the above trace file extract that rci
has a value of 0x16 which means that this record is at the end of an undo chain.
This means that the chain continues in another UNDO BLOCK.
Please refer to unpublished Note:281504.1 for information on Undo chains.
So the next record that needs to be rolled back is present in REC #X016.
If rci is 0x00 then it means that this is the first record present in the undo chain
and so you can check to see if there is rdba info, e.g.:
* Rec #0x16 slt: 0x45 objn: 1485619(0x0016ab33) objd: 1485619 tblspc: 71(0x00000047)
* Layer: 11 (Row) opc: 1 rci 0x00
Undo type: Regular undo Begin trans Last buffer split: No
Temp Object: No
Tablespace Undo: No
rdba: 0x00000000
*-----------------------------
uba: 0x1081e796.1e14.14 ctl max scn: 0x0695.38f69853 prv tx scn: 0x0695.38f698a1
KDO undo record:
KTB Redo
op: 0x04 ver: 0x01
op: L itl: scn: 0x0019.009.00034237 uba: 0x36c0cce4.1d2f.19
flg: C--- lkc: 0 scn: 0x0695.38f6b96b
KDO Op code: URP xtype: XA bdba: 0x35406893 hdba: 0x35406892
itli: 1 ispac: 0 maxfr: 4863
tabn: 0 slot: 0(0x0) flag: 0x2c lock: 0 ckix: 0
ncol: 1 nnew: 1 size: -1
col 0: [ 4] c3 0e 36 2e
*-----------------------------
If the object is an Index, drop and recreate it.  If it is a table,
then again the table would need to be dropped and recreated (or truncated)
so that its object number changes and hence the rollback/undo is no longer required.
If this isn't possible, then you have two options:
First take a backup of the database in its current state.
This is critical in case anything goes wrong and you lose the opportunity to salvage the data!
Option 1
 - Restore the undo segment datafile and the datafile containing the object and perform a full recovery.
   This can only be done if you have all the archived redo as you will need to do full recovery on these files.
OR
Option 2
If option 1 is not possible, you can use the unsupported method, e.g.:
Specify the undo segment in the _OFFLINE_ROLLBACK_SEGMENTS parameter and try to drop the rollback segment.
If there is an active transaction then this is not likely to work and you will probably need
to set the _CORRUPTED_ROLLBACK_SEGMENTS parameter as well

温馨提示：
1.隐含参数_OFFLINE_ROLLBACK_SEGMENTS/_CORRUPTED_ROLLBACK_SEGMENTS属于Oracle内部隐含参数,建议在Oracle support认可的情况下使用,因为使用之后可能导致数据库事务完整性彻底损坏
2.进行屏蔽事务之前,如果条件允许建议使用txchecker检查
2.使用上述方法恢复数据库之后，建议通过逻辑方式导出导入重建数据库

Oracle 12c undo异常处理—root pdb undo异常

联系：手机/微信(+86 17813235971) QQ(107644445)

标题：Oracle 12c undo异常处理—root pdb undo异常

在12c pdb环境中如果root pdb的undo文件异常，数据库该如何恢复呢？这篇文章模拟undo丢失的情况下进行恢复
模拟环境
三个会话，其中第一个会话对pdb1中的表进行操作，并且有事务未提交,第二个会话对pdb2进行操作，也未提交事务；第三个会话直接abort库，模拟突然库异常，然后删除root pdb下面的undo文件

--会话1
[oracle@ora1221 oradata]$ sqlplus  / as sysdba
SQL*Plus: Release 12.2.0.0.3 Production on Thu Jun 16 22:24:20 2016
Copyright (c) 1982, 2016, Oracle.  All rights reserved.
Connected to an idle instance.
SQL> startup
ORACLE instance started.
Total System Global Area 2516582400 bytes
Fixed Size                  8260048 bytes
Variable Size             671090224 bytes
Database Buffers         1828716544 bytes
Redo Buffers                8515584 bytes
Database mounted.
Database opened.
SQL>
SQL>
SQL>
SQL> show pdbs;
    CON_ID CON_NAME                       OPEN MODE  RESTRICTED
---------- ------------------------------ ---------- ----------
         2 PDB$SEED                       READ ONLY  NO
         3 PDB1                           MOUNTED
         4 PDB2                           MOUNTED
SQL> alter session set container=pdb1;
Session altered.
SQL> alter database open;
Database altered.
SQL>  create user chf identified by oracle;
User created.
SQL> grant dba to chf;
Grant succeeded.
SQL> create table chf.t_xifenfei_p1 as
  2  select * from dba_objects;
Table created.
SQL> insert into chf.t_xifenfei_p1
  2  select * from dba_objects;
72427 rows created.
SQL> select count(*) from chf.t_xifenfei_p1;
  COUNT(*)
----------
    144853
--会话2
[oracle@ora1221 ~]$ ss
SQL*Plus: Release 12.2.0.0.3 Production on Thu Jun 16 22:34:01 2016
Copyright (c) 1982, 2016, Oracle.  All rights reserved.
Connected to:
Oracle Database 12c Enterprise Edition Release 12.2.0.0.3 - 64bit Production
SQL> alter session set container=pdb2;
Session altered.
SQL> alter database open;
Database altered.
SQL>  create user chf identified by oracle;
User created.
SQL> grant dba to chf;
Grant succeeded.
SQL>  create table chf.t_xifenfei_p2
  2  as select * from dba_objects;
Table created.
SQL> delete from chf.t_xifenfei_p2;
72426 rows deleted.
SQL> select count(*) from chf.t_xifenfei_p2;
  COUNT(*)
----------
    0
--会话3
[oracle@ora1221 ~]$ ss
SQL*Plus: Release 12.2.0.0.3 Production on Thu Jun 16 22:36:16 2016
Copyright (c) 1982, 2016, Oracle.  All rights reserved.
Connected to:
Oracle Database 12c Enterprise Edition Release 12.2.0.0.3 - 64bit Production
SQL> shutdown abort
ORACLE instance shut down.
--删除cdb undo文件
[oracle@ora1221 orcl12c2]$ ls -ltr
total 2040912
drwxr-x---. 2 oracle oinstall      4096 Jun 16 18:26 pdbseed
drwxr-x---. 2 oracle oinstall      4096 Jun 16 18:27 pdb2
drwxr-x---. 2 oracle oinstall      4096 Jun 16 18:28 pdb1
-rw-r-----. 1 oracle oinstall 209715712 Jun 16 22:24 redo03.log
-rw-r-----. 1 oracle oinstall   5251072 Jun 16 22:24 users01.dbf
-rw-r-----. 1 oracle oinstall  34611200 Jun 16 22:25 temp01.dbf
-rw-r-----. 1 oracle oinstall 849354752 Jun 16 22:35 system01.dbf
-rw-r-----. 1 oracle oinstall  73408512 Jun 16 22:35 undotbs01.dbf
-rw-r-----. 1 oracle oinstall 492838912 Jun 16 22:35 sysaux01.dbf
-rw-r-----. 1 oracle oinstall 209715712 Jun 16 22:36 redo02.log
-rw-r-----. 1 oracle oinstall 209715712 Jun 16 22:36 redo01.log
-rw-r-----. 1 oracle oinstall  18726912 Jun 16 22:36 control02.ctl
-rw-r-----. 1 oracle oinstall  18726912 Jun 16 22:36 control01.ctl
[oracle@ora1221 orcl12c2]$ rm undotbs01.dbf
[oracle@ora1221 orcl12c2]$ ls -l un*
ls: cannot access un*: No such file or directory

启动数据库
由于有undo文件丢失数据库在启动的时候检测到文件丢失(ORA-01157)，无法open,offline文件后依旧无法启动(ORA-00376)

[oracle@ora1221 orcl12c2]$ ss
SQL*Plus: Release 12.2.0.0.3 Production on Thu Jun 16 22:51:21 2016
Copyright (c) 1982, 2016, Oracle.  All rights reserved.
Connected to an idle instance.
SQL> startup pfile='/tmp/pfile'
ORACLE instance started.
Total System Global Area 2516582400 bytes
Fixed Size                  8260048 bytes
Variable Size             671090224 bytes
Database Buffers         1828716544 bytes
Redo Buffers                8515584 bytes
Database mounted.
ORA-01157: cannot identify/lock data file 4 - see DBWR trace file
ORA-01110: data file 4: '/u01/app/oracle/oradata/orcl12c2/undotbs01.dbf'
offline 数据文件
SQL> alter database datafile 4 offline ;
alter database datafile 4 offline
*
ERROR at line 1:
ORA-01145: offline immediate disallowed unless media recovery enabled
SQL>  alter database datafile 4 offline drop;
Database altered.
SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-00603: ORACLE server session terminated by fatal error
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00604: error occurred at recursive SQL level 1
ORA-00376: file 4 cannot be read at this time
ORA-01110: data file 4: '/u01/app/oracle/oradata/orcl12c2/undotbs01.dbf'
Process ID: 7547
Session ID: 16 Serial number: 56234

把undo_management修改为manual后启动库，依旧报ORA-00376

SQL> startup pfile='/tmp/pfile' mount;
ORACLE instance started.
Total System Global Area 2516582400 bytes
Fixed Size                  8260048 bytes
Variable Size             671090224 bytes
Database Buffers         1828716544 bytes
Redo Buffers                8515584 bytes
Database mounted.
SQL> show parameter undo_management;
NAME                                 TYPE
------------------------------------ ----------------------
VALUE
------------------------------
undo_management                      string
MANUAL
SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-00603: ORACLE server session terminated by fatal error
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00604: error occurred at recursive SQL level 1
ORA-00376: file 4 cannot be read at this time
ORA-01110: data file 4: '/u01/app/oracle/oradata/orcl12c2/undotbs01.dbf'
Process ID: 7981
Session ID: 16 Serial number: 56572

设置_corrupted_rollback_segments参数

SQL> startup pfile='/tmp/pfile' mount;
ORACLE instance started.
Total System Global Area 2516582400 bytes
Fixed Size                  8260048 bytes
Variable Size             671090224 bytes
Database Buffers         1828716544 bytes
Redo Buffers                8515584 bytes
Database mounted.
SQL> show parameter _corrupted_rollback_segments;
NAME                                 TYPE
------------------------------------ ----------------------
VALUE
------------------------------
_corrupted_rollback_segments         string
_SYSSMU1_3200770482$, _SYSSMU2
_3597554035$, _SYSSMU3_2898427
493$, _SYSSMU4_670955920$, _SY
SSMU5_1233449977$, _SYSSMU6_32
67641983$, _SYSSMU7_2822479342
$, _SYSSMU8_1645196706$, _SYSS
MU9_3032014485$, _SYSSMU10_474
465626$
SQL> alter database open;
Database altered.

通过设置_corrupted_rollback_segments参数之后，数据库正常启动，下面继续其他pdb

open pdb1

SQL> alter session set container=pdb1;
Session altered.
SQL> alter database open;
Database altered.
SQL> select count(*) from chf.t_xifenfei_p1;
  COUNT(*)
----------
     72426

pdb2 open

SQL> alter session set container=pdb2;
Session altered.
SQL> alter database open;
Database altered.
SQL> select count(*) from chf.t_xifenfei_p2;
  COUNT(*)
----------
     72426

至此数据库基本上恢复完成，但是看到的pdb里面两个测试表的数据和我们预测的有一定的偏差,看来cdb中的undo和pdb中的undo还是有一定的依赖关系.同时也说明了root的undo异常对于其他pdb的open最少在恢复上面影响不大.下一篇测试业务pdb中undo异常处理

Oracle 12c redo 丢失恢复

联系：手机/微信(+86 17813235971) QQ(107644445)

标题：Oracle 12c redo 丢失恢复

模拟redo丢失
对数据库的一个pdb模拟事务操作，然后abort库，并且删除所有redo，模拟生产环境redo丢失的case

[oracle@ora1221 oradata]$ ss
SQL*Plus: Release 12.2.0.0.3 Production on Wed Jun 15 10:13:20 2016
Copyright (c) 1982, 2016, Oracle.  All rights reserved.
Connected to an idle instance.
SQL> startup
ORACLE instance started.
Total System Global Area 2516582400 bytes
Fixed Size                  8260048 bytes
Variable Size             671090224 bytes
Database Buffers         1828716544 bytes
Redo Buffers                8515584 bytes
Database mounted.
Database opened.
SQL>
SQL>
SQL> set pages 100
SQL> show pdbs;
    CON_ID CON_NAME                       OPEN MODE  RESTRICTED
---------- ------------------------------ ---------- ----------
         2 PDB$SEED                       READ ONLY  NO
         3 PDB1                           MOUNTED
         4 PDB2                           MOUNTED
SQL> select con_id,file#,checkpoint_change# from v$datafile_header order by 1;
    CON_ID      FILE# CHECKPOINT_CHANGE#
---------- ---------- ------------------
         1          1            1500157
         1          3            1500157
         1          4            1500157
         1          7            1500157
         2          5            1371280
         2          6            1371280
         2          8            1371280
         3          9            1499902
         3         12            1499902
         3         11            1499902
         3         10            1499902
         4         15            1499903
         4         14            1499903
         4         13            1499903
         4         16            1499903
15 rows selected.
SQL> alter PLUGGABLE database pdb1 open;
Pluggable database altered.
SQL> show pdbs;
    CON_ID CON_NAME                       OPEN MODE  RESTRICTED
---------- ------------------------------ ---------- ----------
         2 PDB$SEED                       READ ONLY  NO
         3 PDB1                           READ WRITE NO
         4 PDB2                           MOUNTED
SQL> alter session set container=pdb1;
Session altered.
SQL> create user chf identified by oracle;
User created.
SQL> grant dba to chf;
Grant succeeded.
SQL> create table chf.t_pdb1_xifenfei as select * from dba_objects;
Table created.
SQL> delete from chf.t_pdb1_xifenfei;
72426 rows deleted.
--另外一个节点
[oracle@ora1221 ~]$ ss
SQL*Plus: Release 12.2.0.0.3 Production on Wed Jun 15 10:19:21 2016
Copyright (c) 1982, 2016, Oracle.  All rights reserved.
Connected to:
Oracle Database 12c Enterprise Edition Release 12.2.0.0.3 - 64bit Production
SQL> shutdown abort
ORACLE instance shut down.
[oracle@ora1221 orcl12c2]$ ls redo*
redo01.log  redo02.log  redo03.log
[oracle@ora1221 orcl12c2]$ rm redo0*
[oracle@ora1221 orcl12c2]$ ls -l redo*
ls: cannot access redo*: No such file or directory

尝试启动数据库

[oracle@ora1221 orcl12c2]$ ss
SQL*Plus: Release 12.2.0.0.3 Production on Wed Jun 15 10:26:20 2016
Copyright (c) 1982, 2016, Oracle.  All rights reserved.
Connected to an idle instance.
SQL> startup mount;
ORACLE instance started.
Total System Global Area 2516582400 bytes
Fixed Size                  8260048 bytes
Variable Size             671090224 bytes
Database Buffers         1828716544 bytes
Redo Buffers                8515584 bytes
Database mounted.
SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-00313: open failed for members of log group 1 of thread 1
ORA-00312: online log 1 thread 1: '/u01/app/oracle/oradata/orcl12c2/redo01.log'
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory
Additional information: 7

使用隐含参数启动

----pfile里面增加
_allow_error_simulation=TRUE
_allow_resetlogs_corruption=true
~
SQL> shutdown immediate;
ORA-01109: database not open
Database dismounted.
ORACLE instance shut down
SQL> startup pfile='/tmp/pfile' mount
ORACLE instance started.
Total System Global Area 2516582400 bytes
Fixed Size                  8260048 bytes
Variable Size             671090224 bytes
Database Buffers         1828716544 bytes
Redo Buffers                8515584 bytes
Database mounted.
SQL> alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-00603: ORACLE server session terminated by fatal error
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [kcbzib_kcrsds_1], [], [], [], [],
[], [], [], [], [], [], []
Process ID: 36797
Session ID: 16 Serial number: 24277

继续重启库
ORA-600 kcbzib_kcrsds_1错误尝试重启数据库,如果不行,考虑使用bbed修改文件头信息

SQL> startup mount pfile='/tmp/pfile'
ORACLE instance started.
Total System Global Area 2516582400 bytes
Fixed Size                  8260048 bytes
Variable Size             671090224 bytes
Database Buffers         1828716544 bytes
Redo Buffers                8515584 bytes
Database mounted.
SQL> recover database until cancel;
ORA-00283: recovery session canceled due to errors
ORA-16433: The database or pluggable database must be opened in read/write
mode.
SQL> alter database backup controlfile to trace as '/tmp/ctl';
alter database backup controlfile to trace as '/tmp/ctl'
*
ERROR at line 1:
ORA-16433: The database or pluggable database must be opened in read/write
mode.

重建控制文件

SQL> startup nomount pfile='/tmp/pfile'
ORACLE instance started.
Total System Global Area 2516582400 bytes
Fixed Size                  8260048 bytes
Variable Size             671090224 bytes
Database Buffers         1828716544 bytes
Redo Buffers                8515584 bytes
SQL> CREATE CONTROLFILE REUSE DATABASE "orcl12c2" RESETLOGS  NOARCHIVELOG
  2      MAXLOGFILES 50
  3      MAXLOGMEMBERS 5
  4      MAXDATAFILES 100
  5      MAXINSTANCES 1
  6      MAXLOGHISTORY 226
  7  LOGFILE
  8    GROUP 1 '/u01/app/oracle/oradata/orcl12c2/redo01.log'  SIZE 200M,
  9    GROUP 2 '/u01/app/oracle/oradata/orcl12c2/redo02.log'  SIZE 200M,
 10    GROUP 3 '/u01/app/oracle/oradata/orcl12c2/redo03.log'  SIZE 200M
 11  DATAFILE
 12  '/u01/app/oracle/oradata/orcl12c2/system01.dbf',
 13  '/u01/app/oracle/oradata/orcl12c2/sysaux01.dbf',
 14  '/u01/app/oracle/oradata/orcl12c2/undotbs01.dbf',
 15  '/u01/app/oracle/oradata/orcl12c2/pdbseed/system01.dbf',
 16  '/u01/app/oracle/oradata/orcl12c2/pdbseed/sysaux01.dbf',
 17  '/u01/app/oracle/oradata/orcl12c2/users01.dbf',
 18  '/u01/app/oracle/oradata/orcl12c2/pdbseed/undotbs01.dbf',
 19  '/u01/app/oracle/oradata/orcl12c2/pdb1/system01.dbf',
 20  '/u01/app/oracle/oradata/orcl12c2/pdb1/sysaux01.dbf',
 21  '/u01/app/oracle/oradata/orcl12c2/pdb1/undotbs01.dbf',
 22  '/u01/app/oracle/oradata/orcl12c2/pdb1/users01.dbf',
 23  '/u01/app/oracle/oradata/orcl12c2/pdb2/system01.dbf',
 24  '/u01/app/oracle/oradata/orcl12c2/pdb2/sysaux01.dbf',
 25  '/u01/app/oracle/oradata/orcl12c2/pdb2/undotbs01.dbf',
 26  '/u01/app/oracle/oradata/orcl12c2/pdb2/users01.dbf'
 27  CHARACTER SET AL32UTF8
 28  ;
Control file created.
SQL> recover database until cancel;
ORA-00283: recovery session canceled due to errors
ORA-01610: recovery using the BACKUP CONTROLFILE option must be done
SQL> recover database until cancel using backup controlfile;
ORA-00279: change 1500161 generated at 06/15/2016 10:40:42 needed for thread 1
ORA-00289: suggestion :
/u01/app/oracle/product/12.2.0/db_2/dbs/arch1_1_914582438.dbf
ORA-00280: change 1500161 for thread 1 is in sequence #1
Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
/u01/app/oracle/oradata/orcl12c2/redo01.log
Log applied.
Media recovery complete.
SQL> alter database open resetlogs;
Database altered.

<strong>open过程alert日志</strong>

<strong>open pdb1</strong>

SQL> alter PLUGGABLE database pdb1  open;
Pluggable database altered.

pdb1 open alert日志

2016-06-15T11:13:39.423057+08:00
alter PLUGGABLE database pdb1  open
PDB1(3):Autotune of undo retention is turned on.
2016-06-15T11:13:39.495559+08:00
PDB1(3):Endian type of dictionary set to little
PDB1(3):[40547] Successfully onlined Undo Tablespace 2.
PDB1(3):Undo initialization finished serial:0 start:371149831 end:371149872 diff:41 ms (0.0 seconds)
PDB1(3):Database Characterset for PDB1 is AL32UTF8
PDB1(3):*********************************************************************
PDB1(3):WARNING: The following temporary tablespaces in container(PDB1)
PDB1(3):         contain no files.
PDB1(3):         This condition can occur when a backup controlfile has
PDB1(3):         been restored.  It may be necessary to add files to these
PDB1(3):         tablespaces.  That can be done using the SQL statement:
PDB1(3):
PDB1(3):         ALTER TABLESPACE <tablespace_name> ADD TEMPFILE
PDB1(3):
PDB1(3):         Alternatively, if these temporary tablespaces are no longer
PDB1(3):         needed, then they can be dropped.
PDB1(3):           Empty temporary tablespace: TEMP
PDB1(3):*********************************************************************
PDB1(3):Opatch validation is skipped for PDB PDB1 (con_id=0)
PDB1(3):Opening pdb with no Resource Manager plan active
Pluggable database PDB1 opened read write
Completed: alter PLUGGABLE database pdb1  open

open pdb2

SQL> alter PLUGGABLE database pdb2 open;
alter PLUGGABLE database pdb2 open
*
ERROR at line 1:
ORA-00600: internal error code, arguments: [17090], [], [], [], [], [], [], [],
[], [], [], []

分析alert日志和trace文件

--alert日志部分
PDB1(3):alter PLUGGABLE database pdb2 open
PDB1(3):ORA-65118 signalled during: alter PLUGGABLE database pdb2 open...
2016-06-15T11:28:57.439963+08:00
PDB1(3):Unified Audit: Audit record write to table failed due to ORA-25153.
Writing the audit record to OS spillover file. Please grep in the trace files for ORA-25153 for more diagnostic information.
Errors in file /u01/app/oracle/diag/rdbms/orcl12c2/orcl12c2/trace/orcl12c2_ora_40547.trc  (incident=29073) (PDBNAME=PDB1):
ORA-00600: internal error code, arguments: [17090], [], [], [], [], [], [], [], [], [], [], []
PDB1(3):Incident details in: /u01/app/oracle/diag/rdbms/orcl12c2/orcl12c2/incident/incdir_29073/orcl12c2_ora_40547_i29073.trc
PDB1(3):*****************************************************************
PDB1(3):An internal routine has requested a dump of selected redo.
PDB1(3):This usually happens following a specific internal error, when
PDB1(3):analysis of the redo logs will help Oracle Support with the
PDB1(3):diagnosis.
PDB1(3):It is recommended that you retain all the redo logs generated (by
PDB1(3):all the instances) during the past 12 hours, in case additional
PDB1(3):redo dumps are required to help with the diagnosis.
PDB1(3):*****************************************************************
2016-06-15T11:28:59.123041+08:00
PDB1(3):Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
2016-06-15T11:28:59.945667+08:00
Dumping diagnostic data in directory=[cdmp_20160615112859], requested by (instance=1, osid=40547), summary=[incident=29073].
2016-06-15T11:35:59.987419+08:00
PDB1(3): alter PLUGGABLE database pdb2 open
PDB1(3):ORA-65118 signalled during:  alter PLUGGABLE database pdb2 open...
--trace部分
PARSING IN CURSOR #0x7f051a3d7650 len=118 dep=1 uid=0 oct=3 lid=0 tim=372490287736 hv=1128335472 ad='0x6ca82f00' sqlid='gu930gd1n223h'
select tablespace_name, tablespace_size, allocated_space, free_space, con_id  from cdb_temp_free_space order by con_id
END OF STMT
EXEC #0x7f051a3d7650:c=0,e=144,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=4,plh=2538033465,tim=372490287732
FETCH #0x7f051a3d7650:c=0,e=290,p=0,cr=14,cu=0,mis=0,r=0,dep=1,og=4,plh=2538033465,tim=372490288109
WAIT #0x7f051bd5f870: nam='Disk file operations I/O' ela= 52 FileOperation=2 fileno=0 filetype=36 obj#=402 tim=372490288373
WAIT #0x7f051bd5f870: nam='Disk file operations I/O' ela= 17 FileOperation=8 fileno=1 filetype=36 obj#=402 tim=372490288577
WAIT #0x7f051bd5f870: nam='Disk file operations I/O' ela= 3 FileOperation=8 fileno=1 filetype=36 obj#=402 tim=372490288655
WAIT #0x7f051bd5f870: nam='Disk file operations I/O' ela= 690 FileOperation=8 fileno=1 filetype=36 obj#=402 tim=372490289365
WAIT #0x7f051bd5f870: nam='Disk file operations I/O' ela= 6 FileOperation=8 fileno=1 filetype=36 obj#=402 tim=372490289470
WAIT #0x7f051bd5f870: nam='Disk file operations I/O' ela= 445 FileOperation=8 fileno=1 filetype=36 obj#=402 tim=372490289934
WAIT #0x7f051bd5f870: nam='Disk file operations I/O' ela= 3 FileOperation=8 fileno=1 filetype=36 obj#=402 tim=372490289983
WAIT #0x7f051bd5f870: nam='Disk file operations I/O' ela= 375 FileOperation=8 fileno=1 filetype=36 obj#=402 tim=372490290374
WAIT #0x7f051bd5f870: nam='Disk file operations I/O' ela= 6 FileOperation=8 fileno=1 filetype=36 obj#=402 tim=372490290453
WAIT #0x7f051bd5f870: nam='Disk file operations I/O' ela= 367 FileOperation=8 fileno=1 filetype=36 obj#=402 tim=372490290839
WAIT #0x7f051bd5f870: nam='Disk file operations I/O' ela= 3 FileOperation=8 fileno=1 filetype=36 obj#=402 tim=372490290882
WAIT #0x7f051bd5f870: nam='Disk file operations I/O' ela= 355 FileOperation=8 fileno=1 filetype=36 obj#=402 tim=372490291252
WAIT #0x7f051bd5f870: nam='Disk file operations I/O' ela= 4 FileOperation=8 fileno=1 filetype=36 obj#=402 tim=372490291298
WAIT #0x7f051bd5f870: nam='Disk file operations I/O' ela= 276 FileOperation=8 fileno=1 filetype=36 obj#=402 tim=372490291590
WAIT #0x7f051bd5f870: nam='Disk file operations I/O' ela= 1 FileOperation=8 fileno=1 filetype=36 obj#=402 tim=372490291614
WAIT #0x7f051bd5f870: nam='Disk file operations I/O' ela= 256 FileOperation=8 fileno=1 filetype=36 obj#=402 tim=372490291879
WAIT #0x7f051bd5f870: nam='Disk file operations I/O' ela= 2 FileOperation=8 fileno=1 filetype=36 obj#=402 tim=372490291903
WAIT #0x7f051bd5f870: nam='Disk file operations I/O' ela= 261 FileOperation=8 fileno=1 filetype=36 obj#=402 tim=372490292172
WAIT #0x7f051bd5f870: nam='Disk file operations I/O' ela= 30 FileOperation=3 fileno=0 filetype=36 obj#=402 tim=372490292225
WAIT #0x7f051bd5f870: nam='Disk file operations I/O' ela= 934 FileOperation=8 fileno=1 filetype=36 obj#=402 tim=372490293171
WAIT #0x7f051bd5f870: nam='Disk file operations I/O' ela= 3 FileOperation=8 fileno=1 filetype=36 obj#=402 tim=372490293208
WAIT #0x7f051bd5f870: nam='Disk file operations I/O' ela= 245 FileOperation=8 fileno=1 filetype=36 obj#=402 tim=372490293465
WAIT #0x7f051bd5f870: nam='Disk file operations I/O' ela= 262 FileOperation=8 fileno=1 filetype=36 obj#=402 tim=372490293755
WAIT #0x7f051bd5f870: nam='Disk file operations I/O' ela= 1 FileOperation=8 fileno=1 filetype=36 obj#=402 tim=372490293780
WAIT #0x7f051bd5f870: nam='Disk file operations I/O' ela= 250 FileOperation=8 fileno=1 filetype=36 obj#=402 tim=372490294039
WAIT #0x7f051bd5f870: nam='Disk file operations I/O' ela= 256 FileOperation=8 fileno=1 filetype=36 obj#=402 tim=372490294323
WAIT #0x7f051bd5f870: nam='Disk file operations I/O' ela= 8 FileOperation=5 fileno=0 filetype=36 obj#=402 tim=372490294359
2016-06-15T11:36:00.055196+08:00
Incident 29074 created, dump file: /u01/app/oracle/diag/rdbms/orcl12c2/orcl12c2/incident/incdir_29074/orcl12c2_ora_40547_i29074.trc
ORA-00600: internal error code, arguments: [17090], [], [], [], [], [], [], [], [], [], [], []

从中可以判断出来是由于CDB$ROOT的未增加tempfile导致

SQL> alter tablespace temp add tempfile '/u01/app/oracle/oradata/orcl12c2/temp01.dbf' reuse;
Tablespace altered.
SQL>  alter PLUGGABLE database pdb2 open;
Pluggable database altered.

查看数据库恢复情况

SQL> show pdbs;
    CON_ID CON_NAME                       OPEN MODE  RESTRICTED
---------- ------------------------------ ---------- ----------
         2 PDB$SEED                       READ ONLY  NO
         3 PDB1                           READ WRITE NO
         4 PDB2                           READ WRITE NO
SQL> select con_id,file#,checkpoint_change#,resetlogs_change# from v$datafile_header;
    CON_ID      FILE# CHECKPOINT_CHANGE# RESETLOGS_CHANGE#
---------- ---------- ------------------ -----------------
         1          1            2500167           1500164
         1          3            2500167           1500164
         1          4            2500167           1500164
         2          5            1371280           1341067
         2          6            1371280           1341067
         1          7            2500167           1500164
         2          8            1371280           1341067
         3          9            2501017           1500164
         3         10            2501017           1500164
         3         11            2501017           1500164
         3         12            2501017           1500164
         4         13            2502748           1500164
         4         14            2502748           1500164
         4         15            2502748           1500164
         4         16            2502748           1500164
15 rows selected.

至此基本上测试完成在在cdb环境中丢失redo的恢复。在生产中，需要把temp加全，并且建议重建数据库

oradebug poke ORA-32521/ORA-32519故障解决

联系：手机/微信(+86 17813235971) QQ(107644445)

标题：oradebug poke ORA-32521/ORA-32519故障解决

最近有不少朋友咨询12.1.0.2及其以后的版本使用oradebug去修改scn失败,这里做了一个测试正常情况下确实无法修改(oradebug poke报 ORA-32519 或者 ORA-32521) ,这里进行了一系列修改测试最后修改成功.
数据库版本

SQL> select * from v$version;
BANNER                                                                               CON_ID
-------------------------------------------------------------------------------- ----------
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production              0
PL/SQL Release 12.1.0.2.0 - Production                                                    0
CORE    12.1.0.2.0      Production                                                        0
TNS for 64-bit Windows: Version 12.1.0.2.0 - Production                                   0
NLSRTL Version 12.1.0.2.0 - Production                                                    0

oradebug poke测试

SQL> oradebug setmypid
已处理的语句
SQL> oradebug DUMPvar SGA kcsgscn_
kcslf kcsgscn_ [14C8D6270, 14C8D62A0) = 009EA333 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
 00000000 4C8D5CF0 00000001
SQL> oradebug poke 0x14C8D6274 4 0x00000001
ORA-32521: 对 ORADEBUG 命令  进行语法分析时出错
--或者该提示
SQL> oradebug setmypid
已处理的语句
SQL> oradebug DUMPvar SGA kcsgscn_
kcslf kcsgscn_ [14C8D6270, 14C8D62A0) = 009EAE3D 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
 00000000 4C8D5CF0 00000001
SQL> oradebug poke 0x14C8D6274 4 0x0000000a
ORA-32519: 权限不足, 无法执行 ORADEBUG 命令: execution of ORADEBUG commands is disabled for this instance

通过测试确定oradebug正常情况无法执行poke,不是提示ORA-32521就是提示ORA-32519错误导致scn无法修改.

通过一些修改之后oradebug 修改scn

SQL> select dbid, name,open_mode,
  2         created created,
  3         open_mode, log_mode,
  4         checkpoint_change# as checkpoint_change#,
  5         controlfile_type ctl_type,
  6         controlfile_created ctl_created,
  7         controlfile_change# as ctl_change#,
  8         controlfile_time ctl_time,
  9         resetlogs_change# as resetlogs_change#,
 10         resetlogs_time resetlogs_time
 11  from v$database;
      DBID NAME                                                 OPEN_MODE            CREATED        OPEN_MODE
 LOG_MODE
---------- ---------------------------------------------------- -------------------- -------------- ------------
 ------------
CHECKPOINT_CHANGE# CTL_TYP CTL_CREATED    CTL_CHANGE# CTL_TIME       RESETLOGS_CHANGE# RESETLOGS_TIME
------------------ ------- -------------- ----------- -------------- ----------------- --------------
1504692401 XIFENFEI                                             READ WRITE           16-8月 -15     READ WRITE
 ARCHIVELOG
          10407853 CURRENT 16-8月 -15        10408361 07-7月 -16                     1 16-8月 -15
SQL> select con_id,file#,checkpoint_change# from v$datafile_header;
    CON_ID      FILE# CHECKPOINT_CHANGE#
---------- ---------- ------------------
         1          1           10407853
         2          2            9457324
         1          3           10407853
         2          4            9457324
         1          5           10407853
         1          6           10407853
         3          7           10407853
         3          8           10407853
         3          9           10407853
         4         10            9559964
         4         11            9559964
         4         12            9559964
         3         13           10407853
已选择 13 行。
SQL> shutdown abort;
ORACLE 例程已经关闭。
SQL> startup
ORACLE 例程已经启动。
Total System Global Area 3221225472 bytes
Fixed Size                  3837232 bytes
Variable Size             838861520 bytes
Database Buffers         2365587456 bytes
Redo Buffers               12939264 bytes
数据库装载完毕。
SQL> oradebug setmypid
已处理的语句
SQL> oradebug DUMPvar SGA kcsgscn_
kcslf kcsgscn_ [14BE16270, 14BE162A0) = 009ED5C6 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
 00000000 4BE15CF0 00000001
SQL> oradebug poke 0x14BE16274 4 0x0001
BEFORE: [14BE16274, 14BE16276) = 0000
AFTER: [14BE16274, 14BE16276) = 0001
SQL> alter database open;
数据库已更改。
SQL> select dbid, name,open_mode,
  2         created created,
  3         open_mode, log_mode,
  4         checkpoint_change# as checkpoint_change#,
  5         controlfile_type ctl_type,
  6         controlfile_created ctl_created,
  7         controlfile_change# as ctl_change#,
  8         controlfile_time ctl_time,
  9         resetlogs_change# as resetlogs_change#,
 10         resetlogs_time resetlogs_time
 11  from v$database;
      DBID NAME                                                 OPEN_MODE            CREATED        OPEN_MODE
 LOG_MODE
---------- ---------------------------------------------------- -------------------- -------------- -----------
 ------------
CHECKPOINT_CHANGE# CTL_TYP CTL_CREATED    CTL_CHANGE# CTL_TIME       RESETLOGS_CHANGE# RESETLOGS_TIME
------------------ ------- -------------- ----------- -------------- ----------------- --------------
1504692401 XIFENFEI                                             READ WRITE           16-8月 -15     READ WRITE
 ARCHIVELOG
        4305478053 CURRENT 16-8月 -15      4305478245 07-7月 -16                     1 16-8月 -15
SQL> select con_id,file#,checkpoint_change# from v$datafile_header;
    CON_ID      FILE# CHECKPOINT_CHANGE#
---------- ---------- ------------------
         1          1         4305478053
         2          2            9457324
         1          3         4305478053
         2          4            9457324
         1          5         4305478053
         1          6         4305478053
         3          7         4305478053
         3          8         4305478053
         3          9         4305478053
         4         10            9559964
         4         11            9559964
         4         12            9559964
         3         13         4305478053
已选择 13 行。
SQL> select con_id,file#,checkpoint_change# from v$datafile;
    CON_ID      FILE# CHECKPOINT_CHANGE#
---------- ---------- ------------------
         1          1         4305478053
         2          2            9457324
         1          3         4305478053
         2          4            9457324
         1          5         4305478053
         1          6         4305478053
         3          7         4305478053
         3          8         4305478053
         3          9         4305478053
         4         10            9559964
         4         11            9559964
         4         12            9559964
         3         13         4305478053
已选择 13 行。

通过上述测试证明scn已经被完美修改.证明我们已经具备了不使用bbed的情况下推进12.1.0.2版本的scn问题,为12c的一系列需要推scn的恢复提供完美技术支持.

pvid=yes导致asm无法mount

联系：手机/微信(+86 17813235971) QQ(107644445)

标题：pvid=yes导致asm无法mount

今天凌晨接到客户恢复请求，对于aix rac，两个ibm存储做mirror的环境中，客户做存储容灾演练，发现磁盘的名称发生改变，然后对其中一个磁盘设置pvid，结果悲剧了导致asm一个磁盘组无法正常起来。然后又aix端删除这些设备，然后重新扫描设备。结果不是一个磁盘组不能mount，而是整个gi就无法正常启动。希望我们给予技术支持。
查看asm 日志，确定asm disk信息
asm-disk1
asm-disk2

从这里可以确定，一共有两个asm diskgroup，每个group有两个磁盘，hdisk2和hdisk3 为hisdata，hdisk4,和hdisk5为emrdata.

使用kfed分析磁盘头

dd if=/dev/rhdisk2 of=/tmp/xifenfei/rhdisk2.dd bs=1024k count=10
dd if=/dev/rhdisk3 of=/tmp/xifenfei/rhdisk3.dd bs=1024k count=10
dd if=/dev/rhdisk4 of=/tmp/xifenfei/rhdisk4.dd bs=1024k count=10
dd if=/dev/rhdisk5 of=/tmp/xifenfei/rhdisk5.dd bs=1024k count=10
--传输到我电脑上分析
C:\Users\FAL>kfed read H:\temp\xifenfei\tmp\xifenfei\rhdisk2.dd|grep name
kfdhdb.dskname:            HISDATA_0000 ; 0x028: length=12
kfdhdb.grpname:                 HISDATA ; 0x048: length=7
kfdhdb.fgname:             HISDATA_0000 ; 0x068: length=12
kfdhdb.capname:                         ; 0x088: length=0
C:\Users\FAL>kfed read H:\temp\xifenfei\tmp\xifenfei\rhdisk3.dd|grep name
kfdhdb.dskname:            HISDATA_0001 ; 0x028: length=12
kfdhdb.grpname:                 HISDATA ; 0x048: length=7
kfdhdb.fgname:             HISDATA_0001 ; 0x068: length=12
kfdhdb.capname:                         ; 0x088: length=0
C:\Users\FAL>kfed read H:\temp\xifenfei\tmp\xifenfei\rhdisk4.dd|grep name
kfdhdb.dskname:            EMRDATA_0000 ; 0x028: length=12
kfdhdb.grpname:                 EMRDATA ; 0x048: length=7
kfdhdb.fgname:             EMRDATA_0000 ; 0x068: length=12
kfdhdb.capname:                         ; 0x088: length=0
C:\Users\FAL>kfed read H:\temp\xifenfei\tmp\xifenfei\rhdisk5.dd|grep name
C:\Users\FAL>kfed read H:\temp\xifenfei\tmp\xifenfei\rhdisk5.dd
kfbh.endian:                        201 ; 0x000: 0xc9
kfbh.hard:                          194 ; 0x001: 0xc2
kfbh.type:                          212 ; 0x002: *** Unknown Enum ***
kfbh.datfmt:                        193 ; 0x003: 0xc1
kfbh.block.blk:                       0 ; 0x004: blk=0
kfbh.block.obj:                       0 ; 0x008: file=0
kfbh.check:                           0 ; 0x00c: 0x00000000
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
000000000 C1D4C2C9 00000000 00000000 00000000  [................]
000000010 00000000 00000000 00000000 00000000  [................]
  Repeat 254 times
KFED-00322: Invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][212]
C:\Users\FAL>kfed read H:\temp\xifenfei\tmp\xifenfei\rhdisk5.dd blkn=2|grep kfbh
kfbh.endian:                          0 ; 0x000: 0x00
kfbh.hard:                          130 ; 0x001: 0x82
kfbh.type:                            3 ; 0x002: KFBTYP_ALLOCTBL
kfbh.datfmt:                          2 ; 0x003: 0x02
kfbh.block.blk:                33554432 ; 0x004: blk=33554432
kfbh.block.obj:                16777344 ; 0x008: file=128
kfbh.check:                  2654889601 ; 0x00c: 0x9e3e6681
kfbh.fcn.base:               1696071680 ; 0x010: 0x65180000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
C:\Users\FAL>kfed read H:\temp\xifenfei\tmp\xifenfei\rhdisk5.dd blkn=510|grep name
kfdhdb.dskname:            EMRDATA_0001 ; 0x028: length=12
kfdhdb.grpname:                 EMRDATA ; 0x048: length=7
kfdhdb.fgname:             EMRDATA_0001 ; 0x068: length=12
kfdhdb.capname:                         ; 0x088: length=0

通过上述分析，基本上确定由于对hdisk5设置了pvid导致该asm disk的磁盘头损坏.这个可以直接使用asm repair功能修复(注意要clear pvid)

C:\Users\FAL>kfed read H:\temp\xifenfei\tmp\xifenfei\rhdisk5.dd |grep name
kfdhdb.dskname:            EMRDATA_0001 ; 0x028: length=12
kfdhdb.grpname:                 EMRDATA ; 0x048: length=7
kfdhdb.fgname:             EMRDATA_0001 ; 0x068: length=12
kfdhdb.capname:                         ; 0x088: length=0

启动crs到cssd进程报错分析
1. 由于删除磁盘，扫描设备导致hdisk[2-5] 权限和用户组不对
2. 由于删除，扫描磁盘导致磁盘共享模式不对
修复磁盘头和解决这两个问题之后,gi启动正常,磁盘组也正常mount，数据库也正常启动,数据0丢失,至此完美恢复
oracle-open

类似客户恢复案例:asm disk误设置pvid导致asm diskgroup无法mount恢复
如果您遇到此类情况,无法解决请联系我们，提供专业ORACLE数据库恢复技术支持
Phone:17813235971 Q Q:107644445 E-Mail:dba@xifenfei.com

ORA-01555 ORA-600 kdiulk:kcbz_objdchk ORA-600 kdBlkCheckError等错误恢复

联系：手机/微信(+86 17813235971) QQ(107644445)

标题：ORA-01555 ORA-600 kdiulk:kcbz_objdchk ORA-600 kdBlkCheckError等错误恢复

数据库启动ORA-00704,0RA-00604,ORA-01555导致数据库无法启动

Tue May 31 17:32:42 2016
SMON: enabling cache recovery
SUCCESS: diskgroup RECOVERY was mounted
ARC3: Archival started
ARC0: STARTING ARCH PROCESSES COMPLETE
ORA-01555 caused by SQL statement below (SQL ID: 4krwuz0ctqxdt, SCN: 0x0004.3af84bee):
select ctime, mtime, stime from obj$ where obj# = :1
Archived Log entry 5 added for thread 1 sequence 10 ID 0x86a261e7 dest 1:
Errors in file /opt/app/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_ora_12779.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 1
ORA-01555: snapshot too old: rollback segment number 7 with name "_SYSSMU7_1592079335$" too small
Errors in file /opt/app/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_ora_12779.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 1
ORA-01555: snapshot too old: rollback segment number 7 with name "_SYSSMU7_1592079335$" too small
Error 704 happened during db open, shutting down database
USER (ospid: 12779): terminating the instance due to error 704

通过bbed修改事务之后启动数据库

Tue May 31 17:35:49 2016
SMON: enabling tx recovery
*********************************************************************
WARNING: The following temporary tablespaces contain no files.
         This condition can occur when a backup controlfile has
         been restored.  It may be necessary to add files to these
         tablespaces.  That can be done using the SQL statement:
         ALTER TABLESPACE <tablespace_name> ADD TEMPFILE
         Alternatively, if these temporary tablespaces are no longer
         needed, then they can be dropped.
           Empty temporary tablespace: TEMP
*********************************************************************
Updating character set in controlfile to AL32UTF8
Tue May 31 17:35:50 2016
Errors in file /opt/app/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_p021_13862.trc  (incident=166002):
ORA-00600: 内部错误代码, 参数: [kdiulk:kcbz_objdchk], [0], [0], [1], [], [], [], [], [], [], [], []
Tue May 31 17:35:50 2016
Errors in file /opt/app/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_p010_13818.trc  (incident=165914):
ORA-00600: 内部错误代码, 参数: [kdiulk:kcbz_objdchk], [0], [0], [1], [], [], [], [], [], [], [], []
Tue May 31 17:35:50 2016
Errors in file /opt/app/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_p004_13794.trc  (incident=165866):
ORA-00600: 内部错误代码, 参数: [kdiulk:kcbz_objdchk], [0], [0], [1], [], [], [], [], [], [], [], []
Tue May 31 17:35:50 2016
Errors in file /opt/app/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_p011_13822.trc  (incident=165922):
ORA-00600: 内部错误代码, 参数: [kdiulk:kcbz_objdchk], [0], [0], [1], [], [], [], [], [], [], [], []
Tue May 31 17:35:50 2016
Errors in file /opt/app/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_p016_13842.trc  (incident=165962):
ORA-00600: 内部错误代码, 参数: [kdiulk:kcbz_objdchk], [0], [0], [1], [], [], [], [], [], [], [], []

ORA-600 [kdiulk:kcbz_objdchk] trace文件

*** SESSION ID:(3.5) 2016-05-31 17:35:50.068
OBJD MISMATCH typ=6, seg.obj=-2, diskobj=222225, dsflg=0, dsobj=285890, tid=285890, cls=1
ORA-00600: 内部错误代码, 参数: [kdiulk:kcbz_objdchk], [0], [0], [1], [], [], [], [], [], [], [], []
Parallel Transaction recovery server caught exception 600
begin Parallel Recovery Context Dump
nsi: 48, nsactive: 48
, nirsi: 1, nidti: 1, ndt: 1, rescan: 0, ptrs: 48
[ktprsi] wdone: 50
[ktpritp 378651b8] ktprsi:
37903b60 37903b78 37903b90 37903ba8 37903bc0 37903bd8 37903bf0 37903c08 37903c20 37903c38 37903c50
37903c68 37903c80 37903c98 37903cb0 37903cc8 37903ce0 37903cf8 37903d10 37903d28 37903d40 37903d58
37903d70 37903d88 37903da0 37903db8 37903dd0 37903de8 37903e00 37903e18 37903e30 37903e48 37903e60
37903e78 37903e90 37903ea8 37903ec0 37903ed8 37903ef0 37903f08 37903f20 37903f38 37903f50 37903f68
37903f80 37903f98 37903fb0 37903fc8
[ktprht] nhb: 47, nfl: 20247, flg: 2
*** 2016-05-31 17:36:08.584
[ktprhb] nfl: 1, nelem: 97, flg: 0, sqn: 1
flist: 37698940
nhe: [ktprhe 32] sqn: -1297235803
[kturur] uoff: -1797708320, sqn: 4
uba: 0x098004cd.07e4.0b
*-----------------------------
* Rec #0xb  slt: 0x07  objn: 123986(0x0001e452)  objd: 285891  tblspc: 10(0x0000000a)
*       Layer:  10 (Index)   opc: 22   rci 0x0a
Undo type:  Regular undo   Last buffer split:  No
Temp Object:  No
Tablespace Undo:  No
rdba: 0x00000000

这里基本上可以确定是由于undo index中的dataobj#和block中的dataobj# 不匹配.在数据库undo回滚之时出现该错误.可以通过跳过undo回滚,然后重建对象

Tue May 31 17:36:06 2016
Simulated error on redo application.
Block recovery from logseq 12, block 959 to scn 20401094719
Recovery of Online Redo Log: Thread 1 Group 3 Seq 12 Reading mem 0
  Mem# 0: +DATA/xifenfei/onlinelog/group_3.263.802446627
Block recovery completed at rba 12.1012.16, scn 4.3221225536
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Simulated error for redo application done.
Errors in file /opt/app/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_p009_13814.trc  (incident=165906):
ORA-00600: 内部错误代码, 参数: [kdBlkCheckError], [26], [950417], [18025], [], [], [], [], [], [], [], []

这些错误是由于数据库block逻辑异常导致,错过参数含义
在10g中ORA-600 kddummy_blkchk 在11g中ORA-600 kdBlkCheckError

ARGUMENTS:
Arg [a] Absolute file number
Arg [b] Bock number
Arg  Internal error code returned from kcbchk() which indicates the problem encountered.
See Note 46389.1 for details of block check codes.

根据QREF kddummy_blkchk / kdBlkCheckError Check Codes Listing (Full) (Doc ID 1264040.1)分析
这里的18025是代码的KCBTEMAP_EC_START + KTS4_EC_SBFREE部分异常，主要表现在Incorrect firstfree or nfree 可以通过设置一些参数进行屏蔽

在恢复过程中还有其他错误

ORA-600 encountered when generating server alert SMG-4128
ORA-00600: internal error code, arguments: [ktcpoptx:!cmt top lvl], [], [], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [4406], [0x1026B65348], [0x000000000], [2], [6215], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [ktcpoptx:!cmt top lvl], [], [], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [4194], [], [], [], [], [], [], [], [], [], [], []
ORACLE Instance xifenfei (pid = 15) - Error 600 encountered while recovering transaction (10, 7) on object 123986.
ORA-00600: internal error code, arguments: [], [], [], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [kdsgrp1], [], [], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [kturbleurec1], [], [], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [kewrose_1], [600],
  [ORA-00600: internal error code, arguments: [4194], [], [], [], [], [], [], [], [], [], [], []
Non-fatal internal error happenned while SMON was doing logging scn->time mapping.

通过整体分析错误主要是由于undo异常导致,通过设置_corrupted_rollback_segments设置db_block_checking等相关参数,清理SMON_SCN_TIME等操作数据库没有其他异常报错,让其通过逻辑方式重建库

强制关机导致数据库无法正常启动恢复

联系：手机/微信(+86 17813235971) QQ(107644445)

标题：强制关机导致数据库无法正常启动恢复

有客户qq找到我,说有朋友推荐,让我帮他们恢复数据库.由于强制关机后,数据库无法正常启动.
数据库recover database失败

Mon Mar 28 10:20:33 2016
ALTER DATABASE RECOVER  database
Media Recovery Start
 started logmerger process
Parallel Media Recovery started with 32 slaves
Mon Mar 28 10:20:36 2016
Recovery of Online Redo Log: Thread 1 Group 2 Seq 18686 Reading mem 0
  Mem# 0: E:\ORACLE_DATA\YCCY\REDO02.LOG
Recovery of Online Redo Log: Thread 1 Group 3 Seq 18687 Reading mem 0
  Mem# 0: E:\ORACLE_DATA\YCCY\REDO03.LOG
Recovery of Online Redo Log: Thread 1 Group 1 Seq 18688 Reading mem 0
  Mem# 0: E:\ORACLE_DATA\YCCY\REDO01.LOG
Mon Mar 28 10:20:38 2016
Hex dump of (file 45, block 7431) in trace file d:\oracle\diag\rdbms\yccy\yccy\trace\yccy_pr0q_2968.trc
Corrupt block relative dba: 0x0b401d07 (file 45, block 7431)
Mon Mar 28 10:20:38 2016
Hex dump of (file 45, block 7836) in trace file d:\oracle\diag\rdbms\yccy\yccy\trace\yccy_pr01_2220.trc
Bad header found during media recovery
Corrupt block relative dba: 0x0b401e9c (file 45, block 7836)
Data in bad block:
Bad header found during media recovery
 type: 0 format: 0 rdba: 0x1d070000
 last change scn: 0x4917.f8dc0b40 seq: 0x0 flg: 0x00
 spare1: 0x6 spare2: 0xa2 spare3: 0xc7f7
 consistency value in tail: 0x06010000
 check value in block header: 0x601
 block checksum disabled
Reading datafile 'E:\ORACLE_DATA\YCCY\DT_SYS_IDX12.DBF' for corruption at rdba: 0x0b401d07 (file 45, block 7431)
Reread (file 45, block 7431) found valid data
Repaired corruption at (file 45, block 7431)
Hex dump of (file 45, block 7556) in trace file d:\oracle\diag\rdbms\yccy\yccy\trace\yccy_pr0q_2968.trc
Corrupt block relative dba: 0x0b401d84 (file 45, block 7556)
Bad header found during media recovery
Data in bad block:
 type: 106 format: 3 rdba: 0x1d840000
 last change scn: 0x461d.391a0b40 seq: 0x0 flg: 0x00
 spare1: 0x6 spare2: 0xa2 spare3: 0x2499
 consistency value in tail: 0x06013999
 check value in block header: 0x401
 block checksum disabled
Reading datafile 'E:\ORACLE_DATA\YCCY\DT_SYS_IDX12.DBF' for corruption at rdba: 0x0b401d84 (file 45, block 7556)
Reread (file 45, block 7556) found valid data
Repaired corruption at (file 45, block 7556)
Mon Mar 28 10:20:38 2016
Exception [type: ACCESS_VIOLATION, UNABLE_TO_READ] [ADDR:0xFFFFFFFFFFFFFFFF] [PC:0x1334748, kcbzfw()+3094]
Mon Mar 28 10:20:39 2016
Errors in file d:\oracle\diag\rdbms\yccy\yccy\trace\yccy_pr0k_3900.trc  (incident=131189):
ORA-00600: internal error code, arguments: [kcbr_validate_read_1], [], [], [], [], [], [], [], [], [], [], []
Incident details in: d:\oracle\diag\rdbms\yccy\yccy\incident\incdir_131189\yccy_pr0k_3900_i131189.trc
ERROR: Unable to normalize symbol name for the following short stack (at offset 199):
Errors in file d:\oracle\diag\rdbms\yccy\yccy\trace\yccy_pr0r_3060.trc  (incident=131245):
ORA-07445: exception encountered: core dump [kcbzfw()+3094] [ACCESS_VIOLATION] [ADDR:0xFFFFFFFFFFFFFFFF] [PC:0x1334748] [UNABLE_TO_READ] []
ORA-10567: Redo is inconsistent with data block (file# 5, block# 169345, file offset is 1387274240 bytes)
ORA-10564: tablespace DT_SYS_DAT
ORA-01110: data file 5: 'E:\ORACLE_DATA\YCCY\DT_SYS_DAT.ORA'
ORA-10560: block type 'FIRST LEVEL BITMAP BLOCK'
Incident details in: d:\oracle\diag\rdbms\yccy\yccy\incident\incdir_131245\yccy_pr0r_3060_i131245.trc
Exception [type: ACCESS_VIOLATION, UNABLE_TO_READ] [ADDR:0xFFFFFFFFFFFFFFFF] [PC:0x12EC62C, kcbzdh()+942]
Mon Mar 28 10:20:39 2016
Errors in file d:\oracle\diag\rdbms\yccy\yccy\trace\yccy_pr0d_2112.trc  (incident=131133):
ORA-00600: internal error code, arguments: [kcbrapply_12], [], [], [], [], [], [], [], [], [], [], []
Incident details in: d:\oracle\diag\rdbms\yccy\yccy\incident\incdir_131133\yccy_pr0d_2112_i131133.trc
Mon Mar 28 10:20:39 2016
Errors in file d:\oracle\diag\rdbms\yccy\yccy\trace\yccy_pr0e_3260.trc  (incident=131141):
ORA-00600: internal error code, arguments: [3020], [5], [163457], [21134977], [], [], [], [], [], [], [], []
ORA-10567: Redo is inconsistent with data block (file# 5, block# 163457, file offset is 1339039744 bytes)
ORA-10564: tablespace DT_SYS_DAT
ORA-01110: data file 5: 'E:\ORACLE_DATA\YCCY\DT_SYS_DAT.ORA'
ORA-10560: block type 'FIRST LEVEL BITMAP BLOCK'
Incident details in: d:\oracle\diag\rdbms\yccy\yccy\incident\incdir_131141\yccy_pr0e_3260_i131141.trc
Mon Mar 28 10:20:39 2016
Errors in file d:\oracle\diag\rdbms\yccy\yccy\trace\yccy_pr04_3980.trc  (incident=131021):
ORA-00600: internal error code, arguments: [kcbrapply_12], [], [], [], [], [], [], [], [], [], [], []
Incident details in: d:\oracle\diag\rdbms\yccy\yccy\incident\incdir_131021\yccy_pr04_3980_i131021.trc
Data in bad block:
 type: 0 format: 0 rdba: 0x1e9c0000
 last change scn: 0x4915.f8320b40 seq: 0x0 flg: 0x00
 spare1: 0x6 spare2: 0xa2 spare3: 0x8029
 consistency value in tail: 0x0602e40c
 check value in block header: 0x602
 block checksum disabled
Reading datafile 'E:\ORACLE_DATA\YCCY\DT_SYS_IDX12.DBF' for corruption at rdba: 0x0b401e9c (file 45, block 7836)
Reread (file 45, block 7836) found valid data
Repaired corruption at (file 45, block 7836)
Mon Mar 28 10:20:39 2016
Errors in file d:\oracle\diag\rdbms\yccy\yccy\trace\yccy_pr0f_816.trc  (incident=131149):
ORA-00600: internal error code, arguments: [kcbr_validate_read_1], [], [], [], [], [], [], [], [], [], [], []
Incident details in: d:\oracle\diag\rdbms\yccy\yccy\incident\incdir_131149\yccy_pr0f_816_i131149.trc
Exception [type: ACCESS_VIOLATION, UNABLE_TO_READ] [ADDR:0xFFFFFFFFFFFFFFFF] [PC:0x12EC62C, kcbzdh()+942]
Mon Mar 28 10:20:39 2016
Errors in file d:\oracle\diag\rdbms\yccy\yccy\trace\yccy_pr0i_2132.trc  (incident=131173):
ORA-00600: internal error code, arguments: [3020], [5], [154240], [21125760], [], [], [], [], [], [], [], []
ORA-10567: Redo is inconsistent with data block (file# 5, block# 154240, file offset is 1263534080 bytes)
ORA-10564: tablespace DT_SYS_DAT
ORA-01110: data file 5: 'E:\ORACLE_DATA\YCCY\DT_SYS_DAT.ORA'
ORA-10560: block type 'FIRST LEVEL BITMAP BLOCK'
Incident details in: d:\oracle\diag\rdbms\yccy\yccy\incident\incdir_131173\yccy_pr0i_2132_i131173.trc
Errors in file d:\oracle\diag\rdbms\yccy\yccy\trace\yccy_pr0k_3900.trc  (incident=131190):
ORA-07445: exception encountered: core dump [kcbzdh()+942] [ACCESS_VIOLATION] [ADDR:0xFFFFFFFFFFFFFFFF] [PC:0x12EC62C] [UNABLE_TO_READ] []
ORA-00600: internal error code, arguments: [kcbr_validate_read_1], [], [], [], [], [], [], [], [], [], [], []
Incident details in: d:\oracle\diag\rdbms\yccy\yccy\incident\incdir_131190\yccy_pr0k_3900_i131190.trc
Errors in file d:\oracle\diag\rdbms\yccy\yccy\trace\yccy_pr01_2220.trc  (incident=131037):
ORA-00600: internal error code, arguments: [kcbrapply_14], [], [], [], [], [], [], [], [], [], [], []
Incident details in: d:\oracle\diag\rdbms\yccy\yccy\incident\incdir_131037\yccy_pr01_2220_i131037.trc
Exception [type: ACCESS_VIOLATION, UNABLE_TO_READ] [ADDR:0xFFFFFFFFFFFFFFFF] [PC:0x12EC62C, kcbzdh()+942]
Errors in file d:\oracle\diag\rdbms\yccy\yccy\trace\yccy_pr0f_816.trc  (incident=131150):
ORA-07445: exception encountered: core dump [kcbzdh()+942] [ACCESS_VIOLATION] [ADDR:0xFFFFFFFFFFFFFFFF] [PC:0x12EC62C] [UNABLE_TO_READ] []
ORA-00600: internal error code, arguments: [kcbr_validate_read_1], [], [], [], [], [], [], [], [], [], [], []
Incident details in: d:\oracle\diag\rdbms\yccy\yccy\incident\incdir_131150\yccy_pr0f_816_i131150.trc
Errors in file d:\oracle\diag\rdbms\yccy\yccy\trace\yccy_pr01_2220.trc  (incident=131038):
ORA-07445: exception encountered: core dump [kcbzdh()+942] [ACCESS_VIOLATION] [ADDR:0xFFFFFFFFFFFFFFFF] [PC:0x12EC62C] [UNABLE_TO_READ] []
ORA-00600: internal error code, arguments: [kcbrapply_14], [], [], [], [], [], [], [], [], [], [], []
Incident details in: d:\oracle\diag\rdbms\yccy\yccy\incident\incdir_131038\yccy_pr01_2220_i131038.trc
Mon Mar 28 10:20:39 2016
Errors in file d:\oracle\diag\rdbms\yccy\yccy\trace\yccy_pr0h_4036.trc  (incident=131165):
ORA-00600: internal error code, arguments: [kcbr_validate_read_1], [], [], [], [], [], [], [], [], [], [], []
Incident details in: d:\oracle\diag\rdbms\yccy\yccy\incident\incdir_131165\yccy_pr0h_4036_i131165.trc
Exception [type: ACCESS_VIOLATION, UNABLE_TO_READ] [ADDR:0xFFFFFFFFFFFFFFFF] [PC:0x12EC62C, kcbzdh()+942]
Exception [type: ACCESS_VIOLATION, UNABLE_TO_READ] [ADDR:0xFFFFFFFFFFFFFFFF] [PC:0x12EC13B, kcbzpnd()+299]
Exception [type: ACCESS_VIOLATION, UNABLE_TO_READ] [ADDR:0xFFFFFFFFFFFFFFFF] [PC:0x1351BB9, kcbs_dump_adv_state()+1529]
Exception [type: ACCESS_VIOLATION, UNABLE_TO_READ] [ADDR:0xFFFFFFFFFFFFFFFF] [PC:0x12EC13B, kcbzpnd()+299]
Errors in file d:\oracle\diag\rdbms\yccy\yccy\trace\yccy_pr0h_4036.trc  (incident=131166):
ORA-07445: exception encountered: core dump [kcbzdh()+942] [ACCESS_VIOLATION] [ADDR:0xFFFFFFFFFFFFFFFF] [PC:0x12EC62C] [UNABLE_TO_READ] []
ORA-00600: internal error code, arguments: [kcbr_validate_read_1], [], [], [], [], [], [], [], [], [], [], []
Incident details in: d:\oracle\diag\rdbms\yccy\yccy\incident\incdir_131166\yccy_pr0h_4036_i131166.trc
Exception [type: ACCESS_VIOLATION, UNABLE_TO_READ] [ADDR:0xFFFFFFFFFFFFFFFF] [PC:0x12EC13B, kcbzpnd()+299]
Mon Mar 28 10:20:40 2016
Checker run found 60 new persistent data failures
Errors in file d:\oracle\diag\rdbms\yccy\yccy\trace\yccy_pr0d_2112.trc  (incident=131134):
ORA-07445: exception encountered: core dump [kcbzpnd()+299] [ACCESS_VIOLATION] [ADDR:0xFFFFFFFFFFFFFFFF] [PC:0x12EC13B] [UNABLE_TO_READ] []
ORA-00600: internal error code, arguments: [kcbrapply_12], [], [], [], [], [], [], [], [], [], [], []
Incident details in: d:\oracle\diag\rdbms\yccy\yccy\incident\incdir_131134\yccy_pr0d_2112_i131134.trc
Errors in file d:\oracle\diag\rdbms\yccy\yccy\trace\yccy_pr04_3980.trc  (incident=131022):
ORA-07445: exception encountered: core dump [kcbs_dump_adv_state()+1529] [ACCESS_VIOLATION] [ADDR:0xFFFFFFFFFFFFFFFF] [PC:0x1351BB9] [UNABLE_TO_READ] []
ORA-00600: internal error code, arguments: [kcbrapply_12], [], [], [], [], [], [], [], [], [], [], []
Incident details in: d:\oracle\diag\rdbms\yccy\yccy\incident\incdir_131022\yccy_pr04_3980_i131022.trc
Errors in file d:\oracle\diag\rdbms\yccy\yccy\trace\yccy_pr0e_3260.trc  (incident=131142):
ORA-07445: exception encountered: core dump [kcbzpnd()+299] [ACCESS_VIOLATION] [ADDR:0xFFFFFFFFFFFFFFFF] [PC:0x12EC13B] [UNABLE_TO_READ] []
ORA-00600: internal error code, arguments: [3020], [5], [163457], [21134977], [], [], [], [], [], [], [], []
ORA-10567: Redo is inconsistent with data block (file# 5, block# 163457, file offset is 1339039744 bytes)
ORA-10564: tablespace DT_SYS_DAT
ORA-01110: data file 5: 'E:\ORACLE_DATA\YCCY\DT_SYS_DAT.ORA'
ORA-10560: block type 'FIRST LEVEL BITMAP BLOCK'
Incident details in: d:\oracle\diag\rdbms\yccy\yccy\incident\incdir_131142\yccy_pr0e_3260_i131142.trc
Mon Mar 28 10:20:41 2016
Trace dumping is performing id=[cdmp_20160328102041]
Errors in file d:\oracle\diag\rdbms\yccy\yccy\trace\yccy_pr0i_2132.trc  (incident=131174):
ORA-07445: exception encountered: core dump [kcbzpnd()+299] [ACCESS_VIOLATION] [ADDR:0xFFFFFFFFFFFFFFFF] [PC:0x12EC13B] [UNABLE_TO_READ] []
ORA-00600: internal error code, arguments: [3020], [5], [154240], [21125760], [], [], [], [], [], [], [], []
ORA-10567: Redo is inconsistent with data block (file# 5, block# 154240, file offset is 1263534080 bytes)
ORA-10564: tablespace DT_SYS_DAT
ORA-01110: data file 5: 'E:\ORACLE_DATA\YCCY\DT_SYS_DAT.ORA'
ORA-10560: block type 'FIRST LEVEL BITMAP BLOCK'
Incident details in: d:\oracle\diag\rdbms\yccy\yccy\incident\incdir_131174\yccy_pr0i_2132_i131174.trc
Mon Mar 28 10:20:41 2016
Exception [type: ACCESS_VIOLATION, UNABLE_TO_READ] [ADDR:0x2E7FFFFFE] [PC:0x74CAE3F0, 0000000074CAE3F0]
Errors in file d:\oracle\diag\rdbms\yccy\yccy\trace\yccy_pr06_2684.trc  (incident=131077):
ORA-07445: exception encountered: core dump [PC:0x74CAE3F0] [ACCESS_VIOLATION] [ADDR:0x2E7FFFFFE] [PC:0x74CAE3F0] [UNABLE_TO_READ] []
Incident details in: d:\oracle\diag\rdbms\yccy\yccy\incident\incdir_131077\yccy_pr06_2684_i131077.trc
Mon Mar 28 10:20:42 2016
Exception [type: ACCESS_VIOLATION, UNABLE_TO_WRITE] [ADDR:0x0] [PC:0x4D20D2, kslgetl()+54]
Mon Mar 28 10:20:42 2016
Errors in file d:\oracle\diag\rdbms\yccy\yccy\trace\yccy_pmon_3856.trc  (incident=130853):
ORA-07445: exception encountered: core dump [kslgetl()+54] [ACCESS_VIOLATION] [ADDR:0x0] [PC:0x4D20D2] [UNABLE_TO_WRITE] []
Incident details in: d:\oracle\diag\rdbms\yccy\yccy\incident\incdir_130853\yccy_pmon_3856_i130853.trc
Trace dumping is performing id=[cdmp_20160328102042]
Errors in file d:\oracle\diag\rdbms\yccy\yccy\incident\incdir_131077\yccy_pr06_2684_i131077.trc:
ORA-00607: Internal error occurred while making a change to a data block
ORA-00602: internal programming exception
ORA-07445: exception encountered: core dump [PC:0x74CAE3F0] [ACCESS_VIOLATION] [ADDR:0x2E7FFFFFE] [PC:0x74CAE3F0] [UNABLE_TO_READ] []
Process debug not enabled via parameter _debug_enable
Trace dumping is performing id=[cdmp_20160328102043]
Mon Mar 28 10:21:01 2016
RECO (ospid: 3524): terminating the instance due to error 472
Instance terminated by RECO, pid = 3524

通过观察这段日志,基本上可以发现主要是FILE 45,虽然提示坏块但是最终验证确定为正常块（类似：Reread (file 45, block 7836) found valid data),这里主要是file 5，报了大量的ORA-600[3020].

对数据文件逐个进行recover操作

SQL> startup mount;
ORACLE 例程已经启动。
Total System Global Area 1.7103E+10 bytes
Fixed Size                  2192864 bytes
Variable Size            9059699232 bytes
Database Buffers         8019509248 bytes
Redo Buffers               21762048 bytes
数据库装载完毕。
SQL> recover datafile 1;
完成介质恢复。
SQL> recover  datafile 2;
ORA-03113: 通信通道的文件结尾
进程 ID: 1652
会话 ID: 551 序列号: 55
SQL> recover datafile 3;
完成介质恢复。
SQL> recover datafile 4;
完成介质恢复。
SQL> recover datafile 5;
ORA-03113: 通信通道的文件结尾
进程 ID: 4900
会话 ID: 551 序列号: 56131
SQL> recover datafile 6;
完成介质恢复。
…………
SQL> recover datafile 63;
完成介质恢复。
SQL> recover datafile 64;
完成介质恢复。

除掉datafile 2，5之外，其他文件全部recover成功.

对于file 2 尝试处理
无法通过recover成功,只能暂时放弃,后续考虑先offline open库,然后把这个文件强制online

SQL> recover  datafile 2 ;
ORA-03113: 通信通道的文件结尾
进程 ID: 5020
会话 ID: 551 序列号: 3
Mon Mar 28 10:47:12 2016
ALTER DATABASE RECOVER  datafile 2
Media Recovery Start
Serial Media Recovery started
Recovery of Online Redo Log: Thread 1 Group 1 Seq 18688 Reading mem 0
  Mem# 0: E:\ORACLE_DATA\YCCY\REDO01.LOG
Exception [type: ACCESS_VIOLATION, UNABLE_TO_READ] [ADDR:0x2E7FFFFFE] [PC:0x74CAE3F0, 0000000074CAE3F0]
Errors in file d:\oracle\diag\rdbms\yccy\yccy\trace\yccy_ora_3508.trc  (incident=143022):
ORA-07445: 出现异常错误: 核心转储 [PC:0x74CAE3F0] [ACCESS_VIOLATION] [ADDR:0x2E7FFFFFE] [PC:0x74CAE3F0] [UNABLE_TO_READ] []
Incident details in: d:\oracle\diag\rdbms\yccy\yccy\incident\incdir_143022\yccy_ora_3508_i143022.trc
Errors in file d:\oracle\diag\rdbms\yccy\yccy\incident\incdir_143022\yccy_ora_3508_i143022.trc:
ORA-00607: 当更改数据块时出现内部错误
ORA-00602: 内部编程异常错误
ORA-07445: 出现异常错误: 核心转储 [PC:0x74CAE3F0] [ACCESS_VIOLATION] [ADDR:0x2E7FFFFFE] [PC:0x74CAE3F0] [UNABLE_TO_READ] []

对于file 5处理

SQL> recover datafile 5;
ORA-00283: 恢复会话因错误而取消
ORA-00600: 内部错误代码, 参数: [3020], [5], [163457], [21134977], [], [], [],
[], [], [], [], []
ORA-10567: Redo is inconsistent with data block (file# 5, block# 163457, file
offset is 1339039744 bytes)
ORA-10564: tablespace DT_SYS_DAT
ORA-01110: 数据文件 5: 'E:\ORACLE_DATA\YCCY\DT_SYS_DAT.ORA'
ORA-10560: block type 'FIRST LEVEL BITMAP BLOCK'
SQL> recover  datafile 5 allow 1 corruption;
ORA-00283: 恢复会话因错误而取消
ORA-00600: 内部错误代码, 参数: [3020], [5], [162433], [21133953], [], [], [],
[], [], [], [], []
ORA-10567: Redo is inconsistent with data block (file# 5, block# 162433, file
offset is 1330651136 bytes)
ORA-10564: tablespace DT_SYS_DAT
ORA-01110: 数据文件 5: 'E:\ORACLE_DATA\YCCY\DT_SYS_DAT.ORA'
ORA-10560: block type 'FIRST LEVEL BITMAP BLOCK'
SQL> recover  datafile 5 allow 1 corruption;
ORA-00283: 恢复会话因错误而取消
ORA-00600: 内部错误代码, 参数: [3020], [5], [166272], [21137792], [], [], [],
[], [], [], [], []
ORA-10567: Redo is inconsistent with data block (file# 5, block# 166272, file
offset is 1362100224 bytes)
ORA-10564: tablespace DT_SYS_DAT
ORA-01110: 数据文件 5: 'E:\ORACLE_DATA\YCCY\DT_SYS_DAT.ORA'
ORA-10560: block type 'FIRST LEVEL BITMAP BLOCK'
SQL> recover  datafile 5 allow 1 corruption;
ORA-00283: 恢复会话因错误而取消
ORA-00600: 内部错误代码, 参数: [3020], [5], [169346], [21140866], [], [], [],
[], [], [], [], []
ORA-10567: Redo is inconsistent with data block (file# 5, block# 169346, file
offset is 1387282432 bytes)
ORA-10564: tablespace DT_SYS_DAT
ORA-01110: 数据文件 5: 'E:\ORACLE_DATA\YCCY\DT_SYS_DAT.ORA'
ORA-10560: block type 'FIRST LEVEL BITMAP BLOCK'
SQL> recover  datafile 5 allow 1 corruption;
完成介质恢复。

open数据库并online datafile 2

SQL> startup pfile='d:/pfile.txt' mount;
ORACLE 例程已经启动。
Total System Global Area 1.7103E+10 bytes
Fixed Size                  2192864 bytes
Variable Size            9059699232 bytes
Database Buffers         8019509248 bytes
Redo Buffers               21762048 bytes
数据库装载完毕。
SQL> alter database datafile 2 offline;
数据库已更改。
SQL> alter database open;
数据库已更改。
SQL> shutdown immediate;
ORA-03113: 通信通道的文件结尾
SQL> conn / as sysdba
已连接到空闲例程。
SQL> startup pfile='d:/pfile.txt' mount;
ORACLE 例程已经启动。
Total System Global Area 1.7103E+10 bytes
Fixed Size                  2192864 bytes
Variable Size            9059699232 bytes
Database Buffers         8019509248 bytes
Redo Buffers               21762048 bytes
数据库装载完毕。
SQL> select group#,status from v$log;
    GROUP# STATUS
---------- ----------------
         1 INACTIVE
         3 INACTIVE
         2 CURRENT
SQL> recover database until cancel;
ORA-00279: 更改 1226478477 (在 03/28/2016 20:23:37 生成) 对于线程 1 是必需的
ORA-00289: 建议:
D:\ORACLE\FLASH_RECOVERY_AREA\YCCY\ARCHIVELOG\2016_03_28\O1_MF_1_18689_%U_.ARC
ORA-00280: 更改 1226478477 (用于线程 1) 在序列 #18689 中
指定日志: {<RET>=suggested | filename | AUTO | CANCEL}
E:\ORACLE_DATA\YCCY\REDO02.LOG
已应用的日志。
完成介质恢复。
SQL> alter database datafile 2 online;
数据库已更改。
SQL> alter database open resetlogs;
数据库已更改。

数据库基本上属于正常打开,处理掉3020部分的坏块基本ok

ORA-01172 ORA-01151 故障恢复

联系：手机/微信(+86 17813235971) QQ(107644445)

标题：ORA-01172 ORA-01151 故障恢复

有客户存储异常断电,导致数据库启动报ORA-01172错,导致数据库无法open
数据库启动报ORA-01172错误

Wed Mar 23 14:16:23 2016
ALTER DATABASE OPEN
Wed Mar 23 14:16:24 2016
Beginning crash recovery of 1 threads
 parallel recovery started with 15 processes
Wed Mar 23 14:16:24 2016
Started redo scan
Wed Mar 23 14:16:25 2016
Completed redo scan
 62588 redo blocks read, 15 data blocks need recovery
Wed Mar 23 14:16:25 2016
Started redo application at
 Thread 1: logseq 15050, block 2, scn 2439828667
Wed Mar 23 14:16:25 2016
Recovery of Online Redo Log: Thread 1 Group 1 Seq 15050 Reading mem 0
  Mem# 0 errs 0: /oracle/oradata/orcl/redo01.log
Wed Mar 23 14:16:25 2016
Completed redo application
Wed Mar 23 14:16:25 2016
RECOVERY OF THREAD 1 STUCK AT BLOCK 26185 OF FILE 3
Wed Mar 23 14:16:25 2016
RECOVERY OF THREAD 1 STUCK AT BLOCK 69385 OF FILE 3
Wed Mar 23 14:16:25 2016
RECOVERY OF THREAD 1 STUCK AT BLOCK 566 OF FILE 2
Wed Mar 23 14:16:25 2016
RECOVERY OF THREAD 1 STUCK AT BLOCK 89 OF FILE 2
Wed Mar 23 14:16:25 2016
RECOVERY OF THREAD 1 STUCK AT BLOCK 53769 OF FILE 3
Wed Mar 23 14:16:26 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p012_6540.trc:
ORA-01172: recovery of thread 1 stuck at block 566 of file 2
ORA-01151: use media recovery to recover block, restore backup if needed
Wed Mar 23 14:16:26 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p008_6532.trc:
ORA-01172: recovery of thread 1 stuck at block 53769 of file 3
ORA-01151: use media recovery to recover block, restore backup if needed
Wed Mar 23 14:16:26 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p011_6538.trc:
ORA-01172: recovery of thread 1 stuck at block 69385 of file 3
ORA-01151: use media recovery to recover block, restore backup if needed
Wed Mar 23 14:16:26 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p005_6526.trc:
ORA-01172: recovery of thread 1 stuck at block 26185 of file 3
ORA-01151: use media recovery to recover block, restore backup if needed
Wed Mar 23 14:16:27 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p014_6544.trc:
ORA-01172: recovery of thread 1 stuck at block 89 of file 2
ORA-01151: use media recovery to recover block, restore backup if needed
Wed Mar 23 14:16:27 2016
Aborting crash recovery due to slave death, attempting serial crash recovery
Wed Mar 23 14:16:27 2016
Beginning crash recovery of 1 threads
Wed Mar 23 14:16:27 2016
Started redo scan
Wed Mar 23 14:16:27 2016
Completed redo scan
 62588 redo blocks read, 15 data blocks need recovery
Wed Mar 23 14:16:27 2016
Started redo application at
 Thread 1: logseq 15050, block 2, scn 2439828667
Wed Mar 23 14:16:27 2016
Recovery of Online Redo Log: Thread 1 Group 1 Seq 15050 Reading mem 0
  Mem# 0 errs 0: /oracle/oradata/orcl/redo01.log
RECOVERY OF THREAD 1 STUCK AT BLOCK 566 OF FILE 2
Wed Mar 23 14:16:27 2016
Aborting crash recovery due to error 1172
Wed Mar 23 14:16:27 2016
Errors in file /oracle/admin/orcl/udump/orcl_ora_6514.trc:
ORA-01172: recovery of thread 1 stuck at block 566 of file 2
ORA-01151: use media recovery to recover block, restore backup if needed
ORA-1172 signalled during: ALTER DATABASE OPEN...

ALTER DATABASE RECOVER datafile 1 报错
尝试recover datafile 1之后报ORA-600 kcbrapply_4,ORA-600 kcfrbd_3,ORA-600 kcbrapply_12等错误,从报错信息看,出现这些错误的原因,是由于断电导致坏块引起.

Thu Mar 24 21:50:18 2016
ALTER DATABASE RECOVER  datafile 1
Thu Mar 24 21:50:18 2016
Media Recovery Start
 parallel recovery started with 15 processes
Thu Mar 24 21:50:18 2016
Recovery of Online Redo Log: Thread 1 Group 1 Seq 15050 Reading mem 0
  Mem# 0 errs 0: /oracle/oradata/orcl/redo01.log
Thu Mar 24 21:50:19 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p004_13391.trc:
ORA-00600: internal error code, arguments: [kcbrapply_4], [2], [], [], [], [], [], []
Thu Mar 24 21:50:19 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p010_13403.trc:
ORA-00600: internal error code, arguments: [kcbrapply_4], [0], [], [], [], [], [], []
Thu Mar 24 21:50:19 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p000_13383.trc:
ORA-00600: internal error code, arguments: [kcbrapply_4], [0], [], [], [], [], [], []
Thu Mar 24 21:50:19 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p009_13401.trc:
ORA-00600: internal error code, arguments: [kcbrapply_4], [3], [], [], [], [], [], []
Thu Mar 24 21:50:19 2016
Hex dump of (file 1, block 61562) in trace file /oracle/admin/orcl/bdump/orcl_p001_13385.trc
Corrupt block relative dba: 0x0040f07a (file 1, block 61562)
Bad header found during media recovery
Data in bad block:
 type: 0 format: 0 rdba: 0xf07a0000
 last change scn: 0x916c.dc4b0040 seq: 0x0 flg: 0x00
 spare1: 0x6 spare2: 0xa2 spare3: 0xb088
 consistency value in tail: 0x06010fc1
 check value in block header: 0x601
 block checksum disabled
Thu Mar 24 21:50:19 2016
Hex dump of (file 1, block 55706) in trace file /oracle/admin/orcl/bdump/orcl_p014_13411.trc
Corrupt block relative dba: 0x0040d99a (file 1, block 55706)
Bad header found during media recovery
Data in bad block:
 type: 0 format: 0 rdba: 0xd99a0000
 last change scn: 0x916c.e1ad0040 seq: 0x0 flg: 0x00
 spare1: 0x6 spare2: 0xa2 spare3: 0xa520
 consistency value in tail: 0x06012222
 check value in block header: 0x601
 block checksum disabled
Thu Mar 24 21:50:19 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p006_13395.trc:
ORA-00600: internal error code, arguments: [kcfrbd_3], [1], [3342335], [1], [0], [64000], [], []
Thu Mar 24 21:50:19 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p003_13389.trc:
ORA-00600: internal error code, arguments: [kcfrbd_3], [1], [3932159], [1], [0], [64000], [], []
Thu Mar 24 21:50:19 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p002_13387.trc:
ORA-00600: internal error code, arguments: [kcfrbd_3], [1], [2293759], [1], [0], [64000], [], []
Reread of rdba: 0x0040d99a (file 1, block 55706) found valid data
Thu Mar 24 21:50:19 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p014_13411.trc:
ORA-00600: internal error code, arguments: [kcbrapply_12], [], [], [], [], [], [], []
Thu Mar 24 21:50:19 2016
Reread of rdba: 0x0040f07a (file 1, block 61562) found valid data
Thu Mar 24 21:50:19 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p001_13385.trc:
ORA-00600: internal error code, arguments: [kcbrapply_12], [], [], [], [], [], [], []
Thu Mar 24 21:50:23 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p014_13411.trc:
ORA-07445: exception encountered: core dump [kcbs_dump_adv_state()+450] [SIGSEGV] [Address not mapped to object] [0xB9782BF4] [] []
ORA-00600: internal error code, arguments: [kcbrapply_12], [], [], [], [], [], [], []
Thu Mar 24 21:50:23 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p006_13395.trc:
ORA-07445: exception encountered: core dump [kcbs_dump_adv_state()+450] [SIGSEGV] [Address not mapped to object] [0xB9C82BF4] [] []
ORA-00600: internal error code, arguments: [kcfrbd_3], [1], [3342335], [1], [0], [64000], [], []
Thu Mar 24 21:50:23 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p009_13401.trc:
ORA-07445: exception encountered: core dump [kcbs_dump_adv_state()+450] [SIGSEGV] [Address not mapped to object] [0xB9A02BF4] [] []
ORA-00600: internal error code, arguments: [kcbrapply_4], [3], [], [], [], [], [], []
Thu Mar 24 21:50:23 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p003_13389.trc:
ORA-07445: exception encountered: core dump [kcbs_dump_adv_state()+450] [SIGSEGV] [Address not mapped to object] [0xB9F02AF4] [] []
ORA-00600: internal error code, arguments: [kcfrbd_3], [1], [3932159], [1], [0], [64000], [], []
Thu Mar 24 21:50:23 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p004_13391.trc:
ORA-07445: exception encountered: core dump [kcbs_dump_adv_state()+450] [SIGSEGV] [Address not mapped to object] [0xBA182AF4] [] []
ORA-00600: internal error code, arguments: [kcbrapply_4], [2], [], [], [], [], [], []
Thu Mar 24 21:50:23 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p010_13403.trc:
ORA-07445: exception encountered: core dump [kcbs_dump_adv_state()+450] [SIGSEGV] [Address not mapped to object] [0xBA402AF4] [] []
ORA-00600: internal error code, arguments: [kcbrapply_4], [0], [], [], [], [], [], []
Thu Mar 24 21:50:23 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p000_13383.trc:
ORA-07445: exception encountered: core dump [kcbs_dump_adv_state()+450] [SIGSEGV] [Address not mapped to object] [0xB9282AF4] [] []
ORA-00600: internal error code, arguments: [kcbrapply_4], [0], [], [], [], [], [], []
Thu Mar 24 21:50:23 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p001_13385.trc:
ORA-07445: exception encountered: core dump [kcbs_dump_adv_state()+450] [SIGSEGV] [Address not mapped to object] [0xB9C82AF4] [] []
ORA-00600: internal error code, arguments: [kcbrapply_12], [], [], [], [], [], [], []
Thu Mar 24 21:50:23 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p002_13387.trc:
ORA-10562: Error occurred while applying redo to data block (file# 1, block# 11042)
ORA-10564: tablespace SYSTEM
ORA-01110: data file 1: '/oracle/oradata/orcl/system01.dbf'
ORA-00607: Internal error occurred while making a change to a data block
ORA-00600: internal error code, arguments: [kcfrbd_3], [1], [2293759], [1], [0], [64000], [], []

ALTER DATABASE RECOVER datafile 3 报错
该文件恢复主要报ORA-600 kcbrsearchflist_2,ORA-600 kdxlin:psno out of range,ORA-600 kcbs_dump_adv_state等错误

Thu Mar 24 21:52:04 2016
ALTER DATABASE RECOVER  datafile 3
Thu Mar 24 21:52:04 2016
Media Recovery Start
 parallel recovery started with 15 processes
Thu Mar 24 21:52:04 2016
Recovery of Online Redo Log: Thread 1 Group 1 Seq 15050 Reading mem 0
  Mem# 0 errs 0: /oracle/oradata/orcl/redo01.log
Thu Mar 24 21:52:05 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p007_13462.trc:
ORA-00600: internal error code, arguments: [kdxlin:psno out of range], [], [], [], [], [], [], []
Thu Mar 24 21:52:05 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p001_13450.trc:
ORA-00600: internal error code, arguments: [kcbrsearchflist_2], [], [], [], [], [], [], []
Thu Mar 24 21:52:05 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p007_13462.trc:
ORA-07445: exception encountered: core dump [kcbs_dump_adv_state()+464] [SIGSEGV] [Address not mapped to object] [0xB9F076F4] [] []
ORA-00600: internal error code, arguments: [kdxlin:psno out of range], [], [], [], [], [], [], []
Thu Mar 24 21:52:05 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p001_13450.trc:
ORA-07445: exception encountered: core dump [kcbs_dump_adv_state()+464] [SIGSEGV] [Address not mapped to object] [0xB9C874F4] [] []
ORA-00600: internal error code, arguments: [kcbrsearchflist_2], [], [], [], [], [], [], []
Thu Mar 24 21:52:06 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p007_13462.trc:
ORA-07445: exception encountered: core dump [kcbs_dump_adv_state()+464] [SIGSEGV] [Address not mapped to object] [0xB9F066F4] [] []
ORA-07445: exception encountered: core dump [kcbs_dump_adv_state()+464] [SIGSEGV] [Address not mapped to object] [0xB9F076F4] [] []
ORA-00600: internal error code, arguments: [kdxlin:psno out of range], [], [], [], [], [], [], []
Thu Mar 24 21:52:06 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p001_13450.trc:
ORA-07445: exception encountered: core dump [kcbs_dump_adv_state()+464] [SIGSEGV] [Address not mapped to object] [0xB9C864F4] [] []
ORA-07445: exception encountered: core dump [kcbs_dump_adv_state()+464] [SIGSEGV] [Address not mapped to object] [0xB9C874F4] [] []
ORA-00600: internal error code, arguments: [kcbrsearchflist_2], [], [], [], [], [], [], []

恢复过程

SQL> startup mount
ORACLE instance started.
Total System Global Area 2147483648 bytes
Fixed Size          1220432 bytes
Variable Size             369098928 bytes
Database Buffers         1761607680 bytes
Redo Buffers               15556608 bytes
Database mounted.
SQL> select file# from v$datafile;
     FILE#
----------
         1
         2
         3
         4
         5
         6
6 rows selected.
SQL> recover datafile 1;
ORA-03113: end-of-file on communication channel
SQL> startup mount
ORACLE instance started.
Total System Global Area 2147483648 bytes
Fixed Size          1220432 bytes
Variable Size             369098928 bytes
Database Buffers         1761607680 bytes
Redo Buffers               15556608 bytes
Database mounted.
SQL> recover datafile 3;
ORA-03113: end-of-file on communication channel
SQL> startup mount
ORACLE instance started.
Total System Global Area 2147483648 bytes
Fixed Size          1220432 bytes
Variable Size             369098928 bytes
Database Buffers         1761607680 bytes
Redo Buffers               15556608 bytes
Database mounted.
SQL> recover datafile 5;
Media recovery complete.
SQL> recover datafile 6;
Media recovery complete.
SQL> recover datafile 4;
Media recovery complete.
SQL> recover datafile 2;
Media recovery complete.
SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-00607: Internal error occurred while making a change to a data block
ORA-00600: internal error code, arguments: [4194], [57], [11], [], [], [], [],
[]
SQL> select open_mode from v$database;
OPEN_MODE
----------
READ WRITE

这次运气不错,system坏的是mon_mods$,undo异常可以重建,基本上可以说没有数据丢失,数据库恢复完成.
重要的库,通过open过程报错信息,分析可能的坏块所属对象,然后确定处理方法,以免造成永久性数据块损坏.

Solaris rm datafile recovery—利用句柄误删除数据文件恢复

联系：手机/微信(+86 17813235971) QQ(107644445)

标题：Solaris rm datafile recovery—利用句柄误删除数据文件恢复

今天早上接到有客户恢复请求，他一不小心在solaris系统中使用rm -rf oradata命令把一个分区下面的所有数据文件全部删除了。现在不知道怎么办,请求我们给予支持.上去检查发现比较幸运,数据库没有crash,看来直接考虑使用句柄进行恢复

root@CNISORCLSVR # uname -a
SunOS CNISORCLSVR 5.9 Generic_112233-08 sun4u sparc SUNW,Sun-Fire-880
root@CNISORCLSVR # ps -ef|grep lgwr
  oracle   597     1  0   Mar 05 ?       17:14 ora_lgwr_xifenfei
    root 28069 28043  0 18:51:17 pts/2    0:00 grep lgwr
root@CNISORCLSVR # ls -ltr
total 189348454
-r--r--r--   1 oracle   dba       657920 Apr 26  2002 12
c---------   1 root     sys       13, 12 Mar 27  2004 8
c---------   1 root     sys       13, 12 Mar 27  2004 10
-rw-r-----   0 oracle   dba      34359730176 Nov 12  2013 291
-rw-r-----   0 oracle   dba      1073750016 Nov 13  2013 293
D---------   1 root     root           0 Mar  5 19:31 11
-rw-r-----   1 oracle   dba         1758 Mar  5 22:04 9
-rw-rw----   1 oracle   dba           24 Mar  5 22:04 13
s---------   0 root     root           0 Mar  8 00:45 14
-rw-r-----   1 oracle   dba      1887444992 Mar 12 03:27 289
-rw-r-----   1 oracle   dba      943726592 Mar 12 11:17 290
-rw-r-----   0 oracle   dba      4294975488 Mar 13 00:09 292
-rw-r-----   0 oracle   dba      268443648 Mar 13 01:33 288
-rw-r-----   0 oracle   dba      536879104 Mar 13 01:33 279
-rw-r-----   1 oracle   dba      134225920 Mar 13 01:33 278
-rw-r-----   0 oracle   dba      134225920 Mar 13 01:33 269
-rw-r-----   1 oracle   dba      268443648 Mar 13 01:33 267
-rw-r-----   1 oracle   dba      148119552 Mar 13 01:33 266
-rw-r-----   1 oracle   dba      10493952 Mar 13 01:33 265
-rw-r-----   1 oracle   dba      26222592 Mar 13 01:33 264
-rw-r-----   1 oracle   dba      62922752 Mar 13 01:33 263
-rw-r-----   1 oracle   dba      20979712 Mar 13 01:33 262
-rw-r-----   0 oracle   dba      134225920 Mar 13 01:33 287
-rw-r-----   1 oracle   dba      209723392 Mar 13 01:33 285
-rw-r-----   0 oracle   dba      536879104 Mar 13 01:33 283
-rw-r-----   1 oracle   dba      67117056 Mar 13 01:33 282
-rw-r-----   0 oracle   dba      536879104 Mar 13 01:33 281
-rw-r-----   0 oracle   dba      536879104 Mar 13 01:33 280
-rw-r-----   0 oracle   dba      536879104 Mar 13 01:33 276
-rw-r-----   0 oracle   dba      1073750016 Mar 13 01:33 275
-rw-r-----   0 oracle   dba      2214600704 Mar 13 01:33 274
-rw-r-----   0 oracle   dba      134225920 Mar 13 01:33 273
-rw-r-----   0 oracle   dba      536879104 Mar 13 01:33 272
c---------   1 root     sys       13,  2 Mar 13 02:00 5
c---------   1 root     sys       13,  2 Mar 13 02:00 4
c---------   1 root     sys       13,  2 Mar 13 02:00 3
c---------   1 root     sys       13,  2 Mar 13 02:00 2
c---------   1 root     sys       13,  2 Mar 13 02:00 1
c---------   1 root     sys       13,  2 Mar 13 02:00 0
--w-------   1 oracle   dba      4640842 Mar 13 04:43 7
--w-------   1 oracle   dba      4640842 Mar 13 04:43 6
-rw-r-----   0 oracle   dba      1207967744 Mar 13 18:21 271
-rw-r-----   0 oracle   dba      15929974784 Mar 13 18:39 284
-rw-r-----   0 oracle   dba      134225920 Mar 13 18:45 277
-rw-r-----   0 oracle   dba      2122326016 Mar 13 18:46 286
-rw-r-----   0 oracle   dba      9261031424 Mar 13 18:47 270
-rw-r-----   0 oracle   dba      18253619200 Mar 13 18:47 268
-rw-r-----   1 oracle   dba      134225920 Mar 13 18:51 261
-rw-r-----   1 oracle   dba      524296192 Mar 13 18:51 260
-rw-r-----   1 oracle   dba      104858112 Mar 13 18:52 259
-rw-r-----   1 oracle   dba      1941504 Mar 13 18:52 258
-rw-r-----   1 oracle   dba      1941504 Mar 13 18:52 257
-rw-r-----   1 oracle   dba      1941504 Mar 13 18:52 256
SQL> select file#,name from v$datafile wehre name like '/disk%';
     FILE# NAME
---------- --------------------------------------------------
         9 /disk/oradata/xifenfei/xifenfei.dbf
        10 /disk/oradata/xifenfei/CSSN.dbf
        11 /disk/oradata/xifenfei/NCSSN.dbf
        12 /disk/oradata/xifenfei/CSIC_RDS.dbf
        13 /disk/oradata/xifenfei/CSIC_CSSN.dbf
        14 /disk/oradata/xifenfei/CNIS_I.dbf
        15 /disk/oradata/xifenfei/CNIS.dbf
        16 /disk/oradata/xifenfei/TRSWCM6_CSSN.dbf
        17 /disk/oradata/xifenfei/TRSWCM6_CSSN_PLUGINS.dbf
        18 /disk/oradata/xifenfei/DIGIREF.dbf
        20 /disk/oradata/xifenfei/TRSWCM.dbf
        21 /disk/oradata/xifenfei/TRSWCM52_NSLC.dbf
        22 /disk/oradata/xifenfei/TRSWCM52_PLUGINS_NSLC.dbf
        24 /disk/oradata/xifenfei/TRSWCM_PLUGINS.dbf
        25 /disk/oradata/xifenfei/CNIS_ALL.dbf
        27 /disk/oradata/xifenfei/undotbs01.dbf
        28 /disk/oradata/xifenfei/TRS_IDS02.dbf
        29 /disk/oradata/xifenfei/xdb02.dbf

在solaris中比较郁闷,虽然进入了fd目录,但是无法知道哪些文件句柄是删除,哪些是正常的,因此没有办法，只能使用lsof进一步分析

root@CNISORCLSVR # pkgadd -d lsof-4.80-sol9-sparc-local
The following packages are available:
  1  IBMlsof     lsof
                 (sparc) 4.80
Select package(s) you wish to process (or 'all' to process
all packages). (default: all) [?,??,q]: all
Processing package instance <IBMlsof> from </tmp/lsof-4.80-sol9-sparc-local>
lsof
(sparc) 4.80
Vic Abell
Using </usr/local> as the package base directory.
## Processing package information.
## Processing system information.
## Verifying disk space requirements.
## Checking for conflicts with packages already installed.
The following files are already installed on the system and are being
used by another package:
* /usr/local/bin <attribute change only>
* - conflict with a file which does not belong to any package.
Do you want to install these conflicting files [y,n,?,q] y
## Checking for setuid/setgid programs.
The following files are being installed with setuid and/or setgid
permissions:
 /usr/local/bin/lsof <setgid sys>
 /usr/local/bin/sparcv7/lsof <setgid sys>
 /usr/local/bin/sparcv9/lsof <setgid sys>
Do you want to install these as setuid/setgid files [y,n,?,q] y
Installing lsof as <IBMlsof>
## Installing part 1 of 1.
/usr/local/bin/lsof
/usr/local/bin/sparcv7/lsof
/usr/local/bin/sparcv9/lsof
/usr/local/doc/lsof/00.README.FIRST
/usr/local/doc/lsof/00CREDITS
/usr/local/doc/lsof/00DCACHE
/usr/local/doc/lsof/00DIALECTS
/usr/local/doc/lsof/00DIST
/usr/local/doc/lsof/00FAQ
/usr/local/doc/lsof/00LSOF-L
/usr/local/doc/lsof/00MANIFEST
/usr/local/doc/lsof/00PORTING
/usr/local/doc/lsof/00QUICKSTART
/usr/local/doc/lsof/00README
/usr/local/doc/lsof/00TEST
/usr/local/doc/lsof/00XCONFIG
/usr/local/doc/lsof/lsof.man
/usr/local/man/man8/lsof.8
[ verifying class <none> ]
Installation of <IBMlsof> was successful.
root@CNISORCLSVR # ./lsof -p 597
COMMAND PID   USER   FD   TYPE        DEVICE    SIZE/OFF   NODE NAME
oracle  597 oracle  cwd   VDIR          85,5        2048 106299 /export/home/oracle/app/product/9.2.0/dbs
oracle  597 oracle  txt   VREG          85,5    61272672   2332 /export/home/oracle/app/product/9.2.0/bin/oracle
…………
oracle  597 oracle  260u  VREG          85,5   524296192 106517 /export/home/oracle/oradata/xifenfei/system01.dbf
oracle  597 oracle  261u  VREG          85,5   134225920 106518 /export/home/oracle/oradata/xifenfei/undotbs01.dbf
…………
oracle  597 oracle  268u  VREG        118,70 18253619200  109 /disk (/dev/dsk/c2t600A0B800029CEFA0000036C491B270Bd0s6)
oracle  597 oracle  269u  VREG        118,70   134225920  110 /disk (/dev/dsk/c2t600A0B800029CEFA0000036C491B270Bd0s6)
oracle  597 oracle  270u  VREG        118,70  9261031424  111 /disk (/dev/dsk/c2t600A0B800029CEFA0000036C491B270Bd0s6)
…………
oracle  597 oracle  293u  VREG        118,70  1073750016   14 /disk (/dev/dsk/c2t600A0B800029CEFA0000036C491B270Bd0s6)

到这一步,基本上定位/disk部分是我们需要恢复的数据,从而可以定位到句柄,然后结合数据文件信息,直接使用cp命令恢复出来文件.然后数据库层面recover并且online.

cd /proc/597/fd
cp 269 /disk/oradata/cnisora2/CSSN.dbf
chown oracle:dba /disk/oradata/xifenfei/CSSN.dbf
SQL> recover datafile 10;
ORA-00283: 恢复会话因错误而取消
ORA-01124: 无法恢复数据文件 10 - 文件在使用中或在恢复中
ORA-01110: 数据文件 10: '/disk/oradata/xifenfei/CSSN.dbf'
SQL> alter database datafile 10 offline;
数据库已更改。
SQL> recover datafile 10;
完成介质恢复。
SQL> alter database datafile 10 online;
数据库已更改。

分类目录归档：Oracle备份恢复