ORA-28040: No matching authentication protocol

电脑上面安装了三个版本的数据库10.2.0.3,11.2.0.1,12.1.0.2版本,使用他们分别尝试连接另外一个12.2.0.3的环境数据库发现只有12.1的版本客户端可以连接到12.2上面,其他版本报ORA-28040错误
分别测试连接,报ORA-28040错误

C:\Users\XIFENFEI>sqlplus sys/oracle@192.168.137.121/orcl12c2 as sysdba
SQL*Plus: Release 12.1.0.2.0 Production on 星期三 7月 20 00:03:01 2016
Copyright (c) 1982, 2014, Oracle.  All rights reserved.
连接到:
Oracle Database 12c Enterprise Edition Release 12.2.0.0.3 - 64bit Production
SQL>
SQL>
C:\Users\XIFENFEI>D:\app\FAL\product\11.2.0\dbhome_1\bin\sqlplus sys/oracle@192.168.137.121/orcl12c2 as sysdba
SQL*Plus: Release 11.2.0.1.0 Production on 星期三 7月 20 00:10:33 2016
Copyright (c) 1982, 2010, Oracle.  All rights reserved.
ERROR:
ORA-28040: No matching authentication protocol
C:\Users\XIFENFEI>D:\app\product\10.2.0\db_1\bin\sqlplus sys/oracle@192.168.137.121/orcl12c2 as sysdba
SQL*Plus: Release 10.2.0.3.0 - Production on 星期三 7月 20 00:09:30 2016
Copyright (c) 1982, 2006, Oracle.  All Rights Reserved.
ERROR:
ORA-28040: 没有匹配的验证协议
请输入用户名:

ORA-28040错误说明

28040, 0000, "No matching authentication protocol"
// *Cause:  There was no acceptable authentication protocol for
//          either client or server.
// *Action: The administrator should set the values of the
//          SQLNET.ALLOWED_LOGON_VERSION_SERVER and
//          SQLNET.ALLOWED_LOGON_VERSION_CLIENT parameters, on both the
//          client and on the server, to values that match the minimum
//          version software supported in the system.
//          This error is also raised when the client is authenticating to
//          a user account which was created without a verifier suitable for
//          the client software version. In this situation, that account's
//          password must be reset, in order for the required verifier to
//          be generated and allow authentication to proceed successfully.

解决方法
在服务端的sqlnet.ora文件中加入上如下信息,然后重启监听

[oracle@ora1221 admin]$ vi sqlnet.ora
SQLNET.ALLOWED_LOGON_VERSION_CLIENT=8
SQLNET.ALLOWED_LOGON_VERSION_SERVER=8
[oracle@ora1221 admin]$ lsnrctl stop
LSNRCTL for Linux: Version 12.2.0.0.3 - Production on 17-JUN-2016 06:36:13
Copyright (c) 1991, 2016, Oracle.  All rights reserved.
Connecting to (ADDRESS=(PROTOCOL=tcp)(HOST=)(PORT=1521))
The command completed successfully
[oracle@ora1221 admin]$ lsnrctl start
LSNRCTL for Linux: Version 12.2.0.0.3 - Production on 17-JUN-2016 06:36:17
Copyright (c) 1991, 2016, Oracle.  All rights reserved.
Starting /u01/app/oracle/product/12.2.0/db_2/bin/tnslsnr: please wait...
TNSLSNR for Linux: Version 12.2.0.0.3 - Production
Log messages written to /u01/app/oracle/diag/tnslsnr/ora1221/listener/alert/log.xml
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=ora1221)(PORT=1521)))
Connecting to (ADDRESS=(PROTOCOL=tcp)(HOST=)(PORT=1521))
STATUS of the LISTENER
------------------------
Alias                     LISTENER
Version                   TNSLSNR for Linux: Version 12.2.0.0.3 - Production
Start Date                17-JUN-2016 06:36:17
Uptime                    0 days 0 hr. 0 min. 0 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
Listener Log File         /u01/app/oracle/diag/tnslsnr/ora1221/listener/alert/log.xml
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=ora1221)(PORT=1521)))
The listener supports no services
The command completed successfully

sqlnet中参数说明
SQLNET.ALLOWED_LOGON_VERSION_SERVER 是服务端参数对于jdbc和oci都生效,该参数不是只具体数据库版本,而是指授权协议的版本
SQLNET.ALLOWED_LOGON_VERSION_CLIENT 是指作为客户端连接其他实例的时候生效,也是只授权协议版本,而且该参数只对oci生效,jdbc 需要通过在代码中类似实现

OracleDataSource ods = new OracleDataSource();
ods.setURL(jdbcURL);
ods.setUser("scott");
ods.setPassword("tiger");
Properties props = new Properties();
props.put("oracle.jdbc.allowedLogonVersion", 12);
ods.setConnectionProperties(props);
Connection con = ods.getConnection();

上述两个参数可以填写值
12a for Oracle Database 12c release 12.1.0.2 or later authentication protocols (strongest protection)
12 for the critical patch updates CPUOct2012 and later Oracle Database 11g authentication protocols (recommended)
11 for Oracle Database 11g authentication protocols (default)
10 for Oracle Database 10g authentication protocols
9 for Oracle9i Database authentication protocol
8 for Oracle8i Database authentication protocol
allowed_logon_version_server


具体描述请见:http://docs.oracle.com/database/121/NETRF/sqlnet.htm#NETRF2010

再次测试连接

C:\Users\XIFENFEI>D:\app\FAL\product\11.2.0\dbhome_1\bin\sqlplus sys/oracle@192.168.137.121/orcl12c2 as sysdba
SQL*Plus: Release 11.2.0.1.0 Production on 星期三 7月 20 00:20:21 2016
Copyright (c) 1982, 2010, Oracle.  All rights reserved.
连接到:
Oracle Database 12c Enterprise Edition Release 12.2.0.0.3 - 64bit Production
SQL> exit
从 Oracle Database 12c Enterprise Edition Release 12.2.0.0.3 - 64bit Production 断开
C:\Users\XIFENFEI>D:\app\product\10.2.0\db_1\bin\sqlplus sys/oracle@192.168.137.121/orcl12c2 as sysdba
SQL*Plus: Release 10.2.0.3.0 - Production on 星期三 7月 20 00:20:28 2016
Copyright (c) 1982, 2006, Oracle.  All Rights Reserved.
连接到:
Oracle Database 12c Enterprise Edition Release 12.2.0.0.3 - 64bit Production
SQL>
C:\Users\XIFENFEI>sqlplus sys/oracle@192.168.137.121/orcl12c2 as sysdba
SQL*Plus: Release 12.1.0.2.0 Production on 星期三 7月 20 00:20:55 2016
Copyright (c) 1982, 2014, Oracle.  All rights reserved.
连接到:
Oracle Database 12c Enterprise Edition Release 12.2.0.0.3 - 64bit Production

该问题在jdbc中也表现明显,建议参考Starting With Oracle JDBC Drivers (文档 ID 401934.1)和Client / Server Interoperability Support Matrix for Different Oracle Versions (文档 ID 207303.1)选择完全兼容性的客户端和jdbc版本,另外可以关注相关文章:
ORA-28040 and SQLNET.ALLOWED_LOGON_VERSION_CLIENT for JDBC Thin Clients (文档 ID 2000339.1)
ORA-28040 Using JDBC Connection to 12c Database (文档 ID 2111118.1)
JDBC Version 10.2.0.4 Produces ORA-28040 Connecting To Oracle 12c (12.1.0.2) Database (文档 ID 2023160.1)
ORA-28040 and SQLNET.ALLOWED_LOGON_VERSION_CLIENT for JDBC Thin Clients (文档 ID 2000339.1)

Oracle 12c undo异常处理—root pdb undo异常

在12c pdb环境中如果root pdb的undo文件异常,数据库该如何恢复呢?这篇文章模拟undo丢失的情况下进行恢复
模拟环境
三个会话,其中第一个会话对pdb1中的表进行操作,并且有事务未提交,第二个会话对pdb2进行操作,也未提交事务;第三个会话直接abort库,模拟突然库异常,然后删除root pdb下面的undo文件

--会话1
[oracle@ora1221 oradata]$ sqlplus  / as sysdba
SQL*Plus: Release 12.2.0.0.3 Production on Thu Jun 16 22:24:20 2016
Copyright (c) 1982, 2016, Oracle.  All rights reserved.
Connected to an idle instance.
SQL> startup
ORACLE instance started.
Total System Global Area 2516582400 bytes
Fixed Size                  8260048 bytes
Variable Size             671090224 bytes
Database Buffers         1828716544 bytes
Redo Buffers                8515584 bytes
Database mounted.
Database opened.
SQL>
SQL>
SQL>
SQL> show pdbs;
    CON_ID CON_NAME                       OPEN MODE  RESTRICTED
---------- ------------------------------ ---------- ----------
         2 PDB$SEED                       READ ONLY  NO
         3 PDB1                           MOUNTED
         4 PDB2                           MOUNTED
SQL> alter session set container=pdb1;
Session altered.
SQL> alter database open;
Database altered.
SQL>  create user chf identified by oracle;
User created.
SQL> grant dba to chf;
Grant succeeded.
SQL> create table chf.t_xifenfei_p1 as
  2  select * from dba_objects;
Table created.
SQL> insert into chf.t_xifenfei_p1
  2  select * from dba_objects;
72427 rows created.
SQL> select count(*) from chf.t_xifenfei_p1;
  COUNT(*)
----------
    144853
--会话2
[oracle@ora1221 ~]$ ss
SQL*Plus: Release 12.2.0.0.3 Production on Thu Jun 16 22:34:01 2016
Copyright (c) 1982, 2016, Oracle.  All rights reserved.
Connected to:
Oracle Database 12c Enterprise Edition Release 12.2.0.0.3 - 64bit Production
SQL> alter session set container=pdb2;
Session altered.
SQL> alter database open;
Database altered.
SQL>  create user chf identified by oracle;
User created.
SQL> grant dba to chf;
Grant succeeded.
SQL>  create table chf.t_xifenfei_p2
  2  as select * from dba_objects;
Table created.
SQL> delete from chf.t_xifenfei_p2;
72426 rows deleted.
SQL> select count(*) from chf.t_xifenfei_p2;
  COUNT(*)
----------
    0
--会话3
[oracle@ora1221 ~]$ ss
SQL*Plus: Release 12.2.0.0.3 Production on Thu Jun 16 22:36:16 2016
Copyright (c) 1982, 2016, Oracle.  All rights reserved.
Connected to:
Oracle Database 12c Enterprise Edition Release 12.2.0.0.3 - 64bit Production
SQL> shutdown abort
ORACLE instance shut down.
--删除cdb undo文件
[oracle@ora1221 orcl12c2]$ ls -ltr
total 2040912
drwxr-x---. 2 oracle oinstall      4096 Jun 16 18:26 pdbseed
drwxr-x---. 2 oracle oinstall      4096 Jun 16 18:27 pdb2
drwxr-x---. 2 oracle oinstall      4096 Jun 16 18:28 pdb1
-rw-r-----. 1 oracle oinstall 209715712 Jun 16 22:24 redo03.log
-rw-r-----. 1 oracle oinstall   5251072 Jun 16 22:24 users01.dbf
-rw-r-----. 1 oracle oinstall  34611200 Jun 16 22:25 temp01.dbf
-rw-r-----. 1 oracle oinstall 849354752 Jun 16 22:35 system01.dbf
-rw-r-----. 1 oracle oinstall  73408512 Jun 16 22:35 undotbs01.dbf
-rw-r-----. 1 oracle oinstall 492838912 Jun 16 22:35 sysaux01.dbf
-rw-r-----. 1 oracle oinstall 209715712 Jun 16 22:36 redo02.log
-rw-r-----. 1 oracle oinstall 209715712 Jun 16 22:36 redo01.log
-rw-r-----. 1 oracle oinstall  18726912 Jun 16 22:36 control02.ctl
-rw-r-----. 1 oracle oinstall  18726912 Jun 16 22:36 control01.ctl
[oracle@ora1221 orcl12c2]$ rm undotbs01.dbf
[oracle@ora1221 orcl12c2]$ ls -l un*
ls: cannot access un*: No such file or directory

启动数据库
由于有undo文件丢失数据库在启动的时候检测到文件丢失(ORA-01157),无法open,offline文件后依旧无法启动(ORA-00376)

[oracle@ora1221 orcl12c2]$ ss
SQL*Plus: Release 12.2.0.0.3 Production on Thu Jun 16 22:51:21 2016
Copyright (c) 1982, 2016, Oracle.  All rights reserved.
Connected to an idle instance.
SQL> startup pfile='/tmp/pfile'
ORACLE instance started.
Total System Global Area 2516582400 bytes
Fixed Size                  8260048 bytes
Variable Size             671090224 bytes
Database Buffers         1828716544 bytes
Redo Buffers                8515584 bytes
Database mounted.
ORA-01157: cannot identify/lock data file 4 - see DBWR trace file
ORA-01110: data file 4: '/u01/app/oracle/oradata/orcl12c2/undotbs01.dbf'
offline 数据文件
SQL> alter database datafile 4 offline ;
alter database datafile 4 offline
*
ERROR at line 1:
ORA-01145: offline immediate disallowed unless media recovery enabled
SQL>  alter database datafile 4 offline drop;
Database altered.
SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-00603: ORACLE server session terminated by fatal error
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00604: error occurred at recursive SQL level 1
ORA-00376: file 4 cannot be read at this time
ORA-01110: data file 4: '/u01/app/oracle/oradata/orcl12c2/undotbs01.dbf'
Process ID: 7547
Session ID: 16 Serial number: 56234

把undo_management修改为manual后启动库,依旧报ORA-00376

SQL> startup pfile='/tmp/pfile' mount;
ORACLE instance started.
Total System Global Area 2516582400 bytes
Fixed Size                  8260048 bytes
Variable Size             671090224 bytes
Database Buffers         1828716544 bytes
Redo Buffers                8515584 bytes
Database mounted.
SQL> show parameter undo_management;
NAME                                 TYPE
------------------------------------ ----------------------
VALUE
------------------------------
undo_management                      string
MANUAL
SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-00603: ORACLE server session terminated by fatal error
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00604: error occurred at recursive SQL level 1
ORA-00376: file 4 cannot be read at this time
ORA-01110: data file 4: '/u01/app/oracle/oradata/orcl12c2/undotbs01.dbf'
Process ID: 7981
Session ID: 16 Serial number: 56572

设置_corrupted_rollback_segments参数

SQL> startup pfile='/tmp/pfile' mount;
ORACLE instance started.
Total System Global Area 2516582400 bytes
Fixed Size                  8260048 bytes
Variable Size             671090224 bytes
Database Buffers         1828716544 bytes
Redo Buffers                8515584 bytes
Database mounted.
SQL> show parameter _corrupted_rollback_segments;
NAME                                 TYPE
------------------------------------ ----------------------
VALUE
------------------------------
_corrupted_rollback_segments         string
_SYSSMU1_3200770482$, _SYSSMU2
_3597554035$, _SYSSMU3_2898427
493$, _SYSSMU4_670955920$, _SY
SSMU5_1233449977$, _SYSSMU6_32
67641983$, _SYSSMU7_2822479342
$, _SYSSMU8_1645196706$, _SYSS
MU9_3032014485$, _SYSSMU10_474
465626$
SQL> alter database open;
Database altered.

通过设置_corrupted_rollback_segments参数之后,数据库正常启动,下面继续其他pdb

open pdb1

SQL> alter session set container=pdb1;
Session altered.
SQL> alter database open;
Database altered.
SQL> select count(*) from chf.t_xifenfei_p1;
  COUNT(*)
----------
     72426

pdb2 open

SQL> alter session set container=pdb2;
Session altered.
SQL> alter database open;
Database altered.
SQL> select count(*) from chf.t_xifenfei_p2;
  COUNT(*)
----------
     72426

至此数据库基本上恢复完成,但是看到的pdb里面两个测试表的数据和我们预测的有一定的偏差,看来cdb中的undo和pdb中的undo还是有一定的依赖关系.同时也说明了root的undo异常对于其他pdb的open最少在恢复上面影响不大.下一篇测试业务pdb中undo异常处理

Oracle 12c redo 丢失恢复

模拟redo丢失
对数据库的一个pdb模拟事务操作,然后abort库,并且删除所有redo,模拟生产环境redo丢失的case

[oracle@ora1221 oradata]$ ss
SQL*Plus: Release 12.2.0.0.3 Production on Wed Jun 15 10:13:20 2016
Copyright (c) 1982, 2016, Oracle.  All rights reserved.
Connected to an idle instance.
SQL> startup
ORACLE instance started.
Total System Global Area 2516582400 bytes
Fixed Size                  8260048 bytes
Variable Size             671090224 bytes
Database Buffers         1828716544 bytes
Redo Buffers                8515584 bytes
Database mounted.
Database opened.
SQL>
SQL>
SQL> set pages 100
SQL> show pdbs;
    CON_ID CON_NAME                       OPEN MODE  RESTRICTED
---------- ------------------------------ ---------- ----------
         2 PDB$SEED                       READ ONLY  NO
         3 PDB1                           MOUNTED
         4 PDB2                           MOUNTED
SQL> select con_id,file#,checkpoint_change# from v$datafile_header order by 1;
    CON_ID      FILE# CHECKPOINT_CHANGE#
---------- ---------- ------------------
         1          1            1500157
         1          3            1500157
         1          4            1500157
         1          7            1500157
         2          5            1371280
         2          6            1371280
         2          8            1371280
         3          9            1499902
         3         12            1499902
         3         11            1499902
         3         10            1499902
         4         15            1499903
         4         14            1499903
         4         13            1499903
         4         16            1499903
15 rows selected.
SQL> alter PLUGGABLE database pdb1 open;
Pluggable database altered.
SQL> show pdbs;
    CON_ID CON_NAME                       OPEN MODE  RESTRICTED
---------- ------------------------------ ---------- ----------
         2 PDB$SEED                       READ ONLY  NO
         3 PDB1                           READ WRITE NO
         4 PDB2                           MOUNTED
SQL> alter session set container=pdb1;
Session altered.
SQL> create user chf identified by oracle;
User created.
SQL> grant dba to chf;
Grant succeeded.
SQL> create table chf.t_pdb1_xifenfei as select * from dba_objects;
Table created.
SQL> delete from chf.t_pdb1_xifenfei;
72426 rows deleted.
--另外一个节点
[oracle@ora1221 ~]$ ss
SQL*Plus: Release 12.2.0.0.3 Production on Wed Jun 15 10:19:21 2016
Copyright (c) 1982, 2016, Oracle.  All rights reserved.
Connected to:
Oracle Database 12c Enterprise Edition Release 12.2.0.0.3 - 64bit Production
SQL> shutdown abort
ORACLE instance shut down.
[oracle@ora1221 orcl12c2]$ ls redo*
redo01.log  redo02.log  redo03.log
[oracle@ora1221 orcl12c2]$ rm redo0*
[oracle@ora1221 orcl12c2]$ ls -l redo*
ls: cannot access redo*: No such file or directory

尝试启动数据库

[oracle@ora1221 orcl12c2]$ ss
SQL*Plus: Release 12.2.0.0.3 Production on Wed Jun 15 10:26:20 2016
Copyright (c) 1982, 2016, Oracle.  All rights reserved.
Connected to an idle instance.
SQL> startup mount;
ORACLE instance started.
Total System Global Area 2516582400 bytes
Fixed Size                  8260048 bytes
Variable Size             671090224 bytes
Database Buffers         1828716544 bytes
Redo Buffers                8515584 bytes
Database mounted.
SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-00313: open failed for members of log group 1 of thread 1
ORA-00312: online log 1 thread 1: '/u01/app/oracle/oradata/orcl12c2/redo01.log'
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory
Additional information: 7

使用隐含参数启动

----pfile里面增加
_allow_error_simulation=TRUE
_allow_resetlogs_corruption=true
~
SQL> shutdown immediate;
ORA-01109: database not open
Database dismounted.
ORACLE instance shut down
SQL> startup pfile='/tmp/pfile' mount
ORACLE instance started.
Total System Global Area 2516582400 bytes
Fixed Size                  8260048 bytes
Variable Size             671090224 bytes
Database Buffers         1828716544 bytes
Redo Buffers                8515584 bytes
Database mounted.
SQL> alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-00603: ORACLE server session terminated by fatal error
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [kcbzib_kcrsds_1], [], [], [], [],
[], [], [], [], [], [], []
Process ID: 36797
Session ID: 16 Serial number: 24277

继续重启库
ORA-600 kcbzib_kcrsds_1错误尝试重启数据库,如果不行,考虑使用bbed修改文件头信息

SQL> startup mount pfile='/tmp/pfile'
ORACLE instance started.
Total System Global Area 2516582400 bytes
Fixed Size                  8260048 bytes
Variable Size             671090224 bytes
Database Buffers         1828716544 bytes
Redo Buffers                8515584 bytes
Database mounted.
SQL> recover database until cancel;
ORA-00283: recovery session canceled due to errors
ORA-16433: The database or pluggable database must be opened in read/write
mode.
SQL> alter database backup controlfile to trace as '/tmp/ctl';
alter database backup controlfile to trace as '/tmp/ctl'
*
ERROR at line 1:
ORA-16433: The database or pluggable database must be opened in read/write
mode.

重建控制文件

SQL> startup nomount pfile='/tmp/pfile'
ORACLE instance started.
Total System Global Area 2516582400 bytes
Fixed Size                  8260048 bytes
Variable Size             671090224 bytes
Database Buffers         1828716544 bytes
Redo Buffers                8515584 bytes
SQL> CREATE CONTROLFILE REUSE DATABASE "orcl12c2" RESETLOGS  NOARCHIVELOG
  2      MAXLOGFILES 50
  3      MAXLOGMEMBERS 5
  4      MAXDATAFILES 100
  5      MAXINSTANCES 1
  6      MAXLOGHISTORY 226
  7  LOGFILE
  8    GROUP 1 '/u01/app/oracle/oradata/orcl12c2/redo01.log'  SIZE 200M,
  9    GROUP 2 '/u01/app/oracle/oradata/orcl12c2/redo02.log'  SIZE 200M,
 10    GROUP 3 '/u01/app/oracle/oradata/orcl12c2/redo03.log'  SIZE 200M
 11  DATAFILE
 12  '/u01/app/oracle/oradata/orcl12c2/system01.dbf',
 13  '/u01/app/oracle/oradata/orcl12c2/sysaux01.dbf',
 14  '/u01/app/oracle/oradata/orcl12c2/undotbs01.dbf',
 15  '/u01/app/oracle/oradata/orcl12c2/pdbseed/system01.dbf',
 16  '/u01/app/oracle/oradata/orcl12c2/pdbseed/sysaux01.dbf',
 17  '/u01/app/oracle/oradata/orcl12c2/users01.dbf',
 18  '/u01/app/oracle/oradata/orcl12c2/pdbseed/undotbs01.dbf',
 19  '/u01/app/oracle/oradata/orcl12c2/pdb1/system01.dbf',
 20  '/u01/app/oracle/oradata/orcl12c2/pdb1/sysaux01.dbf',
 21  '/u01/app/oracle/oradata/orcl12c2/pdb1/undotbs01.dbf',
 22  '/u01/app/oracle/oradata/orcl12c2/pdb1/users01.dbf',
 23  '/u01/app/oracle/oradata/orcl12c2/pdb2/system01.dbf',
 24  '/u01/app/oracle/oradata/orcl12c2/pdb2/sysaux01.dbf',
 25  '/u01/app/oracle/oradata/orcl12c2/pdb2/undotbs01.dbf',
 26  '/u01/app/oracle/oradata/orcl12c2/pdb2/users01.dbf'
 27  CHARACTER SET AL32UTF8
 28  ;
Control file created.
SQL> recover database until cancel;
ORA-00283: recovery session canceled due to errors
ORA-01610: recovery using the BACKUP CONTROLFILE option must be done
SQL> recover database until cancel using backup controlfile;
ORA-00279: change 1500161 generated at 06/15/2016 10:40:42 needed for thread 1
ORA-00289: suggestion :
/u01/app/oracle/product/12.2.0/db_2/dbs/arch1_1_914582438.dbf
ORA-00280: change 1500161 for thread 1 is in sequence #1
Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
/u01/app/oracle/oradata/orcl12c2/redo01.log
Log applied.
Media recovery complete.
SQL> alter database open resetlogs;
Database altered.

<strong>open过程alert日志</strong>

<strong>open pdb1</strong>

SQL> alter PLUGGABLE database pdb1  open;
Pluggable database altered.

pdb1 open alert日志

2016-06-15T11:13:39.423057+08:00
alter PLUGGABLE database pdb1  open
PDB1(3):Autotune of undo retention is turned on.
2016-06-15T11:13:39.495559+08:00
PDB1(3):Endian type of dictionary set to little
PDB1(3):[40547] Successfully onlined Undo Tablespace 2.
PDB1(3):Undo initialization finished serial:0 start:371149831 end:371149872 diff:41 ms (0.0 seconds)
PDB1(3):Database Characterset for PDB1 is AL32UTF8
PDB1(3):*********************************************************************
PDB1(3):WARNING: The following temporary tablespaces in container(PDB1)
PDB1(3):         contain no files.
PDB1(3):         This condition can occur when a backup controlfile has
PDB1(3):         been restored.  It may be necessary to add files to these
PDB1(3):         tablespaces.  That can be done using the SQL statement:
PDB1(3):
PDB1(3):         ALTER TABLESPACE <tablespace_name> ADD TEMPFILE
PDB1(3):
PDB1(3):         Alternatively, if these temporary tablespaces are no longer
PDB1(3):         needed, then they can be dropped.
PDB1(3):           Empty temporary tablespace: TEMP
PDB1(3):*********************************************************************
PDB1(3):Opatch validation is skipped for PDB PDB1 (con_id=0)
PDB1(3):Opening pdb with no Resource Manager plan active
Pluggable database PDB1 opened read write
Completed: alter PLUGGABLE database pdb1  open

open pdb2

SQL> alter PLUGGABLE database pdb2 open;
alter PLUGGABLE database pdb2 open
*
ERROR at line 1:
ORA-00600: internal error code, arguments: [17090], [], [], [], [], [], [], [],
[], [], [], []

分析alert日志和trace文件

--alert日志部分
PDB1(3):alter PLUGGABLE database pdb2 open
PDB1(3):ORA-65118 signalled during: alter PLUGGABLE database pdb2 open...
2016-06-15T11:28:57.439963+08:00
PDB1(3):Unified Audit: Audit record write to table failed due to ORA-25153.
Writing the audit record to OS spillover file. Please grep in the trace files for ORA-25153 for more diagnostic information.
Errors in file /u01/app/oracle/diag/rdbms/orcl12c2/orcl12c2/trace/orcl12c2_ora_40547.trc  (incident=29073) (PDBNAME=PDB1):
ORA-00600: internal error code, arguments: [17090], [], [], [], [], [], [], [], [], [], [], []
PDB1(3):Incident details in: /u01/app/oracle/diag/rdbms/orcl12c2/orcl12c2/incident/incdir_29073/orcl12c2_ora_40547_i29073.trc
PDB1(3):*****************************************************************
PDB1(3):An internal routine has requested a dump of selected redo.
PDB1(3):This usually happens following a specific internal error, when
PDB1(3):analysis of the redo logs will help Oracle Support with the
PDB1(3):diagnosis.
PDB1(3):It is recommended that you retain all the redo logs generated (by
PDB1(3):all the instances) during the past 12 hours, in case additional
PDB1(3):redo dumps are required to help with the diagnosis.
PDB1(3):*****************************************************************
2016-06-15T11:28:59.123041+08:00
PDB1(3):Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
2016-06-15T11:28:59.945667+08:00
Dumping diagnostic data in directory=[cdmp_20160615112859], requested by (instance=1, osid=40547), summary=[incident=29073].
2016-06-15T11:35:59.987419+08:00
PDB1(3): alter PLUGGABLE database pdb2 open
PDB1(3):ORA-65118 signalled during:  alter PLUGGABLE database pdb2 open...
--trace部分
PARSING IN CURSOR #0x7f051a3d7650 len=118 dep=1 uid=0 oct=3 lid=0 tim=372490287736 hv=1128335472 ad='0x6ca82f00' sqlid='gu930gd1n223h'
select tablespace_name, tablespace_size, allocated_space, free_space, con_id  from cdb_temp_free_space order by con_id
END OF STMT
EXEC #0x7f051a3d7650:c=0,e=144,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=4,plh=2538033465,tim=372490287732
FETCH #0x7f051a3d7650:c=0,e=290,p=0,cr=14,cu=0,mis=0,r=0,dep=1,og=4,plh=2538033465,tim=372490288109
WAIT #0x7f051bd5f870: nam='Disk file operations I/O' ela= 52 FileOperation=2 fileno=0 filetype=36 obj#=402 tim=372490288373
WAIT #0x7f051bd5f870: nam='Disk file operations I/O' ela= 17 FileOperation=8 fileno=1 filetype=36 obj#=402 tim=372490288577
WAIT #0x7f051bd5f870: nam='Disk file operations I/O' ela= 3 FileOperation=8 fileno=1 filetype=36 obj#=402 tim=372490288655
WAIT #0x7f051bd5f870: nam='Disk file operations I/O' ela= 690 FileOperation=8 fileno=1 filetype=36 obj#=402 tim=372490289365
WAIT #0x7f051bd5f870: nam='Disk file operations I/O' ela= 6 FileOperation=8 fileno=1 filetype=36 obj#=402 tim=372490289470
WAIT #0x7f051bd5f870: nam='Disk file operations I/O' ela= 445 FileOperation=8 fileno=1 filetype=36 obj#=402 tim=372490289934
WAIT #0x7f051bd5f870: nam='Disk file operations I/O' ela= 3 FileOperation=8 fileno=1 filetype=36 obj#=402 tim=372490289983
WAIT #0x7f051bd5f870: nam='Disk file operations I/O' ela= 375 FileOperation=8 fileno=1 filetype=36 obj#=402 tim=372490290374
WAIT #0x7f051bd5f870: nam='Disk file operations I/O' ela= 6 FileOperation=8 fileno=1 filetype=36 obj#=402 tim=372490290453
WAIT #0x7f051bd5f870: nam='Disk file operations I/O' ela= 367 FileOperation=8 fileno=1 filetype=36 obj#=402 tim=372490290839
WAIT #0x7f051bd5f870: nam='Disk file operations I/O' ela= 3 FileOperation=8 fileno=1 filetype=36 obj#=402 tim=372490290882
WAIT #0x7f051bd5f870: nam='Disk file operations I/O' ela= 355 FileOperation=8 fileno=1 filetype=36 obj#=402 tim=372490291252
WAIT #0x7f051bd5f870: nam='Disk file operations I/O' ela= 4 FileOperation=8 fileno=1 filetype=36 obj#=402 tim=372490291298
WAIT #0x7f051bd5f870: nam='Disk file operations I/O' ela= 276 FileOperation=8 fileno=1 filetype=36 obj#=402 tim=372490291590
WAIT #0x7f051bd5f870: nam='Disk file operations I/O' ela= 1 FileOperation=8 fileno=1 filetype=36 obj#=402 tim=372490291614
WAIT #0x7f051bd5f870: nam='Disk file operations I/O' ela= 256 FileOperation=8 fileno=1 filetype=36 obj#=402 tim=372490291879
WAIT #0x7f051bd5f870: nam='Disk file operations I/O' ela= 2 FileOperation=8 fileno=1 filetype=36 obj#=402 tim=372490291903
WAIT #0x7f051bd5f870: nam='Disk file operations I/O' ela= 261 FileOperation=8 fileno=1 filetype=36 obj#=402 tim=372490292172
WAIT #0x7f051bd5f870: nam='Disk file operations I/O' ela= 30 FileOperation=3 fileno=0 filetype=36 obj#=402 tim=372490292225
WAIT #0x7f051bd5f870: nam='Disk file operations I/O' ela= 934 FileOperation=8 fileno=1 filetype=36 obj#=402 tim=372490293171
WAIT #0x7f051bd5f870: nam='Disk file operations I/O' ela= 3 FileOperation=8 fileno=1 filetype=36 obj#=402 tim=372490293208
WAIT #0x7f051bd5f870: nam='Disk file operations I/O' ela= 245 FileOperation=8 fileno=1 filetype=36 obj#=402 tim=372490293465
WAIT #0x7f051bd5f870: nam='Disk file operations I/O' ela= 262 FileOperation=8 fileno=1 filetype=36 obj#=402 tim=372490293755
WAIT #0x7f051bd5f870: nam='Disk file operations I/O' ela= 1 FileOperation=8 fileno=1 filetype=36 obj#=402 tim=372490293780
WAIT #0x7f051bd5f870: nam='Disk file operations I/O' ela= 250 FileOperation=8 fileno=1 filetype=36 obj#=402 tim=372490294039
WAIT #0x7f051bd5f870: nam='Disk file operations I/O' ela= 256 FileOperation=8 fileno=1 filetype=36 obj#=402 tim=372490294323
WAIT #0x7f051bd5f870: nam='Disk file operations I/O' ela= 8 FileOperation=5 fileno=0 filetype=36 obj#=402 tim=372490294359
2016-06-15T11:36:00.055196+08:00
Incident 29074 created, dump file: /u01/app/oracle/diag/rdbms/orcl12c2/orcl12c2/incident/incdir_29074/orcl12c2_ora_40547_i29074.trc
ORA-00600: internal error code, arguments: [17090], [], [], [], [], [], [], [], [], [], [], []

从中可以判断出来是由于CDB$ROOT的未增加tempfile导致

SQL> alter tablespace temp add tempfile '/u01/app/oracle/oradata/orcl12c2/temp01.dbf' reuse;
Tablespace altered.
SQL>  alter PLUGGABLE database pdb2 open;
Pluggable database altered.

查看数据库恢复情况

SQL> show pdbs;
    CON_ID CON_NAME                       OPEN MODE  RESTRICTED
---------- ------------------------------ ---------- ----------
         2 PDB$SEED                       READ ONLY  NO
         3 PDB1                           READ WRITE NO
         4 PDB2                           READ WRITE NO
SQL> select con_id,file#,checkpoint_change#,resetlogs_change# from v$datafile_header;
    CON_ID      FILE# CHECKPOINT_CHANGE# RESETLOGS_CHANGE#
---------- ---------- ------------------ -----------------
         1          1            2500167           1500164
         1          3            2500167           1500164
         1          4            2500167           1500164
         2          5            1371280           1341067
         2          6            1371280           1341067
         1          7            2500167           1500164
         2          8            1371280           1341067
         3          9            2501017           1500164
         3         10            2501017           1500164
         3         11            2501017           1500164
         3         12            2501017           1500164
         4         13            2502748           1500164
         4         14            2502748           1500164
         4         15            2502748           1500164
         4         16            2502748           1500164
15 rows selected.

至此基本上测试完成在在cdb环境中丢失redo的恢复。在生产中,需要把temp加全,并且建议重建数据库

oradebug poke ORA-32521/ORA-32519故障解决

最近有不少朋友咨询12.1.0.2及其以后的版本使用oradebug去修改scn失败,这里做了一个测试正常情况下确实无法修改(oradebug poke报 ORA-32519 或者 ORA-32521) ,这里进行了一系列修改测试最后修改成功.
数据库版本

SQL> select * from v$version;
BANNER                                                                               CON_ID
-------------------------------------------------------------------------------- ----------
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production              0
PL/SQL Release 12.1.0.2.0 - Production                                                    0
CORE    12.1.0.2.0      Production                                                        0
TNS for 64-bit Windows: Version 12.1.0.2.0 - Production                                   0
NLSRTL Version 12.1.0.2.0 - Production                                                    0

oradebug poke测试

SQL> oradebug setmypid
已处理的语句
SQL> oradebug DUMPvar SGA kcsgscn_
kcslf kcsgscn_ [14C8D6270, 14C8D62A0) = 009EA333 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
 00000000 4C8D5CF0 00000001
SQL> oradebug poke 0x14C8D6274 4 0x00000001
ORA-32521: 对 ORADEBUG 命令  进行语法分析时出错
--或者该提示
SQL> oradebug setmypid
已处理的语句
SQL> oradebug DUMPvar SGA kcsgscn_
kcslf kcsgscn_ [14C8D6270, 14C8D62A0) = 009EAE3D 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
 00000000 4C8D5CF0 00000001
SQL> oradebug poke 0x14C8D6274 4 0x0000000a
ORA-32519: 权限不足, 无法执行 ORADEBUG 命令: execution of ORADEBUG commands is disabled for this instance

通过测试确定oradebug正常情况无法执行poke,不是提示ORA-32521就是提示ORA-32519错误导致scn无法修改.

通过一些修改之后oradebug 修改scn

SQL> select dbid, name,open_mode,
  2         created created,
  3         open_mode, log_mode,
  4         checkpoint_change# as checkpoint_change#,
  5         controlfile_type ctl_type,
  6         controlfile_created ctl_created,
  7         controlfile_change# as ctl_change#,
  8         controlfile_time ctl_time,
  9         resetlogs_change# as resetlogs_change#,
 10         resetlogs_time resetlogs_time
 11  from v$database;
      DBID NAME                                                 OPEN_MODE            CREATED        OPEN_MODE
 LOG_MODE
---------- ---------------------------------------------------- -------------------- -------------- ------------
 ------------
CHECKPOINT_CHANGE# CTL_TYP CTL_CREATED    CTL_CHANGE# CTL_TIME       RESETLOGS_CHANGE# RESETLOGS_TIME
------------------ ------- -------------- ----------- -------------- ----------------- --------------
1504692401 XIFENFEI                                             READ WRITE           16-8月 -15     READ WRITE
 ARCHIVELOG
          10407853 CURRENT 16-8月 -15        10408361 07-7月 -16                     1 16-8月 -15
SQL> select con_id,file#,checkpoint_change# from v$datafile_header;
    CON_ID      FILE# CHECKPOINT_CHANGE#
---------- ---------- ------------------
         1          1           10407853
         2          2            9457324
         1          3           10407853
         2          4            9457324
         1          5           10407853
         1          6           10407853
         3          7           10407853
         3          8           10407853
         3          9           10407853
         4         10            9559964
         4         11            9559964
         4         12            9559964
         3         13           10407853
已选择 13 行。
SQL> shutdown abort;
ORACLE 例程已经关闭。
SQL> startup
ORACLE 例程已经启动。
Total System Global Area 3221225472 bytes
Fixed Size                  3837232 bytes
Variable Size             838861520 bytes
Database Buffers         2365587456 bytes
Redo Buffers               12939264 bytes
数据库装载完毕。
SQL> oradebug setmypid
已处理的语句
SQL> oradebug DUMPvar SGA kcsgscn_
kcslf kcsgscn_ [14BE16270, 14BE162A0) = 009ED5C6 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
 00000000 4BE15CF0 00000001
SQL> oradebug poke 0x14BE16274 4 0x0001
BEFORE: [14BE16274, 14BE16276) = 0000
AFTER: [14BE16274, 14BE16276) = 0001
SQL> alter database open;
数据库已更改。
SQL> select dbid, name,open_mode,
  2         created created,
  3         open_mode, log_mode,
  4         checkpoint_change# as checkpoint_change#,
  5         controlfile_type ctl_type,
  6         controlfile_created ctl_created,
  7         controlfile_change# as ctl_change#,
  8         controlfile_time ctl_time,
  9         resetlogs_change# as resetlogs_change#,
 10         resetlogs_time resetlogs_time
 11  from v$database;
      DBID NAME                                                 OPEN_MODE            CREATED        OPEN_MODE
 LOG_MODE
---------- ---------------------------------------------------- -------------------- -------------- -----------
 ------------
CHECKPOINT_CHANGE# CTL_TYP CTL_CREATED    CTL_CHANGE# CTL_TIME       RESETLOGS_CHANGE# RESETLOGS_TIME
------------------ ------- -------------- ----------- -------------- ----------------- --------------
1504692401 XIFENFEI                                             READ WRITE           16-8月 -15     READ WRITE
 ARCHIVELOG
        4305478053 CURRENT 16-8月 -15      4305478245 07-7月 -16                     1 16-8月 -15
SQL> select con_id,file#,checkpoint_change# from v$datafile_header;
    CON_ID      FILE# CHECKPOINT_CHANGE#
---------- ---------- ------------------
         1          1         4305478053
         2          2            9457324
         1          3         4305478053
         2          4            9457324
         1          5         4305478053
         1          6         4305478053
         3          7         4305478053
         3          8         4305478053
         3          9         4305478053
         4         10            9559964
         4         11            9559964
         4         12            9559964
         3         13         4305478053
已选择 13 行。
SQL> select con_id,file#,checkpoint_change# from v$datafile;
    CON_ID      FILE# CHECKPOINT_CHANGE#
---------- ---------- ------------------
         1          1         4305478053
         2          2            9457324
         1          3         4305478053
         2          4            9457324
         1          5         4305478053
         1          6         4305478053
         3          7         4305478053
         3          8         4305478053
         3          9         4305478053
         4         10            9559964
         4         11            9559964
         4         12            9559964
         3         13         4305478053
已选择 13 行。

通过上述测试证明scn已经被完美修改.证明我们已经具备了不使用bbed的情况下推进12.1.0.2版本的scn问题,为12c的一系列需要推scn的恢复提供完美技术支持.

Resize operation completed for file#

Orale 12c DataGuard环境中备库出现Resize operation completed for file# 现象
数据库版本

[oracle@ray01 ~]$ opatch lspatches
22291127;Database Patch Set Update : 12.1.0.2.160419 (22291127)
OPatch succeeded.
SQL> select * from v$version;
BANNER                                                                               CON_ID
-------------------------------------------------------------------------------- ----------
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production              0
PL/SQL Release 12.1.0.2.0 - Production                                                    0
CORE    12.1.0.2.0      Production                                                        0
TNS for Linux: Version 12.1.0.2.0 - Production                                            0
NLSRTL Version 12.1.0.2.0 - Production                                                    0

alert日志提示

Mon Jun 27 20:56:37 2016
Resize operation completed for file# 18, old size 25600000K, new size 30720000K
Mon Jun 27 20:56:56 2016
Archived Log entry 210 added for thread 1 sequence 286 rlc 915405135 ID 0x2316988c dest 2:
Mon Jun 27 20:57:01 2016
Primary database is in MAXIMUM PERFORMANCE mode
RFS[211]: Assigned to RFS process (PID:22867)
RFS[211]: Selected log 11 for thread 1 sequence 287 dbid 588725388 branch 915405135
Mon Jun 27 20:57:14 2016
Resize operation completed for file# 17, old size 25600000K, new size 30720000K
Mon Jun 27 07:57:15 2016
Archived Log entry 211 added for thread 1 sequence 287 ID 0x2316988c dest 1:
Mon Jun 27 20:57:15 2016
Resize operation completed for file# 3, old size 972800K, new size 983040K
Mon Jun 27 20:57:15 2016
Primary database is in MAXIMUM PERFORMANCE mode
RFS[212]: Assigned to RFS process (PID:22873)
RFS[212]: Selected log 11 for thread 1 sequence 288 dbid 588725388 branch 915405135
Mon Jun 27 20:57:15 2016
Resize operation completed for file# 3, old size 983040K, new size 1024000K
Resize operation completed for file# 3, old size 1024000K, new size 1034240K
Mon Jun 27 20:57:54 2016
Resize operation completed for file# 15, old size 25600000K, new size 30720000K
Mon Jun 27 20:58:15 2016
Resize operation completed for file# 18, old size 30720000K, new size 33554416K
Mon Jun 27 20:58:34 2016
Resize operation completed for file# 17, old size 30720000K, new size 33554416K
Mon Jun 27 20:58:54 2016
Resize operation completed for file# 15, old size 30720000K, new size 33554416K

大量Resize operation completed for file# 操作记录,给人感觉比较烦,根据多年使用oracle的经验,这种现象很可能有隐含参数或者event屏蔽,隐含参数可以猜测到,event需要查询官方资料.

查询汗resize的隐含参数

SQL> col name for a52
SQL> col value for a24
SQL> col description for a50
set linesize 150
select a.ksppinm name,b.ksppstvl value,a.ksppdesc description
  from x$ksppi a,x$ksppcv b
 where a.inst_id = USERENV ('Instance')
   and b.inst_id = USERENV ('Instance')
   and a.indx = b.indx
SQL> SQL>   2    3    4    5    6     and upper(a.ksppinm) LIKE upper('%&param%')
  7  order by name
/
  8  Enter value for param: resize
old   6:    and upper(a.ksppinm) LIKE upper('%&param%')
new   6:    and upper(a.ksppinm) LIKE upper('%resize%')
NAME                                                 VALUE                    DESCRIPTION
---------------------------------------------------- ------------------------ --------------------------------------------------
_asm_skip_resize_check                               FALSE                    skip the checking of the clients for s/w compatibi
                                                                              lity for resize
_bct_public_dba_buffer_dynresize                     2                        allow dynamic resizing of public dba buffers, zero
                                                                               to disable
_disable_file_resize_logging                         FALSE                    disable file resize logging to alert log

从这里可以发现_disable_file_resize_logging参数默认值为false,表示显示文件resize的提示,设置为true应该就可以解决该问题.

DataGuard ora-16157故障解决

由于硬件故障,需要激活备库,由于登录错误服务器对本不该激活的服务器进行了如下操作
STANDBY DATABASE FINISH操作

[oracle@app73 ~]$ ss
SQL*Plus: Release 11.2.0.4.0 Production on Sun Jun 26 23:33:19 2016
Copyright (c) 1982, 2013, Oracle.  All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning and Data Mining options
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
Database altered.
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE FINISH;
Database altered.

对应的alert日志提示

Sun Jun 26 23:33:22 2016
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL
Sun Jun 26 23:33:22 2016
MRP0: Background Media Recovery cancelled with status 16037
Errors in file /Data/oracle/diag/rdbms/commentdbdg/commentdb/trace/commentdb_pr00_11871.trc:
ORA-16037: user requested cancel of managed recovery operation
Managed Standby Recovery not using Real Time Apply
Recovery interrupted!
Recovered data files to a consistent state at change 47440782676
Sun Jun 26 23:33:23 2016
MRP0: Background Media Recovery process shutdown (commentdb)
Managed Standby Recovery Canceled (commentdb)
Completed: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL
Sun Jun 26 23:33:44 2016
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE FINISH
Attempt to do a Terminal Recovery (commentdb)
Media Recovery Start: Managed Standby Recovery (commentdb)
 started logmerger process
Sun Jun 26 23:33:44 2016
Managed Standby Recovery not using Real Time Apply
Parallel Media Recovery started with 4 slaves
Media Recovery Waiting for thread 1 sequence 21196 (in transit)
Killing 4 processes with pids 11881,11867,11869,14314 (all RFS, wait for I/O)
in order to disallow current and future RFS connections. Requested by OS process 7372
Begin: Standby Redo Logfile archival
End: Standby Redo Logfile archival
Terminal Recovery timestamp is '06/26/2016 23:33:48'
Terminal Recovery: applying standby redo logs.
Terminal Recovery: thread 1 seq# 21196 redo required
Terminal Recovery:
Recovery of Online Redo Log: Thread 1 Group 10 Seq 21196 Reading mem 0
  Mem# 0: /Data/oracle/oradata/commentdb/std_redo10.log
Identified End-Of-Redo (failover) for thread 1 sequence 21196 at SCN 0xffff.ffffffff
Incomplete Recovery applied until change 47440782709 time 06/26/2016 23:29:51
Media Recovery Complete (commentdb)
Terminal Recovery: Enabled archive destination LOG_ARCHIVE_DEST_2
Terminal Recovery: successful completion
Forcing ARSCN to IRSCN for TR 11:196142453
Attempt to set limbo arscn 11:196142453 irscn 11:196142453
Resetting standby activation ID 3880004483 (0xe7442b83)
Sun Jun 26 23:33:48 2016
ARCH: Archival stopped, error occurred. Will continue retrying
ORACLE Instance commentdb - Archival Error
ORA-16014: log 10 sequence# 21196 not archived, no available destinations
ORA-00312: online log 10 thread 1: '/Data/oracle/oradata/commentdb/std_redo10.log'
Completed: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE FINISH
Sun Jun 26 23:39:24 2016
RFS[4]: Assigned to RFS process 7392
RFS[4]: No connections allowed during/after terminal recovery.

很明显数据库已经挺尸mrp并且成功执行了ALTER DATABASE RECOVER MANAGED STANDBY DATABASE FINISH

尝试重新启动mrp应用日志

SQL> shutdown immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup mount;
ORACLE instance started.
Total System Global Area 2.1379E+10 bytes
Fixed Size                  2262656 bytes
Variable Size            2684356992 bytes
Database Buffers         1.8656E+10 bytes
Redo Buffers               36073472 bytes
Database mounted.
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT  LOGFILE DISCONNECT FROM SESSION;
Database altered.

alert日志提示

Sun Jun 26 23:40:39 2016
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT  LOGFILE DISCONNECT FROM SESSION
Attempt to start background Managed Standby Recovery process (commentdb)
Sun Jun 26 23:40:39 2016
MRP0 started with pid=24, OS id=7458
MRP0: Background Managed Standby Recovery process started (commentdb)
 started logmerger process
Sun Jun 26 23:40:44 2016
Managed Standby Recovery starting Real Time Apply
MRP0: Background Media Recovery terminated with error 16157
Errors in file /Data/oracle/diag/rdbms/commentdbdg/commentdb/trace/commentdb_pr00_7460.trc:
ORA-16157: media recovery not allowed following successful FINISH recovery
Managed Standby Recovery not using Real Time Apply
Completed: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT  LOGFILE DISCONNECT FROM SESSION
Recovery Slave PR00 previously exited with exception 16157
MRP0: Background Media Recovery process shutdown (commentdb)
[oracle@app73 trace]$ oerr ora 16157
16157, 00000, "media recovery not allowed following successful FINISH recovery"
// *Cause:  A RECOVER MANAGED STANDBY DATABASE FINISH command has previously
//          completed successfully.  Another media recovery is not allowed.
// *Action: Issue one of these operations following a FINISH recocvery:
//          ALTER DATABASE OPEN READ ONLY or
//          ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY.

前台虽然提示mrp启动成功,但是alert日志提示启动mrp进程失败,原因是由于media recovery not allowed following successful FINISH recovery,也就是说由于我开始已经执行了ALTER DATABASE RECOVER MANAGED STANDBY DATABASE FINISH导致无法继续启动mrp进程,也就无法继续应用日志。

查看mos看看是否有解决方案
ora-16157


根据官方的说法只能重建dg,实在不甘心,个人感觉ALTER DATABASE RECOVER MANAGED STANDBY DATABASE FINISH执行完成,但是我没有failover命令执行,应该数据文件没有改变,只是控制文件发生了改变。

解决ora-16157问题
从主库重新生成standby controlfile并且传输到备库,再次尝试启动mrp

--主库操作
SQL> ALTER DATABASE CREATE standby CONTROLFILE AS '/tmp/controlfs01.ctl';
Database altered.
--scp到备库
SQL> shutdown immediate;
ORA-01109: database not open
Database dismounted.
ORACLE instance shut down.
SQL> startup nomount;
ORACLE instance started.
Total System Global Area 2.1379E+10 bytes
Fixed Size                  2262656 bytes
Variable Size            2684356992 bytes
Database Buffers         1.8656E+10 bytes
Redo Buffers               36073472 bytes
[oracle@app73 ~]$ cp /tmp/controlfs01.ctl /Data/oracle/oradata/commentdb/control01.ctl
[oracle@app73 ~]$ cp /tmp/controlfs01.ctl /Data/oracle/fast_recovery_area/commentdb/control02.ctl
SQL> alter database mount;
Database altered.
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT  LOGFILE DISCONNECT FROM SESSION;
Database altered.

观察alert日志

ARC2: Becoming the active heartbeat ARCH
Completed: alter database mount
ARC3: Archival started
ARC0: STARTING ARCH PROCESSES COMPLETE
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT  LOGFILE DISCONNECT FROM SESSION
Attempt to start background Managed Standby Recovery process (commentdb)
Sun Jun 26 23:46:50 2016
MRP0 started with pid=24, OS id=7547
MRP0: Background Managed Standby Recovery process started (commentdb)
Sun Jun 26 23:46:54 2016
RFS[1]: Assigned to RFS process 7553
RFS[1]: Selected log 10 for thread 1 sequence 21197 dbid -414945661 branch 893285763
Sun Jun 26 23:46:54 2016
Primary database is in MAXIMUM PERFORMANCE mode
RFS[2]: Assigned to RFS process 7555
RFS[2]: Selected log 11 for thread 1 sequence 21198 dbid -414945661 branch 893285763
Archived Log entry 1 added for thread 1 sequence 21197 ID 0xe7442b83 dest 1:
 started logmerger process
Sun Jun 26 23:46:55 2016
Managed Standby Recovery starting Real Time Apply
Parallel Media Recovery started with 4 slaves
Waiting for all non-current ORLs to be archived...
All non-current ORLs have been archived.
Sun Jun 26 23:46:56 2016
Completed: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT  LOGFILE DISCONNECT FROM SESSION
Sun Jun 26 23:47:06 2016
Media Recovery Waiting for thread 1 sequence 21196
Fetching gap sequence in thread 1, gap sequence 21196-21196
Sun Jun 26 23:47:06 2016
RFS[3]: Assigned to RFS process 7567
RFS[3]: Opened log for thread 1 sequence 21196 dbid -414945661 branch 893285763
Archived Log entry 2 added for thread 1 sequence 21196 rlc 893285763 ID 0xe7442b83 dest 2:
Media Recovery Log /Data/oracle/fast_recovery_area/COMMENTDBDG/archivelog/2016_06_26/o1_mf_1_21196_cpzy7tjc_.arc
Media Recovery Log /Data/oracle/fast_recovery_area/COMMENTDBDG/archivelog/2016_06_26/o1_mf_1_21197_cpzy7gtl_.arc
Media Recovery Waiting for thread 1 sequence 21198 (in transit)
Recovery of Online Redo Log: Thread 1 Group 11 Seq 21198 Reading mem 0
  Mem# 0: /Data/oracle/oradata/commentdb/std_redo11.log

到这里已经证明,通过重建standby controlfile实现了即使执行了ALTER DATABASE RECOVER MANAGED STANDBY DATABASE FINISH(ORA-16157)也可以通过不重建dg,让其恢复正常(恢复dg状态).

pvid=yes导致asm无法mount

今天凌晨接到客户恢复请求,对于aix rac,两个ibm存储做mirror的环境中,客户做存储容灾演练,发现磁盘的名称发生改变,然后对其中一个磁盘设置pvid,结果悲剧了导致asm一个磁盘组无法正常起来。然后又aix端删除这些设备,然后重新扫描设备。结果不是一个磁盘组不能mount,而是整个gi就无法正常启动。希望我们给予技术支持。
查看asm 日志,确定asm disk信息
asm-disk1
asm-disk2


从这里可以确定,一共有两个asm diskgroup,每个group有两个磁盘,hdisk2和hdisk3 为hisdata,hdisk4,和hdisk5为emrdata.

使用kfed分析磁盘头

dd if=/dev/rhdisk2 of=/tmp/xifenfei/rhdisk2.dd bs=1024k count=10
dd if=/dev/rhdisk3 of=/tmp/xifenfei/rhdisk3.dd bs=1024k count=10
dd if=/dev/rhdisk4 of=/tmp/xifenfei/rhdisk4.dd bs=1024k count=10
dd if=/dev/rhdisk5 of=/tmp/xifenfei/rhdisk5.dd bs=1024k count=10
--传输到我电脑上分析
C:\Users\FAL>kfed read H:\temp\xifenfei\tmp\xifenfei\rhdisk2.dd|grep name
kfdhdb.dskname:            HISDATA_0000 ; 0x028: length=12
kfdhdb.grpname:                 HISDATA ; 0x048: length=7
kfdhdb.fgname:             HISDATA_0000 ; 0x068: length=12
kfdhdb.capname:                         ; 0x088: length=0
C:\Users\FAL>kfed read H:\temp\xifenfei\tmp\xifenfei\rhdisk3.dd|grep name
kfdhdb.dskname:            HISDATA_0001 ; 0x028: length=12
kfdhdb.grpname:                 HISDATA ; 0x048: length=7
kfdhdb.fgname:             HISDATA_0001 ; 0x068: length=12
kfdhdb.capname:                         ; 0x088: length=0
C:\Users\FAL>kfed read H:\temp\xifenfei\tmp\xifenfei\rhdisk4.dd|grep name
kfdhdb.dskname:            EMRDATA_0000 ; 0x028: length=12
kfdhdb.grpname:                 EMRDATA ; 0x048: length=7
kfdhdb.fgname:             EMRDATA_0000 ; 0x068: length=12
kfdhdb.capname:                         ; 0x088: length=0
C:\Users\FAL>kfed read H:\temp\xifenfei\tmp\xifenfei\rhdisk5.dd|grep name
C:\Users\FAL>kfed read H:\temp\xifenfei\tmp\xifenfei\rhdisk5.dd
kfbh.endian:                        201 ; 0x000: 0xc9
kfbh.hard:                          194 ; 0x001: 0xc2
kfbh.type:                          212 ; 0x002: *** Unknown Enum ***
kfbh.datfmt:                        193 ; 0x003: 0xc1
kfbh.block.blk:                       0 ; 0x004: blk=0
kfbh.block.obj:                       0 ; 0x008: file=0
kfbh.check:                           0 ; 0x00c: 0x00000000
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
000000000 C1D4C2C9 00000000 00000000 00000000  [................]
000000010 00000000 00000000 00000000 00000000  [................]
  Repeat 254 times
KFED-00322: Invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][212]
C:\Users\FAL>kfed read H:\temp\xifenfei\tmp\xifenfei\rhdisk5.dd blkn=2|grep kfbh
kfbh.endian:                          0 ; 0x000: 0x00
kfbh.hard:                          130 ; 0x001: 0x82
kfbh.type:                            3 ; 0x002: KFBTYP_ALLOCTBL
kfbh.datfmt:                          2 ; 0x003: 0x02
kfbh.block.blk:                33554432 ; 0x004: blk=33554432
kfbh.block.obj:                16777344 ; 0x008: file=128
kfbh.check:                  2654889601 ; 0x00c: 0x9e3e6681
kfbh.fcn.base:               1696071680 ; 0x010: 0x65180000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
C:\Users\FAL>kfed read H:\temp\xifenfei\tmp\xifenfei\rhdisk5.dd blkn=510|grep name
kfdhdb.dskname:            EMRDATA_0001 ; 0x028: length=12
kfdhdb.grpname:                 EMRDATA ; 0x048: length=7
kfdhdb.fgname:             EMRDATA_0001 ; 0x068: length=12
kfdhdb.capname:                         ; 0x088: length=0

通过上述分析,基本上确定由于对hdisk5设置了pvid导致该asm disk的磁盘头损坏.这个可以直接使用asm repair功能修复(注意要clear pvid)

C:\Users\FAL>kfed read H:\temp\xifenfei\tmp\xifenfei\rhdisk5.dd |grep name
kfdhdb.dskname:            EMRDATA_0001 ; 0x028: length=12
kfdhdb.grpname:                 EMRDATA ; 0x048: length=7
kfdhdb.fgname:             EMRDATA_0001 ; 0x068: length=12
kfdhdb.capname:                         ; 0x088: length=0

启动crs到cssd进程报错分析
1. 由于删除磁盘,扫描设备导致hdisk[2-5] 权限和用户组不对
2. 由于删除,扫描磁盘导致磁盘共享模式不对
修复磁盘头和解决这两个问题之后,gi启动正常,磁盘组也正常mount,数据库也正常启动,数据0丢失,至此完美恢复
oracle-open


类似客户恢复案例:asm disk误设置pvid导致asm diskgroup无法mount恢复
如果您遇到此类情况,无法解决请联系我们,提供专业ORACLE数据库恢复技术支持
Phone:17813235971    Q Q:107644445QQ咨询惜分飞    E-Mail:dba@xifenfei.com

使用losetup实现linux普通文件做asm disk

上一篇文章《使用_asm_allow_only_raw_disks实现普通文件做asm disk》中已经介绍使用_asm_allow_only_raw_disks参数使得oracle asm可以使用文件作为asm disk,这篇文章介绍在linux中还可以通过losetup来实现文件系统模拟磁盘实现使用文件系统做asm disk的效果
通过dd构造文件

[oracle@xifenfei ~]$ mkdir /u01/oracle/oradata/asmdisk
[oracle@xifenfei ~]$ dd if=/dev/zero of=/u01/oracle/oradata/asmdisk/xifenfei01.dd bs=10240k count=100
100+0 records in
100+0 records out
1048576000 bytes (1.0 GB) copied, 21.9158 seconds, 47.8 MB/s
[oracle@xifenfei ~]$ dd if=/dev/zero of=/u01/oracle/oradata/asmdisk/xifenfei02.dd bs=10240k count=100
100+0 records in
100+0 records out
1048576000 bytes (1.0 GB) copied, 22.392 seconds, 46.8 MB/s
[oracle@xifenfei ~]$ ls -lh /u01/oracle/oradata/asmdisk/
total 3.0G
-rw-r--r-- 1 oracle oinstall 1000M Feb 27 22:58 xifenfei01.dd
-rw-r--r-- 1 oracle oinstall 1000M Feb 27 23:00 xifenfei02.dd

使用losetup模拟磁盘

[root@xifenfei asmdisk]# ls -l /dev/lo
log    loop0  loop1  loop2  loop3  loop4  loop5  loop6  loop7
[root@xifenfei asmdisk]# losetup /dev/loop1 xifenfei01.dd
[root@xifenfei asmdisk]# losetup /dev/loop2 xifenfei02.dd

使用raw实现磁盘转换为裸设备

[root@xifenfei asmdisk]# raw  /dev/raw/raw10 /dev/loop1
/dev/raw/raw10: bound to major 7, minor 1
[root@xifenfei asmdisk]# raw  /dev/raw/raw11 /dev/loop2
/dev/raw/raw11: bound to major 7, minor 2
[root@xifenfei asmdisk]# ls -l /dev/raw/raw1[0-1]
crw------- 1 root root 162, 10 Feb 27 23:16 /dev/raw/raw10
crw------- 1 root root 162, 11 Feb 27 23:16 /dev/raw/raw11
[root@xifenfei asmdisk]# chown oracle.dba /dev/raw/raw1[0-1]
[root@xifenfei asmdisk]# ls -l /dev/raw/raw1[0-1]
crw------- 1 oracle dba 162, 10 Feb 27 23:16 /dev/raw/raw10
crw------- 1 oracle dba 162, 11 Feb 27 23:16 /dev/raw/raw11

创建磁盘组

[oracle@xifenfei ~]$ export ORACLE_SID=+ASM
[oracle@xifenfei ~]$ sqlplus / as sysdba
SQL*Plus: Release 10.2.0.4.0 - Production on Thu Feb 27 23:19:28 2014
Copyright (c) 1982, 2007, Oracle.  All Rights Reserved.
Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
SQL>  create diskgroup xff external redundancy disk '/dev/raw/raw10','/dev/raw/raw11';
Diskgroup created.
SQL> select group_number,name from v$asm_diskgroup;
GROUP_NUMBER NAME
------------ ------------------------------------------------------------
           1 DATA
           2 XFF
SQL> select path,TOTAL_MB from v$asm_disk where group_number=2;
PATH                   TOTAL_MB
-------------------- ----------
/dev/raw/raw11             1000
/dev/raw/raw10             1000

通过kfed验证asm disk是数据文件

[oracle@xifenfei tmp]$ kfed read /dev/raw/raw10|grep XFF
kfdhdb.dskname:                XFF_0000 ; 0x028: length=8
kfdhdb.grpname:                     XFF ; 0x048: length=3
kfdhdb.fgname:                 XFF_0000 ; 0x068: length=8
[oracle@xifenfei tmp]$ kfed read /dev/raw/raw11|grep XFF
kfdhdb.dskname:                XFF_0001 ; 0x028: length=8
kfdhdb.grpname:                     XFF ; 0x048: length=3
kfdhdb.fgname:                 XFF_0001 ; 0x068: length=8
[oracle@xifenfei tmp]$ kfed read /u01/oracle/oradata/asmdisk/xifenfei01.dd |grep XFF
kfdhdb.dskname:                XFF_0000 ; 0x028: length=8
kfdhdb.grpname:                     XFF ; 0x048: length=3
kfdhdb.fgname:                 XFF_0000 ; 0x068: length=8
[oracle@xifenfei tmp]$ kfed read /u01/oracle/oradata/asmdisk/xifenfei02.dd |grep XFF
kfdhdb.dskname:                XFF_0001 ; 0x028: length=8
kfdhdb.grpname:                     XFF ; 0x048: length=3
kfdhdb.fgname:                 XFF_0001 ; 0x068: length=8

通过kfed命令,确定asm本质是用了dd出来的数据文件做asm disk.

使用_asm_allow_only_raw_disks实现普通文件做asm disk

不少人可能都有一个烦恼,我们平时使用的是win系统,但是想测试下asm,很可能还要到linux的虚拟机里面,这样太麻烦了。如果能够使用win文件系统里面的文件直接做asm disk,直接给asm 使用那就省事了。这篇文章就描述10g中如何实现win 文件系统中的文件做asm disk.注意:如果11g需要安装grid
准备两个文件

h:\ASMDISK>dd if=/dev/zero of=asmdisk1.dd bs=10240k count=200
rawwrite dd for windows version 0.6beta3.
Written by John Newbigin <jn@it.swin.edu.au>
This program is covered by terms of the GPL Version 2.
200+0 records in
200+0 records out
h:\ASMDISK>dd if=/dev/zero of=asmdisk2.dd bs=10240k count=200
rawwrite dd for windows version 0.6beta3.
Written by John Newbigin <jn@it.swin.edu.au>
This program is covered by terms of the GPL Version 2.
200+0 records in
200+0 records out
h:\ASMDISK>dir
 驱动器 H 中的卷没有标签。
 卷的序列号是 360E-41A8
 h:\ASMDISK 的目录
2016-06-17  22:02    <DIR>          .
2016-06-17  22:02    <DIR>          ..
2016-06-17  22:30     2,097,152,000 asmdisk1.dd
2016-06-17  22:30     2,097,152,000 asmdisk2.dd
               2 个文件  4,194,304,000 字节
               2 个目录 574,891,098,112 可用字节

这里使用dd来构造文件,当然你也可以通过asmtools来实现

增加css服务

C:\Windows\system32>d:/app\product\10.2.0\db_1\bin\localconfig add
Step 1:  creating new OCR repository
Successfully accumulated necessary OCR keys.
Creating OCR keys for user 'fal', privgrp ''..
Operation successful.
Step 2:  creating new CSS service
successfully created local CSS service
successfully added CSS to home

使用文件创建asm 磁盘组
dbca选择自动存储管理,并且到选择磁盘步骤终止,因为没有磁盘分区给asm使用

h:\ASMDISK>set ORACLE_SID=+ASM
D:\app\product\10.2.0\db_1\bin>SQLPLUS / AS SYSDBA
SQL*Plus: Release 10.2.0.3.0 - Production on 星期五 6月 17 22:08:57 2016
Copyright (c) 1982, 2006, Oracle.  All Rights Reserved.
连接到:
Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - Production
With the Partitioning, OLAP and Data Mining options
SQL> alter system set "_asm_allow_only_raw_disks"=false
  2  ;
alter system set "_asm_allow_only_raw_disks"=false
                 *
第 1 行出现错误:
ORA-02095: ????????????
SQL> alter system set "_asm_allow_only_raw_disks"=false scope=spfile;
系统已更改。
SQL> shutdown immediate;
ORA-15100: ??????????
ASM 实例已关闭
SQL> startup
ASM 实例已启动
Total System Global Area   83886080 bytes
Fixed Size                  1289028 bytes
Variable Size              57431228 bytes
ASM Cache                  25165824 bytes
ORA-15110: ??????
SQL> create diskgroup data external redundancy disk 'H:\asmdisk\ASMDISK1.dd','H:\asmdisk\ASMDISK2.dd';
create diskgroup data external redundancy disk 'H:\asmdisk\ASMDISK1.dd','H:\asmdisk\ASMDISK2.dd'
*
第 1 行出现错误:
ORA-15018: ???????
ORA-15031: ???? 'H:\asmdisk\ASMDISK2.dd' ?????????
ORA-15014: ?? 'H:\ASMDISK\ASMDISK2.DD' ??????
ORA-15031: ???? 'H:\asmdisk\ASMDISK1.dd' ?????????
ORA-15014: ?? 'H:\ASMDISK\ASMDISK1.DD' ??????
SQL> show parameter asm;
NAME                                 TYPE
------------------------------------ ----------------------
VALUE
------------------------------
_asm_allow_only_raw_disks            boolean
FALSE
asm_diskgroups                       string
asm_diskstring                       string
asm_power_limit                      integer
1
SQL> alter system set asm_diskstring="H:\asmdisk\*.dd" ;
系统已更改。
SQL> create diskgroup data external redundancy disk 'H:\asmdisk\ASMDISK1.dd','H:\asmdisk\ASMDISK2.dd';
磁盘组已创建。
SQL> select path from v$asm_disk;
PATH
--------------------------------------------------------------------------------
H:\ASMDISK\ASMDISK1.DD
H:\ASMDISK\ASMDISK2.DD

这一步你也可以通过ORADIM创建asm服务,然后创建asm参数文件,然后启动asm,创建磁盘组

dbca创建数据库选择asm
asm-win-file


实现效果(普通file做asm disk)
win-file-asm-disk


同理在linux等平台上也可以使用_asm_allow_only_raw_disks参数实现文件系统做asm disk

oracle asm disk格式化恢复—格式化为ext4文件系统

昨天中午接到一位朋友紧急求救电话,大概场景如下,asm data磁盘组一共把个asm disk,但是使用4个lun实现的(也就是说每个lun使用fdisk进行分区),该主机上还有一个lun是用来存放备份的挂载在/xifenfei目录.客户记得他们的/xifenfei目录是/dev/sdh1这个分区,上次主机重启之后,rac工作正常,但是/xifenfei这个目录对应的/dev/sdh1无法挂载上去(提示无法找到超级块).然后我这位朋友上去也不管三七二一直接执行了mkfs.ext4 /dev/sdh1.结果整个asm 磁盘组异常了.最后他通过fdisk命令一看发现完蛋了,以前的/dev/sdh 已经变为了/dev/sdc,而现在的/dev/sdh是以前的asm disk.也就是说,他把asm 磁盘组中的一个disk进行了格式化为ext4文件系统操作.
ext4文件系统大概结构
通过这里大概可以发现在格式化为ext4文件系统并非把所有的磁盘数据全部重写主要就是覆盖一些ext4必要的一些元数据信息,理论上没有覆盖部分的数据依旧可以恢复
ext4


fdisk 分析现在磁盘情况

[root@db3 ~]#  fdisk -l
Disk /dev/sda: 171.8 GB, 171798691840 bytes
255 heads, 63 sectors/track, 20886 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00055b12
   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1          64      512000   83  Linux
Partition 1 does not end on cylinder boundary.
/dev/sda2              64        8223    65536000   82  Linux swap / Solaris
/dev/sda3            8223       20887   101723136   8e  Linux LVM
Disk /dev/sdb: 751.6 GB, 751619276800 bytes
255 heads, 63 sectors/track, 91379 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00b6888f
   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1               1       91379   734001786    5  Extended
/dev/sdb5               1       15000   120487437   83  Linux
Disk /dev/sdd: 1073 MB, 1073741824 bytes
34 heads, 61 sectors/track, 1011 cylinders
Units = cylinders of 2074 * 512 = 1061888 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00083b68
   Device Boot      Start         End      Blocks   Id  System
Disk /dev/sdf: 526.1 GB, 526133493760 bytes
255 heads, 63 sectors/track, 63965 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x000c10d8
   Device Boot      Start         End      Blocks   Id  System
/dev/sdf1               1       13054   104856223+  83  Linux
/dev/sdf2           13055       63965   408942607+  83  Linux
Disk /dev/sde: 1073 MB, 1073741824 bytes
34 heads, 61 sectors/track, 1011 cylinders
Units = cylinders of 2074 * 512 = 1061888 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x000b120f
   Device Boot      Start         End      Blocks   Id  System
Disk /dev/sdc: 1795.3 GB, 1795296329728 bytes
255 heads, 63 sectors/track, 218265 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x78c377f7
   Device Boot      Start         End      Blocks   Id  System
/dev/sdc1               1      218265  1753213581   83  Linux
Disk /dev/sdg: 526.1 GB, 526133493760 bytes
255 heads, 63 sectors/track, 63965 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x000ee648
   Device Boot      Start         End      Blocks   Id  System
/dev/sdg1               1       13054   104856223+  83  Linux
/dev/sdg2           13055       63965   408942607+  83  Linux
Disk /dev/sdi: 526.1 GB, 526133493760 bytes
255 heads, 63 sectors/track, 63965 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00027e4b
   Device Boot      Start         End      Blocks   Id  System
/dev/sdi1               1       13054   104856223+  83  Linux
/dev/sdi2           13055       63965   408942607+  83  Linux
Disk /dev/sdh: 526.1 GB, 526133493760 bytes
255 heads, 63 sectors/track, 63965 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x000938dc
   Device Boot      Start         End      Blocks   Id  System
/dev/sdh1               1       13054   104856223+  83  Linux
/dev/sdh2           13055       63965   408942607+  83  Linux
Disk /dev/mapper/rootvg-rootfs: 31.7 GB, 31708938240 bytes
255 heads, 63 sectors/track, 3855 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000
Disk /dev/mapper/rootvg-lv01: 32.2 GB, 32212254720 bytes
255 heads, 63 sectors/track, 3916 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

通过asm alert日志确定asm disk信息
这里可以确定asm disk是VOL01-VOL08一共8个盘

SQL> CREATE DISKGROUP DATA EXTERNAL REDUNDANCY  DISK '/dev/oracleasm/disks/VOL01',
'/dev/oracleasm/disks/VOL02',
'/dev/oracleasm/disks/VOL03',
'/dev/oracleasm/disks/VOL04' ATTRIBUTE 'compatible.asm'='11.2.0.0.0','au_size'='1M' /* ASMCA */
SQL> ALTER DISKGROUP DATA ADD  DISK '/dev/oracleasm/disks/VOL05' SIZE 399358M ,
'/dev/oracleasm/disks/VOL06' SIZE 399358M ,
'/dev/oracleasm/disks/VOL07' SIZE 399358M ,
'/dev/oracleasm/disks/VOL08' SIZE 399358M /* ASMCA */
SQL> ALTER DISKGROUP ALL MOUNT /* asm agent call crs *//* {0:0:2} */
NOTE: Diskgroup used for Voting files is:
 	 DATA
Diskgroup with spfile:DATA
Diskgroup used for OCR is:DATA
NOTE: cache registered group DATA number=1 incarn=0x6f3f6eb8
NOTE: cache began mount (not first) of group DATA number=1 incarn=0x6f3f6eb8
NOTE: Assigning number (1,7) to disk (/dev/oracleasm/disks/VOL08)
NOTE: Assigning number (1,6) to disk (/dev/oracleasm/disks/VOL07)
NOTE: Assigning number (1,5) to disk (/dev/oracleasm/disks/VOL06)
NOTE: Assigning number (1,4) to disk (/dev/oracleasm/disks/VOL05)
NOTE: Assigning number (1,3) to disk (/dev/oracleasm/disks/VOL04)
NOTE: Assigning number (1,1) to disk (/dev/oracleasm/disks/VOL02)
NOTE: Assigning number (1,2) to disk (/dev/oracleasm/disks/VOL03)
NOTE: Assigning number (1,0) to disk (/dev/oracleasm/disks/VOL01)
GMON querying group 1 at 2 for pid 23, osid 32026
NOTE: cache opening disk 0 of grp 1: DATA_0000 path:/dev/oracleasm/disks/VOL01
NOTE: cache opening disk 1 of grp 1: DATA_0001 path:/dev/oracleasm/disks/VOL02
NOTE: cache opening disk 2 of grp 1: DATA_0002 path:/dev/oracleasm/disks/VOL03
NOTE: cache opening disk 3 of grp 1: DATA_0003 path:/dev/oracleasm/disks/VOL04
NOTE: cache opening disk 4 of grp 1: DATA_0004 path:/dev/oracleasm/disks/VOL05
NOTE: cache opening disk 5 of grp 1: DATA_0005 path:/dev/oracleasm/disks/VOL06
NOTE: cache opening disk 6 of grp 1: DATA_0006 path:/dev/oracleasm/disks/VOL07
NOTE: cache opening disk 7 of grp 1: DATA_0007 path:/dev/oracleasm/disks/VOL08
NOTE: cache mounting (not first) external redundancy group 1/0x6F3F6EB8 (DATA)

查询asm对应的disk信息
这部分信息,可以确定VOL03被格式化,而且应该就是朋友反馈的被他格式化成ext4的/dev/sdh1
asm1


[root@db3 disks]# ls
VOL01  VOL02  VOL04  VOL05  VOL06  VOL07  VOL08
[root@db3 disks]# oracleasm querydisk -p 'VOL01'
Disk "VOL01" is a valid ASM disk
/dev/sdf1: LABEL="VOL01" TYPE="oracleasm"
[root@db3 disks]# oracleasm querydisk -p 'VOL02'
Disk "VOL02" is a valid ASM disk
/dev/sdg1: LABEL="VOL02" TYPE="oracleasm"
[root@db3 disks]# oracleasm querydisk -p 'VOL03'
Disk "VOL03" does not exist or is not instantiated
[root@db3 disks]# oracleasm querydisk -p 'VOL04'
Disk "VOL04" is a valid ASM disk
/dev/sdi1: LABEL="VOL04" TYPE="oracleasm"
[root@db3 disks]# oracleasm querydisk -p 'VOL05'
Disk "VOL05" is a valid ASM disk
/dev/sdf2: LABEL="VOL05" TYPE="oracleasm"
[root@db3 disks]# oracleasm querydisk -p 'VOL06'
Disk "VOL06" is a valid ASM disk
/dev/sdg2: LABEL="VOL06" TYPE="oracleasm"
[root@db3 disks]# oracleasm querydisk -p 'VOL07'
Disk "VOL07" is a valid ASM disk
/dev/sdh2: LABEL="VOL07" TYPE="oracleasm"
[root@db3 disks]# oracleasm querydisk -p 'VOL08'
Disk "VOL08" is a valid ASM disk
/dev/sdi2: LABEL="VOL08" TYPE="oracleasm"

通过kfed分析被格式化成ext4的磁盘信息

[grid@db3 ~]$ kfed read /dev/sdh1
kfbh.endian:                          0 ; 0x000: 0x00
kfbh.hard:                            0 ; 0x001: 0x00
kfbh.type:                            0 ; 0x002: KFBTYP_INVALID
kfbh.datfmt:                          0 ; 0x003: 0x00
kfbh.block.blk:                       0 ; 0x004: blk=0
kfbh.block.obj:                       0 ; 0x008: file=0
kfbh.check:                   810307429 ; 0x00c: 0x304c4f65
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
000000000 00000000 00000000 00000000 304C4F65  [............eOL0]
000000010 00000000 00000000 00000000 00000000  [................]
000000020 4C43524F 44524C43 00000000 00000000  [ORCLCLRD........]
000000030 00000000 00000000 00000000 00000000  [................]
  Repeat 252 times
KFED-00322: Invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][0]
[grid@db3 ~]$ kfed read /dev/sdh1 aun=1
kfbh.endian:                          0 ; 0x000: 0x00
kfbh.hard:                          129 ; 0x001: 0x81
kfbh.type:                            0 ; 0x002: KFBTYP_INVALID
kfbh.datfmt:                          0 ; 0x003: 0x00
kfbh.block.blk:                   98560 ; 0x004: blk=98560
kfbh.block.obj:                  164096 ; 0x008: file=164096
kfbh.check:                      229632 ; 0x00c: 0x00038100
kfbh.fcn.base:                   295168 ; 0x010: 0x00048100
kfbh.fcn.wrap:                   819456 ; 0x014: 0x000c8100
kfbh.spare1:                     884992 ; 0x018: 0x000d8100
kfbh.spare2:                    1605888 ; 0x01c: 0x00188100
000100000 00008100 00018100 00028100 00038100  [................]
000100010 00048100 000C8100 000D8100 00188100  [................]
000100020 00288100 003E8100 00798100 00AB8100  [..(...>...y.....]
000100030 01388100 016C8100 00000000 00000000  [..8...l.........]
000100040 00000000 00000000 00000000 00000000  [................]
  Repeat 251 times
KFED-00322: Invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][0]
[grid@db3 ~]$ kfed read /dev/sdh1 aun=1 blkn=254
kfbh.endian:                        254 ; 0x000: 0xfe
kfbh.hard:                          129 ; 0x001: 0x81
kfbh.type:                            0 ; 0x002: KFBTYP_INVALID
kfbh.datfmt:                          0 ; 0x003: 0x00
kfbh.block.blk:                   98814 ; 0x004: blk=98814
kfbh.block.obj:                  164350 ; 0x008: file=164350
kfbh.check:                      229886 ; 0x00c: 0x000381fe
kfbh.fcn.base:                   295422 ; 0x010: 0x000481fe
kfbh.fcn.wrap:                   819710 ; 0x014: 0x000c81fe
kfbh.spare1:                     885246 ; 0x018: 0x000d81fe
kfbh.spare2:                    1606142 ; 0x01c: 0x001881fe
0001FE000 000081FE 000181FE 000281FE 000381FE  [................]
0001FE010 000481FE 000C81FE 000D81FE 001881FE  [................]
0001FE020 002881FE 003E81FE 007981FE 00AB81FE  [..(...>...y.....]
0001FE030 013881FE 016C81FE 00000000 00000000  [..8...l.........]
0001FE040 00000000 00000000 00000000 00000000  [................]
  Repeat 251 times
KFED-00322: Invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][0]
[grid@db3 ~]$ kfed read /dev/sdh1 aun=9
kfbh.endian:                          0 ; 0x000: 0x00
kfbh.hard:                            0 ; 0x001: 0x00
kfbh.type:                            0 ; 0x002: KFBTYP_INVALID
kfbh.datfmt:                          0 ; 0x003: 0x00
kfbh.block.blk:                       0 ; 0x004: blk=0
kfbh.block.obj:                       0 ; 0x008: file=0
kfbh.check:                           0 ; 0x00c: 0x00000000
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
000900000 00000000 00000000 00000000 00000000  [................]
  Repeat 255 times
KFED-00322: Invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][0]

从这里可以看出来该asm的元数据信息大量被破坏(asm 备份的磁盘头也损坏),直接修复的可能性基本上为0,现在为了最大限度的恢复数据,通过底层扫描asm au结合数据库block信息,进行重组出来数据文件(这次的恢复有点难度,因为该asm磁盘组中有两个数据库,需要通过结合asm file+datafile来区分文件)
恢复出来两套库的数据文件信息

/orabak/xifenfei/oradb/1.dbf
/orabak/xifenfei/oradb/2.dbf
/orabak/xifenfei/oradb/3.dbf
/orabak/xifenfei/oradb/4.dbf
/orabak/xifenfei/oradb/5.dbf
/orabak/xifenfei/oradb/6.dbf
/orabak/xifenfei/oradb/7.dbf
/orabak/xifenfei/oradb/8.dbf
/orabak/xifenfei/oradb/9.dbf
/orabak/xifenfei/oradb/10.dbf
/orabak/xifenfei/oradb/11.dbf
/orabak/xifenfei/oradb/12.dbf
/orabak/xifenfei/oradb/13.dbf
/orabak/xifenfei/oradb/14.dbf
/orabak/xifenfei/oradb/15.dbf
/orabak/xifenfei/oradb/16.dbf
/orabak/xifenfei/oradb/17.dbf
/orabak/xifenfei/oradb/18.dbf
/orabak/xifenfei/oradb/19.dbf
/orabak/xifenfei/xgdb/1.dbf
/orabak/xifenfei/xgdb/2.dbf
/orabak/xifenfei/xgdb/3.dbf
/orabak/xifenfei/xgdb/4.dbf
/orabak/xifenfei/xgdb/5.dbf
/orabak/xifenfei/xgdb/6.dbf
/orabak/xifenfei/xgdb/7.dbf
/orabak/xifenfei/xgdb/8.dbf
/orabak/xifenfei/xgdb/9.dbf
/orabak/xifenfei/xgdb/10.dbf
/orabak/xifenfei/xgdb/11.dbf
/orabak/xifenfei/xgdb/12.dbf
/orabak/xifenfei/xgdb/13.dbf
/orabak/xifenfei/xgdb/14.dbf
/orabak/xifenfei/xgdb/15.dbf
/orabak/xifenfei/xgdb/16.dbf

然后使用工具拷贝出来redo信息,最后通过重建控制文件.其中一套库完美直接open,另外一套库system中的c_obj$被覆盖,不过使用一个多月以前的备份的system文件强制打开库成功,数据基本上完美导出,实现完美恢复.由于在格式化为ext4的时候,会在磁盘中部分位置写入一些
数据文件恢复参考:asm disk header 彻底损坏恢复
另外有一次win平台类似恢复经历:asm disk格式化为ntfs恢复
如果您遇到此类情况,无法解决请联系我们,提供专业ORACLE数据库恢复技术支持
Phone:17813235971    Q Q:107644445QQ咨询惜分飞    E-Mail:dba@xifenfei.com