To find the TX Enqueue contention in a RAC or OPS environment

I came across this article today while looking into TX enqueue and am sharing it here.

PURPOSE
-------------
To find the TX Enqueue contention in a RAC or OPS environment
What is TX Enqueue ?
In short, Oracle maintains a queue (an enqueue) for each transaction.
How Many Resources ?
1/ active transaction
How Many Locks?
1/transaction + 1/process waiting for a locked row by that
transaction.
How Many Users?
1 + 1/ process waiting for something locked by this transaction.
Who Uses?
All processes
What needs to be investigated?
The mode of TX (6/4), Holding/Waiting/Requesting
SCOPE & APPLICATION
=====================
This document will help to analyze the application design related to transaction bottlenecks
and database performance tuning.
Let's start with an example:
===================
create table akdas (A1 number, Col1 Varchar2(10), Col2 Varchar2(10));
insert into akdas values(5,'Hello','Hi');
insert into akdas values(6,'Sudip','Datta');
insert into akdas values(7,'Preetam','Roy');
insert into akdas values(8,'Michael','Polaski');
From Node 1:
==========
update akdas set a1=11 where a1=6;
From Node 2:
==========
update akdas set a1=12 where a1=7;
update akdas set a1=11 where a1=6;  /* this will wait for Node1: to complete the transaction */
This note analyzes only TX mode 6 (exclusive).
1. Now run the following query to track down the problem: Who is waiting
===================================================================
prompt
prompt Query 1. Waiting for TX Enqueue where mode is Exclusive
prompt =====================================
prompt
set linesize 100
set pagesize 66
col c1 for a15
col c1 heading "Program Name "
select l.inst_id,l.SID,program c1,l.TYPE,l.ID1,l.ID2,l.LMODE,l.REQUEST
from gv$lock l,gv$session s
where l.type like 'TX' and l.REQUEST =6
and l.inst_id=s.inst_id and l.sid=s.sid
order by id1
/
Output will be here
===============
   INST_ID      SID     Program Name       TY     ID1     ID2       LMODE      REQUEST
-----------  ---------- ------------------ ---   -------- --------  ---------- --------
         2           13  sqlplus@opcbsol   TX     393236  780       0          6
                         2 (TNS V1-V3)
It is clear that SID 13 on instance 2 is performing DML and waiting with REQUEST mode 6.
2. Let's run the next query to find who is holding
===========================================
prompt
prompt
prompt Query 2. Holding TX Enqueue where mode is Exclusive (6)
prompt =======================================
prompt
set linesize 100
set pagesize 66
col c1 for a15
col c1 heading "Program Name "
select l.inst_id,l.SID,program c1,l.TYPE,l.ID1,l.ID2,l.LMODE,l.REQUEST
from gv$lock l,gv$session s
where l.type like 'TX' and l.LMODE =6 and (l.ID1,l.ID2) in
(select id1,id2 from gv$lock where type like 'TX' and REQUEST =6)
and l.inst_id=s.inst_id and l.sid=s.sid
order by id1
/
Output will be here
===============
   INST_ID      SID     Program Name      TY        ID1        ID2      LMODE    REQUEST
   ----------  ---------- -------------- ---   ---------- --------   ----------- --------
         1          12    sqlplus@opcbsol TX     393236        780      6          0
                          1 (TNS V1-V3)
So the holder is SID 12 on instance 1, with LMODE = 6.
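The matching done by Queries 1 and 2 can be sketched outside the database. The Python below is only an illustration: the tuples copy the sample values from the two outputs above, and the function names are hypothetical. It also assumes the commonly documented layout of a TX resource, where ID1 is the undo segment number times 65536 plus the slot, and ID2 is the sequence (wrap) number.

```python
# Minimal model of Queries 1 and 2: pair each TX waiter with its holder.
# Each tuple mirrors gv$lock columns: (inst_id, sid, id1, id2, lmode, request).
locks = [
    (2, 13, 393236, 780, 0, 6),  # waiter: REQUEST = 6
    (1, 12, 393236, 780, 6, 0),  # holder: LMODE = 6
]

def decode_tx(id1, id2):
    """Split a TX resource id into (undo segment, slot, sequence).

    Assumes ID1 = usn * 65536 + slot and ID2 = sequence.
    """
    return id1 >> 16, id1 & 0xFFFF, id2

def pair_waiters(rows):
    """Return ((waiter inst, sid), (holder inst, sid)) for each exclusive waiter."""
    holders = {(r[2], r[3]): (r[0], r[1]) for r in rows if r[4] == 6}
    return [((r[0], r[1]), holders[(r[2], r[3])])
            for r in rows if r[5] == 6 and (r[2], r[3]) in holders]

pairs = pair_waiters(locks)
print(pairs)                    # waiter and holder sessions
print(decode_tx(393236, 780))   # transaction id components
```

Under that assumption the waited-on transaction is undo segment 6, slot 20, sequence 780, which is the same triple that v$transaction exposes as XIDUSN, XIDSLOT and XIDSQN.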
3. Let's find out the exact file#, block# and Record# where it is waiting
===============================================================
prompt
prompt
prompt Query 3. Object# ,File#, Block# and Slot# TX Enqueue in detail
prompt ========================================
prompt
set linesize 110
col c0 for 999
col c0 heading "INS"
col c1 for a15
col c1 heading "Program Name "
select inst_id c0,sid,program c1,ROW_WAIT_OBJ# object_no, ROW_WAIT_FILE# Rfile_no,
ROW_WAIT_BLOCK# Block_no ,ROW_WAIT_ROW# Row_no
from gv$session
where (inst_id,sid) in (select inst_id,sid from gv$session_wait where p1='1415053318')
/
Output Will be here
===============
 INS     SID    Program Name     OBJECT_NO RFILE_NO BLOCK_NO  ROW_NO
----- ---------- -------------   ---------------    --------- -------
   2         13     sqlplus@opcbsol  7261      9        12346     1
                      2 (TNS V1-V3)
From the output, it is clear that the session is waiting on relative file# 9, block# 12346, row number 1.
Here row number 1 is the slot number within block 12346. Row numbers start from 0 (zero).
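Query 3 filters on p1='1415053318' without saying where that number comes from. For enqueue waits, P1 packs the two-character lock name into the high bytes and the requested mode into the low bytes, so 1415053318 is hexadecimal 54580006. A short Python sketch of the decoding (the function name is hypothetical):

```python
def decode_enqueue_p1(p1):
    """Split the P1 value of an enqueue wait into (lock type, requested mode)."""
    # High two bytes: ASCII codes of the lock name, e.g. 0x54 0x58 = "TX".
    name = chr((p1 >> 24) & 0xFF) + chr((p1 >> 16) & 0xFF)
    # Low two bytes: the requested lock mode.
    mode = p1 & 0xFFFF
    return name, mode

print(hex(1415053318))                # 0x54580006
print(decode_enqueue_p1(1415053318))  # ('TX', 6)
```

A TX request in mode 4 would therefore show up with a different P1 value (hex 54580004) and would not be matched by the literal used in Queries 3 and 4.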
4. Let's Find the object details
=============================
prompt
prompt
prompt Query 4. Object Involved for TX Enqueue in detail
prompt ===============================
prompt
set linesize 100
set pagesize 100
col owner for a10
col object_name for a20
col object_type for a10
select owner,object_name,object_id,object_type
from dba_objects
where
object_id in (select ROW_WAIT_OBJ# from gv$session
where (inst_id, sid) in (select inst_id,sid from gv$session_wait where p1='1415053318'))
/
Output Will be here
===============
OWNER      OBJECT_NAME  OBJECT_ID   OBJECT_TYP
---------  ------------ --------    -----------
AKDAS      AKDAS        7261        TABLE
5. Let’s find the row value details
=============================
prompt
prompt
prompt Query 5. Finding the row value
prompt ====================
prompt
select * from <Owner>.<Table Name>  where rowid =
DBMS_ROWID.ROWID_CREATE(1,&Object_No,&Rfile_No, &Block_No, &Row_Number)
/
From queries 3 and 4 we get the values for all the variables.
Owner = AKDAS
Table_Name = AKDAS
Object_No = 7261
Rfile_No =  9
Block_No = 12346
Row_Number = 1
Output Will be here
===============
        A1    Col1                 Col2
  ---------- --------------- ----------
         6      Sudip                Datta
So we can drill down to the row value where the TX enqueue contention exists.
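To see what DBMS_ROWID.ROWID_CREATE assembles in Query 5, here is a rough Python sketch of the extended rowid text format. It assumes the usual 18-character layout (6 characters for the data object number, 3 for the relative file, 6 for the block, 3 for the row slot, each field encoded separately with the A-Z, a-z, 0-9, +, / alphabet). The function names are hypothetical and the sketch is not a substitute for the package.

```python
ALPHABET = ("ABCDEFGHIJKLMNOPQRSTUVWXYZ"
            "abcdefghijklmnopqrstuvwxyz"
            "0123456789+/")

def _b64(value, width):
    """Encode value as a fixed-width, big-endian base-64 string."""
    chars = []
    for _ in range(width):
        chars.append(ALPHABET[value % 64])
        value //= 64
    return "".join(reversed(chars))

def extended_rowid(data_obj, rfile, block, row):
    """Assemble OOOOOOFFFBBBBBBRRR from its four numeric components."""
    return _b64(data_obj, 6) + _b64(rfile, 3) + _b64(block, 6) + _b64(row, 3)

# Values taken from the Query 3 output above.
print(extended_rowid(7261, 9, 12346, 1))
```

Note that the first field is the data object number, while ROW_WAIT_OBJ# reports the object id. The two are normally equal for a table that has never been truncated or moved; otherwise look up DATA_OBJECT_ID in dba_objects before building the rowid.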
6. Let’s find the user activity that is "Holder" and "Waiter"
====================================================
set linesize 120
set pagesize 66
col c0 for 999
col c0 heading "INS"
col c1 for a9
col c1 heading "OS User"
col c2 for a9
col c2 heading "Oracle User"
col c3 for a15
col c3 heading "Program Name"
col b1 for a9
col b1 heading "Unix PID"
col b2 for 9999 justify left
col b2 heading "ORA SID"
col b3 for 999999 justify left
col b3 heading "SERIAL#"
col sql_text for a45
set space 1
break on b1 nodup on c0 nodup on c3 nodup on c1 nodup on c2 nodup on b2 nodup on b3 skip 2
select a.inst_id c0,b.sid b2,c.spid b1, b.program c3, b.username c2,b.serial# b3, a.sql_text
  from gv$sql a, gv$session b, gv$process c
 where
   a.address = b.sql_address
   and b.paddr = c.addr
   and a.hash_value = b.sql_hash_value
   and a.inst_id=b.inst_id and a.inst_id=c.inst_id
   and a.inst_id like '&inst_id' and b.sid like '&sid'
 order by c.spid,a.hash_value
/
This query prompts for the instance number and SID, which you can get from steps 1 and 2.
Remember that you can see the waiter's activity, but you may not see the holder's activity.
The reason is that the holder is usually sitting idle after its DML, so the holder's SQL
may not be visible through gv$sql.
All of these queries can also be run on a single-instance database, but every GV$ view must be replaced
with its V$ counterpart, and the INST_ID columns and join conditions must be removed because V$ views have no INST_ID.

Source: How to Find TX Enqueue Contention in RAC or OPS [ID 179582.1]

Recovering from a lost access$ table in Oracle 11g

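I recently ran into two cases in which an 11g database lost its access$ table after an abnormal shutdown and could no longer be opened normally. Why the table goes missing is still unknown. Here is one way to fix the problem in upgrade mode.
Database version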

SQL> select * from v$version;
BANNER
--------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - Production
PL/SQL Release 11.2.0.3.0 - Production
CORE    11.2.0.3.0      Production
TNS for Linux: Version 11.2.0.3.0 - Production
NLSRTL Version 11.2.0.3.0 - Production
SQL> select to_char(sysdate,'yyyy-mm-dd hh24:mi:ss') "xifenfei" from dual;
xifenfei
--------------------------------------
2012-06-22 05:28:57

Database startup fails with ORA-00704

SQL> startup
ORACLE instance started.
Total System Global Area  523108352 bytes
Fixed Size                  1346052 bytes
Variable Size             448792060 bytes
Database Buffers           67108864 bytes
Redo Buffers                5861376 bytes
Database mounted.
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 1
ORA-00942: table or view does not exist
Process ID: 1782
Session ID: 125 Serial number: 5

Find the cause of the ORA-00704 error

SQL> conn / as sysdba
Connected to an idle instance.
SQL> startup mount;
ORACLE instance started.
Total System Global Area  523108352 bytes
Fixed Size                  1346052 bytes
Variable Size             448792060 bytes
Database Buffers           67108864 bytes
Redo Buffers                5861376 bytes
Database mounted.
SQL> oradebug setmypid
Statement processed.
SQL> oradebug EVENT 10046 TRACE NAME CONTEXT FOREVER, LEVEL 12
Statement processed.
SQL> oradebug TRACEFILE_NAME
/u01/oracle/diag/rdbms/ora11g/ora11g/trace/ora11g_ora_2010.trc
SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 1
ORA-00942: table or view does not exist
Process ID: 2010
Session ID: 125 Serial number: 5

The trace file shows

PARSE ERROR #3063868604:len=56 dep=1 uid=0 oct=3 lid=0 tim=1340312320595472 err=942
select order#,columns,types from access$ where d_obj#=:1
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 1
ORA-00942: table or view does not exist
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 1
ORA-00942: table or view does not exist
*** 2012-06-22 04:58:40.596
USER (ospid: 2010): terminating the instance due to error 704
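A 10046 trace can run to thousands of lines, so it helps to pull out only the failed parses. The Python sketch below assumes each PARSE ERROR header is followed by the statement text on the next line, as in the excerpt above; the function name is hypothetical.

```python
import re

# Header line of a failed parse; err= carries the ORA- error number.
PARSE_ERROR = re.compile(r"^PARSE ERROR #\d+:.*\berr=(\d+)")

def failed_statements(trace_lines):
    """Yield (ora_error, sql_text) for each PARSE ERROR entry."""
    lines = iter(trace_lines)
    for line in lines:
        match = PARSE_ERROR.match(line)
        if match:
            # The statement that failed to parse is on the following line.
            sql = next(lines, "").strip()
            yield int(match.group(1)), sql

# Lines copied from the trace excerpt above.
sample = [
    "PARSE ERROR #3063868604:len=56 dep=1 uid=0 oct=3 lid=0 tim=1340312320595472 err=942",
    "select order#,columns,types from access$ where d_obj#=:1",
    "ORA-00704: bootstrap process failure",
]
print(list(failed_statements(sample)))
```

Here err=942 corresponds to ORA-00942, which points directly at the missing access$ table.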

Start the database in upgrade mode

SQL> startup  upgrade
ORACLE instance started.
Total System Global Area  523108352 bytes
Fixed Size                  1346052 bytes
Variable Size             448792060 bytes
Database Buffers           67108864 bytes
Redo Buffers                5861376 bytes
Database mounted.
Database opened.

Create the access$ table and index

SQL> create table access$
  2  ( d_obj#        number not null,
  3    order#        number not null,
  4    columns       raw(126),
  5    types         number not null)
  6    storage (initial 10k next 100k maxextents unlimited pctincrease 0)
  7  /
Table created.
SQL> create index i_access1 on
  2    access$(d_obj#, order#)
  3    storage (initial 10k next 100k maxextents unlimited pctincrease 0)
  4  /
Index created.
-- The CREATE statements can be found in ?\RDBMS\ADMIN\dcore.bsq

Restart the database

SQL> shutdown immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup
ORACLE instance started.
Total System Global Area  523108352 bytes
Fixed Size                  1346052 bytes
Variable Size             448792060 bytes
Database Buffers           67108864 bytes
Redo Buffers                5861376 bytes
Database mounted.
Database opened.

What the access$ table is for (thanks to vmcd for providing this)
When a database object is first referenced in a PL/SQL program, the PL/SQL engine checks the ACCESS$ table (owned by SYS) to see if the executor of the program has authority on that database object.
Whether losing the earlier rows of access$ has any serious impact on the system is still unknown; if you know, please tell me.

ORA-600 [kmgs_parameter_update_timeout_1] caused by an overwritten spfile

The database reported the following ORA-00600 [kmgs_parameter_update_timeout_1] error

Thu Jun 21 17:42:45 BEIST 2012
alter tablespace TS_TAB_WG_SYSMGR_01 add datafile '/dev/rvgoradata3_1_01'
Thu Jun 21 17:42:58 BEIST 2012
Completed: alter tablespace TS_TAB_WG_SYSMGR_01 add datafile '/dev/rvgoradata3_1_01'
Thu Jun 21 17:45:31 BEIST 2012
System State dumped to trace file /oracle/app/oracle/admin/bomc3/bdump/bomc3_mmon_19530138.trc
Thu Jun 21 17:45:42 BEIST 2012
Errors in file /oracle/app/oracle/admin/bomc3/bdump/bomc3_mmon_19530138.trc:
ORA-00600: internal error code, arguments: [kmgs_parameter_update_timeout_1], [1565], [], [], [], [], [], []
ORA-01565: error in identifying file '/dev/rvgoradata3_1_01'
ORA-27086: unable to lock file - already in use
IBM AIX RISC System/6000 Error: 13: Permission denied
Additional information: 8
Additional information: 18874484
Thu Jun 21 17:45:49 BEIST 2012
Errors in file /oracle/app/oracle/admin/bomc3/bdump/bomc3_dbw0_18874484.trc:
ORA-00600: internal error code, arguments: [ksprcvsp1], [0], [0], [], [], [], [], []
Thu Jun 21 17:45:52 BEIST 2012
Errors in file /oracle/app/oracle/admin/bomc3/bdump/bomc3_dbw0_18874484.trc:
ORA-00600: internal error code, arguments: [kmgs_parameter_update_timeout_1], [600], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [ksprcvsp1], [0], [0], [], [], [], [], []
Thu Jun 21 17:45:53 BEIST 2012
Errors in file /oracle/app/oracle/admin/bomc3/bdump/bomc3_dbw0_18874484.trc:
ORA-00600: internal error code, arguments: [kmgs_parameter_update_timeout_1], [600], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [ksprcvsp1], [0], [0], [], [], [], [], []
Thu Jun 21 17:45:53 BEIST 2012
DBW0: terminating instance due to error 471
Instance terminated by DBW0, pid = 18874484

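The log gives a rough picture: after the datafile /dev/rvgoradata3_1_01 was successfully added to TS_TAB_WG_SYSMGR_01, MMON started collecting statistics and failed while reading the spfile. DBW0 then also failed to read the spfile and terminated, which aborted the instance. From this, the initial suspicion was that the raw device holding the spfile had been mistakenly added to the database as a new datafile, overwriting the spfile and causing the errors when MMON and DBWn accessed it.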

Finding the evidence
If the spfile is on a raw device named /dev/rvgoradata3_1_01, it is most likely referenced through the spfile entry in init<SID>.ora. Checking that file confirms it:

[zwq_acc1:/home/xifenfei]cat initbomc3.ora
spfile='/dev/rvgoradata3_1_01'

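This confirms that, when adding the datafile, the user mistakenly added the spfile's raw device to the tablespace as a new datafile, which caused the problem.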
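A mistake like this can be caught before running ADD DATAFILE by comparing the candidate device with the spfile entry of the init file. The following Python sketch is a hypothetical pre-check, not an Oracle tool; it assumes the init file contains a single line of the form spfile='...' as shown above.

```python
import re

def spfile_from_pfile(pfile_text):
    """Return the path named by an spfile=... entry, or None if absent."""
    match = re.search(r"""^\s*spfile\s*=\s*['"]?([^'"\s]+)""",
                      pfile_text, re.IGNORECASE | re.MULTILINE)
    return match.group(1) if match else None

def clashes_with_spfile(pfile_text, new_datafile):
    """True if the proposed datafile path is the spfile location."""
    return spfile_from_pfile(pfile_text) == new_datafile

# Content of initbomc3.ora as shown above.
pfile = "spfile='/dev/rvgoradata3_1_01'\n"
print(clashes_with_spfile(pfile, "/dev/rvgoradata3_1_01"))
```

A real check on raw devices would also need to compare against the control files, redo logs and existing datafiles, which this sketch leaves out.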

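Solutions
1. If a backup of the spfile exists, restore it.
2. If a pfile exists, create the spfile from the pfile.
3. If neither exists, build a pfile from the parameter information in the alert log, then create the spfile from it.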

Using triggers to record DDL from all databases in one place

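A customer described a requirement today that I found interesting: record the DDL operations of all databases in a single table in one database. The approach is to create the table and a trigger in one database, and in the other databases use a dblink, a synonym and a trigger to write DDL records to the remote table. He had written a trigger, but it raised errors and he asked me to help. Working together, we fixed the errors the trigger hit in the database that goes through the dblink synonym. In this test a 10g database stores the DDL records of all databases, and an 11g database inserts its DDL records over a dblink.
Steps in the 10g database
1. Create the table that records DDL operations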

SQL> conn chf/xifenfei
Connected.
SQL> create table t_ddl_audit(
  2  db_name varchar2(30),
  3  login_user varchar2(30),
  4  ddl_time date,
  5  ip_address varchar2(20),
  6  audsid varchar2(20),
  7  schema_user varchar2(30),
  8  schema_object varchar2(40),
  9  login_tool varchar2(40),
 10  os_user varchar2(40),
 11  ddl_sql varchar2(4000));
Table created.

2. Create the trigger

SQL> create or replace trigger tri_ddl_audit
  2    before ddl on database
  3  declare
  4    n           number;
  5    str_stmt    varchar2(4000);
  6    sql_text    ora_name_list_t;
  7    l_trace     number;
  8    v_module    varchar2(50);
  9    v_action    varchar2(50);
 10    str_session v$session%rowtype;
 11  begin
 12    n := ora_sql_txt(sql_text);
 13    for i in 1 .. n loop
 14      str_stmt := substr(str_stmt || sql_text(i), 1, 3000);
 15    end loop;
 16    dbms_application_info.READ_MODULE(v_module, v_action);
 17    INSERT INTO chf.t_ddl_audit
 18      (db_name,
 19       login_user,
 20       ddl_time,
 21       ip_address,
 22       audsid,
 23       schema_user,
 24       schema_object,
 25       login_tool,
 26       os_user,
 27       ddl_sql)
 28    VALUES
 29      (sys_context('USERENV', 'db_name'),
 30       ora_login_user,
 31       SYSDATE,
 32       sys_context('USERENV', 'IP_ADDRESS'),
 33       userenv('SESSIONID'),
 34       ora_dict_obj_owner,
 35       ora_dict_obj_name,
 36       v_module,
 37       sys_context('userenv', 'os_user'),
 38       str_stmt);
 39  exception
 40    when no_data_found then
 41      null;
 42  end;
 43  /
Trigger created.

3. Test the trigger

SQL> conn chf/xifenfei
Connected.
SQL> create table t_xff as select * from dba_tables where rownum=1;
Table created.
SQL> select db_name,login_user,ddl_sql from t_ddl_audit;
DB_NAME                        LOGIN_USER
------------------------------ ------------------------------
DDL_SQL
-----------------------------------------------------------------
XFF                            CHF
create table t_xff as select * from dba_tables where rownum=1

Steps in the 11g database
1. Create the dblink and synonym

SQL> create database link "ora10g_dblink"
  2   connect to chf
  3    identified by "xifenfei"
  4     using 'ora10g';
Database link created.
SQL> create  synonym t_ddl_audit for t_ddl_audit@ora10g_dblink;
Synonym created.

2. Create the trigger (first attempt)

SQL> create or replace trigger tri_ddl_audit
  2    before ddl on database
  3  declare
  4    n           number;
  5    str_stmt    varchar2(4000);
  6    sql_text    ora_name_list_t;
  7    l_trace     number;
  8    v_module    varchar2(50);
  9    v_action    varchar2(50);
 10    str_session v$session%rowtype;
 11  begin
 12    n := ora_sql_txt(sql_text);
 13    for i in 1 .. n loop
 14      str_stmt := substr(str_stmt || sql_text(i), 1, 3000);
 15    end loop;
 16    dbms_application_info.READ_MODULE(v_module, v_action);
 17    INSERT INTO t_ddl_audit
 18      (db_name,
 19       login_user,
 20       ddl_time,
 21       ip_address,
 22       audsid,
 23       schema_user,
 24       schema_object,
 25       login_tool,
 26       os_user,
 27       ddl_sql)
 28    VALUES
 29      (sys_context('USERENV', 'db_name'),
 30       ora_login_user,
 31       SYSDATE,
 32       sys_context('USERENV', 'IP_ADDRESS'),
 33       userenv('SESSIONID'),
 34       ora_dict_obj_owner,
 35       ora_dict_obj_name,
 36       v_module,
 37       sys_context('userenv', 'os_user'),
 38       str_stmt);
 39  exception
 40    when no_data_found then
 41      null;
 42  end;
 43  /
Trigger created.

3. Test the trigger

SQL> create table t_xff as select * from dba_objects where rownum<10;
create table t_xff as select * from dba_objects where rownum<10
                                    *
ERROR at line 1:
ORA-00604: error occurred at recursive SQL level 1
ORA-02070: database  does not support  in this context
ORA-06512: at line 15

ORA-02070 is raised, probably because of calls such as sys_context('userenv', 'os_user') inside the insert that goes over the dblink.

4. Create the trigger (second attempt)

SQL> create or replace trigger tri_ddl_audit
  2    before ddl on database
  3  declare
  4    n           number;
  5    str_stmt    varchar2(4000);
  6    sql_text    ora_name_list_t;
  7    l_trace     number;
  8    v_module    varchar2(50);
  9    v_action    varchar2(50);
 10    v_db_name   varchar2(50);
 11    v_ip_addr   varchar2(50);
 12    v_os        varchar2(50);
 13    v_session_id varchar2(50);
 14    str_session v$session%rowtype;
 15  begin
 16    n := ora_sql_txt(sql_text);
 17    for i in 1 .. n loop
 18      str_stmt := substr(str_stmt || sql_text(i), 1, 3000);
 19    end loop;
 20    dbms_application_info.READ_MODULE(v_module, v_action);
 21    v_db_name :=sys_context('USERENV', 'db_name');
 22    v_ip_addr :=sys_context('USERENV', 'IP_ADDRESS');
 23    v_os:=sys_context('userenv', 'os_user');
 24    v_session_id:=userenv('SESSIONID');
 25    INSERT INTO t_ddl_audit
 26      (db_name,
 27       login_user,
 28       ddl_time,
 29       ip_address,
 30       audsid,
 31       schema_user,
 32       schema_object,
 33       login_tool,
 34       os_user,
 35       ddl_sql)
 36    VALUES
 37      (v_db_name,
 38       ora_login_user,
 39       SYSDATE,
 40       v_ip_addr,
 41      v_session_id,
 42       ora_dict_obj_owner,
 43       ora_dict_obj_name,
 44       v_module,
 45       v_os,
 46       str_stmt);
 47  exception
 48    when no_data_found then
 49      null;
 50  end;
 51  /
Trigger created.

5. Test the trigger again

SQL> drop table t3;
drop table t3
*
ERROR at line 1:
ORA-00604: error occurred at recursive SQL level 1
ORA-02069: global_names parameter must be set to TRUE for this operation
ORA-06512: at line 23

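Research on ORA-02069 showed that the error comes from mixing variables and constants in an insert over a dblink, so the trigger was changed to pass every value through a variable.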

6. Create the trigger (third attempt)

SQL> create or replace trigger tri_ddl_audit
  2    before ddl on database
  3  declare
  4    n           number;
  5    str_stmt    varchar2(4000);
  6    sql_text    ora_name_list_t;
  7    l_trace     number;
  8    v_module    varchar2(50);
  9    v_action    varchar2(50);
 10    v_db_name   varchar2(50);
 11    v_ip_addr   varchar2(50);
 12    v_os        varchar2(50);
 13    v_session_id varchar2(50);
 14    v_loginuser    varchar2(50);
 15     v_obj_name varchar2(50);
 16    v_owner    varchar2(50);
 17    str_session v$session%rowtype;
 18  begin
 19    n := ora_sql_txt(sql_text);
 20    for i in 1 .. n loop
 21      str_stmt := substr(str_stmt || sql_text(i), 1, 3000);
 22    end loop;
 23    dbms_application_info.READ_MODULE(v_module, v_action);
 24    v_db_name :=sys_context('USERENV', 'db_name');
 25    v_ip_addr :=sys_context('USERENV', 'IP_ADDRESS');
 26    v_os:=sys_context('userenv', 'os_user');
 27    v_session_id:=userenv('SESSIONID');
 28    v_loginuser:= ora_login_user;
 29    v_owner:=ora_dict_obj_owner;
 30    v_obj_name:=ora_dict_obj_name;
 31    INSERT INTO t_ddl_audit
 32      (db_name,
 33       login_user,
 34       ddl_time,
 35       ip_address,
 36       audsid,
 37       schema_user,
 38       schema_object,
 39       login_tool,
 40       os_user,
 41       ddl_sql)
 42    VALUES
 43      (v_db_name,
 44       v_loginuser,
 45       SYSDATE,
 46       v_ip_addr,
 47      v_session_id,
 48       v_owner,
 49       v_obj_name,
 50       v_module,
 51       v_os,
 52       str_stmt);
 53  exception
 54    when no_data_found then
 55      null;
 56  end;
 57  /
Trigger created.

7. Test the trigger

SQL> create table t_xff11 as select * from dba_tables where rownum<10;
Table created.
SQL> select db_name,login_user,ddl_sql from t_ddl_audit;
DB_NAME                        LOGIN_USER
------------------------------ ------------------------------
DDL_SQL
-----------------------------------------------------------------
ora11g                         CHF
create table t_xff11 as select * from dba_tables where rownum<10
XFF                            CHF
create table t_xff as select * from dba_tables where rownum=1

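Additional notes
In my opinion this is a lab solution that is hard to apply in a real production environment:
1. Recording DDL operations through a trigger is not efficient in itself.
2. If a database cannot reach the database that stores the DDL records, every DDL operation in that database will hang, which risks hanging the database itself.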

A first try at unloading data files with DUL

I recently tested DUL; overall it feels much like ODU.
1. Configure init.dul

[oracle@xifenfei dul]$ more init.dul
osd_big_endian_flag=false
osd_dba_file_bits=10
osd_c_struct_alignment=32
osd_file_leader_size=1
osd_word_size = 32
dc_columns=2000000
dc_tables=10000
dc_objects=1000000
dc_users=400
dc_segments=100000
Buffer=10485760
control_file = control.txt
db_block_size=8192
export_mode=true
-- false means SQL*Loader format, true means imp (export dump) format
compatible=10

2. Configure the control file

[oracle@xifenfei dul]$ more control.txt
         0          1 /u01/oracle/oradata/XFF/system01.dbf
         1          2 /u01/oracle/oradata/XFF/undotbs01.dbf
         2          3 /u01/oracle/oradata/XFF/sysaux01.dbf
         4          4 /u01/oracle/oradata/XFF/users01.dbf
         6          5 /u01/oracle/oradata/XFF/datfttuser.dbf
-- SQL used to produce the entries above
select ts#,rfile#,name from v$datafile;
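When there are many datafiles, the control.txt entries can be generated from the query result instead of typed by hand. The Python sketch below simply formats (ts#, rfile#, name) rows; the function name is hypothetical, and the column widths only mirror the listing above, since it is an assumption that DUL cares about nothing more than whitespace-separated fields.

```python
def dul_control_lines(datafiles):
    """Format (ts#, rfile#, name) rows the way control.txt lists them."""
    return ["{:>10} {:>10} {}".format(ts, rfile, name)
            for ts, rfile, name in datafiles]

# Rows as returned by: select ts#,rfile#,name from v$datafile;
rows = [
    (0, 1, "/u01/oracle/oradata/XFF/system01.dbf"),
    (1, 2, "/u01/oracle/oradata/XFF/undotbs01.dbf"),
    (2, 3, "/u01/oracle/oradata/XFF/sysaux01.dbf"),
    (4, 4, "/u01/oracle/oradata/XFF/users01.dbf"),
    (6, 5, "/u01/oracle/oradata/XFF/datfttuser.dbf"),
]
lines = dul_control_lines(rows)
print("\n".join(lines))
```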

3. Start DUL

[oracle@xifenfei dul]$ ./dul
Data UnLoader: 10.2.0.5.13 - Internal Only - on Sun Jun 10 06:39:47 2012
with 64-bit io functions
Copyright (c) 1994 2012 Bernard van Duijnen All rights reserved.
 Strictly Oracle Internal Use Only
Found db_id = 3426707456
Found db_name = XFF

4. Load the data dictionary (bootstrap)

DUL> BOOTSTRAP;
Probing file = 1, block = 377
. unloading table                BOOTSTRAP$
DUL: Warning: block number is non zero but marked deferred trying to process it anyhow
      57 rows unloaded
DUL: Warning: Dictionary cache DC_BOOTSTRAP is empty
Reading BOOTSTRAP.dat 57 entries loaded
Parsing Bootstrap$ contents
Generating dict.ddl for version 10
 OBJ$: segobjno 18, file 1 block 121
 TAB$: segobjno 2, tabno 1, file 1  block 25
 COL$: segobjno 2, tabno 5, file 1  block 25
 USER$: segobjno 10, tabno 1, file 1  block 89
Running generated file "@dict.ddl" to unload the dictionary tables
. unloading table                      OBJ$   50930 rows unloaded
. unloading table                      TAB$    1593 rows unloaded
. unloading table                      COL$   55163 rows unloaded
. unloading table                     USER$      61 rows unloaded
Reading USER.dat 61 entries loaded
Reading OBJ.dat 50930 entries loaded and sorted 50930 entries
Reading TAB.dat 1593 entries loaded
Reading COL.dat 55163 entries loaded and sorted 55163 entries
Reading BOOTSTRAP.dat 57 entries loaded
DUL: Warning: Recreating file "dict.ddl"
Generating dict.ddl for version 10
 OBJ$: segobjno 18, file 1 block 121
 TAB$: segobjno 2, tabno 1, file 1  block 25
 COL$: segobjno 2, tabno 5, file 1  block 25
 USER$: segobjno 10, tabno 1, file 1  block 89
 TABPART$: segobjno 266, file 1 block 2121
 INDPART$: segobjno 271, file 1 block 2161
 TABCOMPART$: segobjno 288, file 1 block 2297
 INDCOMPART$: segobjno 293, file 1 block 2345
 TABSUBPART$: segobjno 278, file 1 block 2217
 INDSUBPART$: segobjno 283, file 1 block 2257
 IND$: segobjno 2, tabno 3, file 1  block 25
 ICOL$: segobjno 2, tabno 4, file 1  block 25
 LOB$: segobjno 2, tabno 6, file 1  block 25
 COLTYPE$: segobjno 2, tabno 7, file 1  block 25
 TYPE$: segobjno 181, tabno 1, file 1  block 1297
 COLLECTION$: segobjno 181, tabno 2, file 1  block 1297
 ATTRIBUTE$: segobjno 181, tabno 3, file 1  block 1297
 LOBFRAG$: segobjno 299, file 1 block 2393
 LOBCOMPPART$: segobjno 302, file 1 block 2425
 UNDO$: segobjno 15, file 1 block 105
 TS$: segobjno 6, tabno 2, file 1  block 57
 PROPS$: segobjno 96, file 1 block 721
Running generated file "@dict.ddl" to unload the dictionary tables
. unloading table                      OBJ$
DUL: Warning: Recreating file "OBJ.ctl"
   50930 rows unloaded
. unloading table                      TAB$
DUL: Warning: Recreating file "TAB.ctl"
    1593 rows unloaded
. unloading table                      COL$
DUL: Warning: Recreating file "COL.ctl"
   55163 rows unloaded
. unloading table                     USER$
DUL: Warning: Recreating file "USER.ctl"
      61 rows unloaded
. unloading table                  TABPART$      90 rows unloaded
. unloading table                  INDPART$      99 rows unloaded
. unloading table               TABCOMPART$       0 rows unloaded
. unloading table               INDCOMPART$       0 rows unloaded
. unloading table               TABSUBPART$       0 rows unloaded
. unloading table               INDSUBPART$       0 rows unloaded
. unloading table                      IND$    2251 rows unloaded
. unloading table                     ICOL$    3669 rows unloaded
. unloading table                      LOB$     537 rows unloaded
. unloading table                  COLTYPE$    1702 rows unloaded
. unloading table                     TYPE$    1886 rows unloaded
. unloading table               COLLECTION$     552 rows unloaded
. unloading table                ATTRIBUTE$    7051 rows unloaded
. unloading table                  LOBFRAG$       1 row  unloaded
. unloading table              LOBCOMPPART$       0 rows unloaded
. unloading table                     UNDO$      21 rows unloaded
. unloading table                       TS$       7 rows unloaded
. unloading table                    PROPS$      27 rows unloaded
Reading USER.dat 61 entries loaded
Reading OBJ.dat 50930 entries loaded and sorted 50930 entries
Reading TAB.dat 1593 entries loaded
Reading COL.dat 55163 entries loaded and sorted 55163 entries
Reading TABPART.dat 90 entries loaded and sorted 90 entries
Reading TABCOMPART.dat 0 entries loaded and sorted 0 entries
Reading TABSUBPART.dat 0 entries loaded and sorted 0 entries
Reading INDPART.dat 99 entries loaded and sorted 99 entries
Reading INDCOMPART.dat 0 entries loaded and sorted 0 entries
Reading INDSUBPART.dat 0 entries loaded and sorted 0 entries
Reading IND.dat 2251 entries loaded
Reading LOB.dat 537 entries loaded
Reading ICOL.dat 3669 entries loaded
Reading COLTYPE.dat 1702 entries loaded
Reading TYPE.dat 1886 entries loaded
Reading ATTRIBUTE.dat 7051 entries loaded
Reading COLLECTION.dat 552 entries loaded
Reading BOOTSTRAP.dat 57 entries loaded
Reading LOBFRAG.dat 1 entries loaded and sorted 1 entries
Reading LOBCOMPPART.dat 0 entries loaded and sorted 0 entries
Reading UNDO.dat 21 entries loaded
Reading TS.dat 7 entries loaded
Reading PROPS.dat 27 entries loaded
Database character set is ZHS16GBK
Database national character set is AL16UTF16

5. Unload a table

DUL> desc chf.t_xifenfei;
Table CHF.T_XIFENFEI
obj#= 52189, dataobj#= 52189, ts#= 4, file#= 4, block#=123
      tab#= 0, segcols= 2, clucols= 0
Column information:
icol# 01 segcol# 01           ID len   22 type  2 NUMBER(0,-127)
icol# 02 segcol# 02         NAME len  100 type  1 VARCHAR2 cs 852(ZHS16GBK)
DUL> UNLOAD TABLE chf.t_xifenfei;
. unloading table                T_XIFENFEI       2 rows unloaded

6. Check the exported dmp file

[oracle@xifenfei dul]$ strings  CHF_T_XIFENFEI.dmp
EXPORT:V07.00.07
UBernard's DUL
RTABLES
1024
                                                Direct UnLoader(C) in EXPort mode
TABLE "T_XIFENFEI"
CREATE TABLE "T_XIFENFEI"("ID" NUMBER,"NAME" VARCHAR2(100))
INSERT INTO "T_XIFENFEI" ("ID", "NAME") VALUES (:1, :2)
www.xifenfei.com
WWW.XIFENEI.COM
EXIT

DBMS_SCHEDULER basics

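I have long been vague about the DBMS_SCHEDULER package. Today I took some time to run a few experiments, sort out my thinking, and work out roughly what each procedure does, so that I am not starting from a blank the next time I use the package.
1. Create a job directly with DBMS_SCHEDULER.CREATE_JOB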

SQL> create table t_xifenfei (x_type varchar2(10),x_date date);
Table created.
SQL> begin
  2  DBMS_SCHEDULER.create_job (
  3  job_name => 'f_create_job',
  4  job_type => 'PLSQL_BLOCK',
  5  job_action => '
  6   begin
  7   insert into t_xifenfei values(''job'',sysdate);
  8   commit;
  9   end;
 10  ',
 11  enabled => true,
 12  start_date => SYSTIMESTAMP,
 13  repeat_interval => 'SYSTIMESTAMP + 1/1440',
 14  comments => 'xifenfei_create_job');
 15  END;
 16  /
SQL> select x_type,to_char(x_date,'yyyy-mm-dd hh24:mi:ss') from t_xifenfei;
X_TYPE     TO_CHAR(X_DATE,'YYY
---------- -------------------
job        2012-06-19 19:52:11
job        2012-06-19 19:53:11
job        2012-06-19 19:54:11

Usage here is somewhat similar to DBMS_JOB, but the scheduler is more flexible: for example, it can run anonymous blocks and operating system commands.

2. CREATE_JOB combined with CREATE_PROGRAM

SQL>  create or replace procedure p_xifenfei(in_type in varchar2)
  2   is
  3   begin
  4   insert into t_xifenfei values(in_type,sysdate);
  5   commit;
  6   end;
  7   /
Procedure created.
SQL> begin
  2  DBMS_SCHEDULER.CREATE_PROGRAM(
  3  program_name => 'x_program',
  4  program_action => 'p_xifenfei',
  5  program_type => 'STORED_PROCEDURE',
  6  number_of_arguments => 1,
  7  comments => 'xifenfei_PROGRAM',
  8  enabled => false);
  9  end;
 10  /
PL/SQL procedure successfully completed.
SQL> begin
  2  DBMS_SCHEDULER.define_program_argument(
  3  program_name => 'x_program',
  4  argument_position => 1,
  5  argument_type => 'VARCHAR2',
  6  default_value => 'program');
  7  END;
  8  /
PL/SQL procedure successfully completed.
SQL>  exec DBMS_SCHEDULER.enable('x_program');
PL/SQL procedure successfully completed.
SQL> begin
  2  DBMS_SCHEDULER.create_job(
  3  job_name => 's_xifenfei_job',
  4  program_name => 'x_program',
  5  comments => 's_xifenfei_job',
  6  repeat_interval => 'SYSTIMESTAMP + 1/1440',
  7  auto_drop => false,
  8  enabled => true);
  9  end;
 10  /
PL/SQL procedure successfully completed.
SQL> select x_type,to_char(x_date,'yyyy-mm-dd hh24:mi:ss') from t_xifenfei;
X_TYPE     TO_CHAR(X_DATE,'YYY
---------- -------------------
job        2012-06-19 20:27:11
program    2012-06-19 20:27:09
program    2012-06-19 20:28:09
job        2012-06-19 20:28:11

As this shows, CREATE_PROGRAM separates out some of the CREATE_JOB parameters, which allows more flexible control, such as the use of arguments here.

3. CREATE_JOB combined with CREATE_PROGRAM and CREATE_SCHEDULE

SQL> exec DBMS_SCHEDULER.drop_job('s_xifenfei_job');
PL/SQL procedure successfully completed.
SQL> truncate table t_xifenfei;
Table truncated.
SQL> begin
  2  DBMS_SCHEDULER.create_schedule(
  3  repeat_interval => 'FREQ=MINUTELY;INTERVAL=1',
  4  start_date => sysdate,
  5  comments => 'xifenfei_sch',
  6  schedule_name => 'X_SCH');
  7  end;
  8  /
PL/SQL procedure successfully completed.
SQL> begin
  2  DBMS_SCHEDULER.create_job(
  3  job_name => 't_xifenfei_job',
  4  program_name => 'x_program',
  5  comments => 't_xifenfei_job',
  6  schedule_name => 'X_SCH',
  7  auto_drop => false,
  8  enabled => true);
  9  end;
 10  /
PL/SQL procedure successfully completed.
SQL> select x_type,to_char(x_date,'yyyy-mm-dd hh24:mi:ss') from t_xifenfei;
X_TYPE     TO_CHAR(X_DATE,'YYY
---------- -------------------
job        2012-06-19 20:39:11
job        2012-06-19 20:37:11
job        2012-06-19 20:38:11
program    2012-06-19 20:39:01
program    2012-06-19 20:40:01

CREATE_SCHEDULE separates the schedule out of CREATE_JOB, giving finer-grained and more flexible control.
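The calendar string 'FREQ=MINUTELY;INTERVAL=1' used for X_SCH can be read as "every minute, counted from start_date". The Python sketch below is a deliberately simplified model that handles only FREQ and INTERVAL and ignores every BY... clause; it is not Oracle's evaluator, and the function names are hypothetical. Inside the database, DBMS_SCHEDULER.EVALUATE_CALENDAR_STRING is the authoritative way to preview run times.

```python
from datetime import datetime, timedelta

# Map a FREQ value to the matching timedelta keyword (subset only).
UNITS = {"SECONDLY": "seconds", "MINUTELY": "minutes",
         "HOURLY": "hours", "DAILY": "days", "WEEKLY": "weeks"}

def parse_calendar(expr):
    """Split 'KEY=VALUE;KEY=VALUE' into an upper-cased dict."""
    return dict(part.split("=", 1) for part in expr.upper().split(";") if part)

def next_runs(expr, start, count):
    """Return the first `count` run times for a FREQ/INTERVAL-only schedule."""
    fields = parse_calendar(expr)
    step = timedelta(**{UNITS[fields["FREQ"]]: int(fields.get("INTERVAL", 1))})
    return [start + step * i for i in range(count)]

# Start time taken from the first 'program' row in the output above.
runs = next_runs("FREQ=MINUTELY;INTERVAL=1", datetime(2012, 6, 19, 20, 39, 1), 2)
print([r.strftime("%Y-%m-%d %H:%M:%S") for r in runs])
```

The two computed times line up with the 'program' rows in the query output (20:39:01 and 20:40:01).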

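Additional notes:
1. Creating a JOB_CLASS gives more flexible control over resource usage: resource control is done through the resource_consumer_group attribute of the JOB_CLASS, and its service attribute maps to a database service, which can determine, for example, on which RAC node a job runs.
2. Use DBMS_SCHEDULER.set_attribute to change attributes, for example: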

EXEC DBMS_SCHEDULER.set_attribute('GATHER_STATS_JOB','JOB_CLASS', 'AUTO_TASKS_JOB_CLASS2');
exec dbms_scheduler.set_attribute('WEEKNIGHT_WINDOW','REPEAT_INTERVAL','freq=daily;byday=MON,TUE,WED,THU,FRI;byhour=2;byminute=0;bysecond=0');

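Switching the undo tablespace online in Oracle

Steps and basic principles for switching the undo tablespace

Check the current undo-related parameters
SHOW PARAMETER UNDO;
Create the new undo tablespace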
create undo tablespace undo_x datafile 'E:\ORACLE\ORADATA\XIFENFEI\undo_xifenfei.dbf' size 10M
autoextend on next 10M maxsize 30G;
Check whether the old undo tablespace still has transactions (including transactions being rolled back)
SELECT a.tablespace_name,a.segment_name,b.ktuxesta,b.ktuxecfl,
b.ktuxeusn||'.'||b.ktuxeslt||'.'||b.ktuxesqn trans
FROM dba_rollback_segs a, x$ktuxe b
WHERE a.segment_id = b.ktuxeusn
AND a.tablespace_name = UPPER('&tsname')
AND b.ktuxesta <> 'INACTIVE';
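-- Because of the undo_retention parameter, finding no transactions with this SQL is not by itself enough to conclude that the old undo tablespace can be dropped
Switch the undo tablespace (the switch works whether or not there are transactions [preferably when there are none], but the old undo tablespace cannot be dropped right away)
alter system set undo_tablespace='undo_x';
Alert log entries showing that the old undo tablespace still has transactions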
Sun Jun 17 20:10:45 2012
Successfully onlined Undo Tablespace 7.
[36428] **** active transactions found in undo Tablespace 2 - moved to Pending Switch-Out state.
[36428] active transactions found/affinity dissolution incomplete in undo tablespace 2 during switch-out.
ALTER SYSTEM SET undo_tablespace='undo_xifenfei' SCOPE=BOTH;
Sun Jun 17 20:11:38 2012
[36312] **** active transactions found in undo Tablespace 2 - moved to Pending Switch-Out state.
Sun Jun 17 20:16:15 2012
[36312] **** active transactions found in undo Tablespace 2 - moved to Pending Switch-Out state.
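-- This only shows that transactions exist. Even if no such entries appear for a long time, that does not prove the old undo tablespace can be dropped, because of undo_retention
Check the rollback segments (once all rollback segments of the old undo tablespace are offline, that tablespace can be dropped)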
select tablespace_name,segment_name,status from dba_rollback_segs;
Take the old undo tablespace offline
alter tablespace undotbs1 offline;
Once all rollback segments of the old undo tablespace are confirmed offline, drop it
drop tablespace undotbs1 including contents and datafiles;

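Switching the undo tablespace in one sentence: once the new undo tablespace has been created, the switch command can be run at almost any time, but dropping the old undo tablespace has to wait until all of its rollback segments are offline. Never force-delete the data files while any rollback segment is still online.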
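
The final check before dropping the old undo tablespace can be scripted. The following is only a minimal sketch and not part of the original procedure: it assumes the output of the dba_rollback_segs query above has been spooled as whitespace-separated TABLESPACE_NAME, SEGMENT_NAME, STATUS rows, and the function name undo_safe_to_drop is hypothetical.

```shell
# undo_safe_to_drop TABLESPACE_NAME
# Reads "TABLESPACE_NAME SEGMENT_NAME STATUS" rows on stdin and succeeds only
# if the tablespace is listed and every one of its segments is OFFLINE.
undo_safe_to_drop() {
    # dba_rollback_segs normally reports tablespace names in upper case.
    ts=$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')
    awk -v ts="$ts" '
        NF >= 3 && toupper($1) == ts {
            seen++
            if (toupper($3) != "OFFLINE") bad++
        }
        END { exit (seen > 0 && bad == 0) ? 0 : 1 }
    '
}

# Example (the spool file name is an assumption):
#   undo_safe_to_drop undotbs1 < /tmp/rollback_segs.lst && echo "all segments offline"
```

This only covers the segment status. The undo_retention caveat noted above still has to be judged separately.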

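Using flashback database to roll back individual objects

The flashback database feature is rarely used directly on production databases, because few applications can tolerate rolling the entire database back. But when an unexpected mistake happens and the operation has to be undone, the usual fallback is an incomplete recovery from an earlier backup, and with no backup available the situation is grim. This article combines flashback database with export/import to perform recoveries that flashback table cannot handle, while keeping the rest of the database current. Such operations include changing a table's structure, dropping a database user, and so on. Dropping a table column is used here to demonstrate the approach; other cases can be handled similarly.
1. Confirm that flashback database is enabled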

SQL> select flashback_on from v$database;
FLASHBACK_ON
------------------
YES
SQL>  show parameter flash
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
db_flash_cache_file                  string
db_flash_cache_size                  big integer 0
db_flashback_retention_target        integer     1440

2. Simulate a change to the table structure

SQL> create table t_xifenfei
  2  as
  3  select object_id,object_name from dba_objects;
Table created.
SQL> alter session set nls_date_format='DD-MON-YYYY HH24:MI:SS';
Session altered.
SQL>  select sysdate from dual;
SYSDATE
-------------------------
17-JUN-2012 15:25:24
SQL> ALTER TABLE t_xifenfei drop column object_name;
Table altered.

3. Try flashback query

SQL> SELECT * FROM t_xifenfei as of timestamp to_timestamp('2012-06-17 15:25:24','yyyy-mm-dd hh24:mi:ss');
SELECT * FROM t_xifenfei as of timestamp to_timestamp('2012-06-17 15:25:24','yyyy-mm-dd hh24:mi:ss')
              *
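ERROR at line 1:
ORA-01466: unable to read data - table definition has changed
-- This shows that once DDL has been run against the table, flashback table/query and similar operations cannot be used

4. Try flashback database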

SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> STARTUP MOUNT;
ORACLE instance started.
Total System Global Area  535662592 bytes
Fixed Size                  1385840 bytes
Variable Size             390072976 bytes
Database Buffers          138412032 bytes
Redo Buffers                5791744 bytes
Database mounted.
SQL>  flashback database to timestamp to_date('2012-06-17 15:25:24','yyyy-mm-dd hh24:mi:ss');
Flashback complete.
SQL> alter database open read only;
Database altered.
SQL> DESC CHF.T_XIFENFEI
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 OBJECT_ID                                          NUMBER
 OBJECT_NAME                                        VARCHAR2(128)

5. Export the object to be rolled back

C:\Users\XIFENFEI>EXP chf/xifenfei tables=t_xifenfei file=d:\t_xifenfei.dmp log=d:\t_xifenfei.log
Export: Release 11.2.0.3.0 - Production on Sun Jun 17 15:40:37 2012
Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - Production
With the Partitioning, Oracle Label Security, OLAP, Data Mining,
Oracle Database Vault and Real Application Testing options
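Export done in ZHS16GBK character set and AL16UTF16 NCHAR character set
About to export specified tables via Conventional Path ...
. . exporting table                     T_XIFENFEI      75270 rows exported
Export terminated successfully without warnings.

6. Recover the database to the latest state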

SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL>  startup mount
ORACLE instance started.
Total System Global Area  535662592 bytes
Fixed Size                  1385840 bytes
Variable Size             390072976 bytes
Database Buffers          138412032 bytes
Redo Buffers                5791744 bytes
Database mounted.
SQL> recover database;
Media recovery complete.
SQL> alter database open;
Database altered.
SQL> desc chf.t_xifenfei
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 OBJECT_ID                                          NUMBER

7. Import the correct data

SQL> drop table chf.t_xifenfei purge;
Table dropped.
SQL> host imp chf/xifenfei tables=t_xifenfei file=d:\t_xifenfei.dmp log=d:\t_xifenfei.log
Import: Release 11.2.0.3.0 - Production on Sun Jun 17 15:45:53 2012
Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - Production
With the Partitioning, Oracle Label Security, OLAP, Data Mining,
Oracle Database Vault and Real Application Testing options
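Export file created by EXPORT:V11.02.00 via conventional path
import done in ZHS16GBK character set and AL16UTF16 NCHAR character set
. importing CHF's objects into CHF
. importing CHF's objects into CHF
. . importing table                   "T_XIFENFEI"      75270 rows imported
Import terminated successfully without warnings.
SQL> desc chf.t_xifenfei
 Name                                      Null?    Type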
 ----------------------------------------- -------- ----------------------------
 OBJECT_ID                                          NUMBER
 OBJECT_NAME                                        VARCHAR2(128)

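Recovering a damaged ASM disk header from its automatic backup

Following kamus's article "Where is the backup of ASM disk header block", I confirmed that ASM does keep an automatic backup of the ASM disk header starting with 10.2.0.5. For those who do not back up their ASM disk headers, this feature provides an extra layer of protection and makes ASM safer.
For 10.2.0.5.0 and later, regardless of the AU size, the automatic backup of the ASM disk header is stored in the second-to-last block of the second AU.
Calculation: number of blocks per AU [AU_SIZE/block_size] * 2 - 2 [because blocks are numbered from 0]. With the 4096-byte block size used in this test, this gives:
1M AU: block 510
2M AU: block 1022
4M AU: block 2046
8M AU: block 4094
16M AU: block 8190
32M AU: block 16382
64M AU: block 32766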
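
The calculation above is easy to reproduce in the shell. This is a small sketch under the same assumptions as the list: the function name asm_backup_blknum is hypothetical, sizes are given in bytes, and the block size defaults to the 4096 bytes of the disk group used in this test.

```shell
# asm_backup_blknum AU_SIZE_BYTES [BLOCK_SIZE_BYTES]
# Prints the block number of the second-to-last block of the second AU,
# counting blocks from 0: (AU_SIZE / block_size) * 2 - 2.
asm_backup_blknum() {
    au_size=$1
    block_size=${2:-4096}
    echo $(( au_size / block_size * 2 - 2 ))
}

# Reproduce the list for AU sizes from 1M to 64M.
for mb in 1 2 4 8 16 32 64; do
    printf '%sM AU: block %s\n' "$mb" "$(asm_backup_blknum $(( mb * 1048576 )))"
done
```

The result can be passed to kfed as the blknum value, as in the kfed read command used in the comparison step.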
1. Compare the backup with the ASM disk header

SQL> select * from v$version;
BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - Prod
PL/SQL Release 10.2.0.5.0 - Production
CORE    10.2.0.5.0      Production
TNS for Linux: Version 10.2.0.5.0 - Production
NLSRTL Version 10.2.0.5.0 - Production
SQL> select to_char(sysdate,'yyyy-mm-dd hh24:mi:ss') "xifenfei.com"  from dual;
xifenfei.com
-------------------
2012-06-17 09:41:19
SQL>  select group_number,DISK_NUMBER,PATH,HEADER_STATUS
   2  from v$asm_disk where group_number<>0;
GROUP_NUMBER DISK_NUMBER PATH            HEADER_STATU
------------ ----------- --------------- ------------
           1           1 /dev/raw/raw2   MEMBER
           1           0 /dev/raw/raw1   MEMBER
SQL> select group_number,name,BLOCK_SIZE,ALLOCATION_UNIT_SIZE from v$asm_diskgroup;
GROUP_NUMBER NAME                           BLOCK_SIZE ALLOCATION_UNIT_SIZE
------------ ------------------------------ ---------- --------------------
           1 DATA                                 4096              1048576
rac1->  kfed read /dev/raw/raw1 blknum=510|>/tmp/xifenfei.510
rac1->  kfed read /dev/raw/raw1 blknum=0|>/tmp/xifenfei.0
rac1-> ll /tmp/xifenfei*
-rw-r--r--  1 oracle oinstall 6606 Jun 14 04:11 /tmp/xifenfei.0
-rw-r--r--  1 oracle oinstall 6606 Jun 14 04:12 /tmp/xifenfei.510
rac1-> diff /tmp/xifenfei.510 /tmp/xifenfei.0
-- diff returns no differing records, which shows that the two blocks have identical contents

2. Deliberately damage the ASM disk header

rac1-> dd if=/dev/zero of=/dev/raw/raw1 bs=4096 count=1
1+0 records in
1+0 records out
rac1->  kfed read /dev/raw/raw1 blknum=0
kfbh.endian:                          0 ; 0x000: 0x00
kfbh.hard:                            0 ; 0x001: 0x00
kfbh.type:                            0 ; 0x002: KFBTYP_INVALID
kfbh.datfmt:                          0 ; 0x003: 0x00
kfbh.block.blk:                       0 ; 0x004: T=0 NUMB=0x0
kfbh.block.obj:                       0 ; 0x008: TYPE=0x0 NUMB=0x0
kfbh.check:                           0 ; 0x00c: 0x00000000
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
SQL> select group_number,DISK_NUMBER,PATH,HEADER_STATUS
   2 from v$asm_disk where group_number<>0;
GROUP_NUMBER DISK_NUMBER PATH            HEADER_STATU
------------ ----------- --------------- ------------
           1           1 /dev/raw/raw2   MEMBER
           1           0 /dev/raw/raw1   CANDIDATE
SQL> alter diskgroup  data dismount;
Diskgroup altered.
SQL> alter diskgroup  data mount;
alter diskgroup  data mount
*
ERROR at line 1:
ORA-15032: not all alterations performed
ORA-15063: ASM discovered an insufficient number of disks for diskgroup "DATA"

3. Use kfed repair to fix the damaged ASM disk header

rac1-> kfed  repair '/dev/raw/raw1'
rac1->  kfed read /dev/raw/raw1 blknum=0
kfbh.endian:                          1 ; 0x000: 0x01
kfbh.hard:                          130 ; 0x001: 0x82
kfbh.type:                            1 ; 0x002: KFBTYP_DISKHEAD
kfbh.datfmt:                          1 ; 0x003: 0x01
kfbh.block.blk:                       0 ; 0x004: T=0 NUMB=0x0
kfbh.block.obj:              2147483648 ; 0x008: TYPE=0x8 NUMB=0x0
kfbh.check:                   883602253 ; 0x00c: 0x34aab34d
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
…………
SQL> alter diskgroup  data mount;
Diskgroup altered.
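
In the kfed output above, the kfbh.type line is what separates a damaged header (KFBTYP_INVALID) from a healthy one (KFBTYP_DISKHEAD). A minimal sketch for pulling that field out of kfed read output follows; the function names are hypothetical, and it assumes the output layout shown in this article.

```shell
# kfed_block_type: reads kfed read output on stdin and prints the symbolic
# block type, which is the last field of the kfbh.type line.
kfed_block_type() {
    awk '$1 == "kfbh.type:" { print $NF; exit }'
}

# header_is_valid: succeeds only if the block on stdin is a disk header.
header_is_valid() {
    [ "$(kfed_block_type)" = "KFBTYP_DISKHEAD" ]
}

# Example, using the device path from this test:
#   kfed read /dev/raw/raw1 blknum=0 | header_is_valid || echo "header damaged"
```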

4. Use kfed merge to restore the ASM disk header

rac1-> dd if=/dev/zero of=/dev/raw/raw1 bs=4096 count=1
1+0 records in
1+0 records out
rac1->  kfed read /dev/raw/raw1 blknum=0
kfbh.endian:                          0 ; 0x000: 0x00
kfbh.hard:                            0 ; 0x001: 0x00
kfbh.type:                            0 ; 0x002: KFBTYP_INVALID
kfbh.datfmt:                          0 ; 0x003: 0x00
kfbh.block.blk:                       0 ; 0x004: T=0 NUMB=0x0
kfbh.block.obj:                       0 ; 0x008: TYPE=0x0 NUMB=0x0
kfbh.check:                           0 ; 0x00c: 0x00000000
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
SQL> alter diskgroup  data dismount;
Diskgroup altered.
SQL> alter diskgroup  data mount;
alter diskgroup  data mount
*
ERROR at line 1:
ORA-15032: not all alterations performed
ORA-15063: ASM discovered an insufficient number of disks for diskgroup "DATA"
rac1->  kfed merge /dev/raw/raw1 /tmp/xifenfei.510
SQL> alter diskgroup  data mount;
Diskgroup altered.

These tests show that in 10.2.0.5 and later, a damaged ASM disk header can be recovered from its backup with kfed repair or kfed merge.