ORA-00600[kcfrbd_3]故障解决

朋友一数据库因为断电,被重建控制文件等操作折腾的最后出现在启动的时候出现ORA-00600[kcfrbd_3]

Wed Dec 05 10:26:34 2012
Thread 1 advanced to log sequence 11
Thread 1 opened at log sequence 11
  Current log# 1 seq# 11 mem# 0: E:\ORACLE\PRODUCT\10.2.0\ORADATA\ORCL\REDO01.LOG
Successful open of redo thread 1
Wed Dec 05 10:26:34 2012
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Wed Dec 05 10:26:34 2012
SMON: enabling cache recovery
Wed Dec 05 10:26:35 2012
Successfully onlined Undo Tablespace 1.
Dictionary check beginning
Dictionary check complete
Wed Dec 05 10:26:35 2012
SMON: enabling tx recovery
Wed Dec 05 10:26:35 2012
Database Characterset is ZHS16GBK
Wed Dec 05 10:26:35 2012
Errors in file d:\oracle\product\10.2.0\admin\orcl\bdump\orcl_smon_548.trc:
ORA-00600: internal error code, arguments: [kcfrbd_3], [2], [2279045], [1], [2277120], [2277120], [], []
replication_dependency_tracking turned off (no async multimaster replication found)
Wed Dec 05 10:26:36 2012
Fatal internal error happened while SMON was doing active transaction recovery.
Wed Dec 05 10:26:36 2012
Errors in file d:\oracle\product\10.2.0\admin\orcl\bdump\orcl_smon_548.trc:
ORA-00600: internal error code, arguments: [kcfrbd_3], [2], [2279045], [1], [2277120], [2277120], [], []
SMON: terminating instance due to error 474

这个错误很明显:数据库已经open成功了,但是因为有事务不能正常被回滚,然后数据库的smon进程异常,从而使得数据库不能正常启动,解决该问题的方法也是很简单,就是常规的undo处理思路(使用人工undo管理,event屏蔽事务,隐含参数屏蔽回滚段),然后重建undo表空间,这个时候可以结合txchecker来检测是否有异常事务:如果有重要基表对象异常,需要重建库;如果是个别其他对象异常,可以通过重建该对象解决

创建包含null值index

在不少时候我们需要对一个值是否是null的查询,根据Oracle的特点,我们单纯在在这个列上创建一个index不能满足这个需求,因为b-tree index中就是不包含null列.通过创建含常数列的复合index可以满足该需求
数据库版本

SQL> select * from v$version;
BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - Prod
PL/SQL Release 10.2.0.4.0 - Production
CORE    10.2.0.4.0      Production
TNS for Linux: Version 10.2.0.4.0 - Production
NLSRTL Version 10.2.0.4.0 - Production

创建测试表

SQL> create table t_xifenfei
  2  as
  3  select * from dba_objects;
Table created.
SQL> desc t_xifenfei;
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 OWNER                                              VARCHAR2(30)
 OBJECT_NAME                                        VARCHAR2(128)
 SUBOBJECT_NAME                                     VARCHAR2(30)
 OBJECT_ID                                          NUMBER
 DATA_OBJECT_ID                                     NUMBER
 OBJECT_TYPE                                        VARCHAR2(19)
 CREATED                                            DATE
 LAST_DDL_TIME                                      DATE
 TIMESTAMP                                          VARCHAR2(19)
 STATUS                                             VARCHAR2(7)
 TEMPORARY                                          VARCHAR2(1)
 GENERATED                                          VARCHAR2(1)
 SECONDARY                                          VARCHAR2(1)

创建可能含null列index

SQL> create index ind_object_id on t_xifenfei(object_id);
Index created.
SQL> exec dbms_stats.gather_table_stats(user,'T_XIFENFEI',cascade => true);
PL/SQL procedure successfully completed.

查看执行计划

SQL> SET AUTOT TRACE EXP stat
SQL> SELECT * FROM T_XIFENFEI WHERE OBJECT_ID IS NULL;
no rows selected
Execution Plan
----------------------------------------------------------
Plan hash value: 548923532
--------------------------------------------------------------------------------
| Id  | Operation         | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |            |     1 |    95 |   159   (2)| 00:00:02 |
|*  1 |  TABLE ACCESS FULL| T_XIFENFEI |     1 |    95 |   159   (2)| 00:00:02 |
--------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("OBJECT_ID" IS NULL)
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        695  consistent gets
          0  physical reads
          0  redo size
        995  bytes sent via SQL*Net to client
        389  bytes received via SQL*Net from client
          1  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          0  rows processed

可以看到我们创建的一个普通的index,查询判断是否为null的时候,没有被应用该index,而是直接使用全表扫描.

创建支持null index

SQL> drop index ind_object_id ;
Index dropped.
SQL> create index ind_object_id on t_xifenfei(object_id,0);
Index created.
SQL>  exec dbms_stats.gather_table_stats(user,'T_XIFENFEI',cascade => true);
PL/SQL procedure successfully completed.

查看执行计划

SQL> SELECT * FROM T_XIFENFEI WHERE OBJECT_ID IS NULL;
Execution Plan
----------------------------------------------------------
Plan hash value: 804765899
---------------------------------------------------------------------------------------------
| Id  | Operation                   | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |               |     1 |    95 |     1   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T_XIFENFEI    |     1 |    95 |     1   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | IND_OBJECT_ID |     1 |       |     1   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("OBJECT_ID" IS NULL)
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          2  consistent gets
          0  physical reads
          0  redo size
        995  bytes sent via SQL*Net to client
        389  bytes received via SQL*Net from client
          1  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          0  rows processed

这里使用了index,并且执行中逻辑读大幅度下降,很大程度提高了程序的执行效率,逻辑读从695降低为2.

原因分析
建立一个包含列和常数的复合index,可以实现该需求,根据b-tree index的特点,只有当index中包含的列都为null的时候,才不会别在index中记录,因为设置了index中包含的常数列,所以就是列为null,也会被包含在该index中,从而查询null值的时候依然可以使用到该index

v$sgainfo中Free SGA Memory Available的各种情况解释

今天网友问到了v$sgainfo中的Free SGA Memory Available的一些情况,开始我也比较迷糊,为什么会出现Free SGA Memory Available的值不为0,通过查询一些试验和测试,对这个问题进行了有力的说明
数据库版本10G

SQL> select * from v$version;
BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - Prod
PL/SQL Release 10.2.0.4.0 - Production
CORE    10.2.0.4.0      Production
TNS for Linux: Version 10.2.0.4.0 - Production
NLSRTL Version 10.2.0.4.0 - Production

当前SGA各组件信息

SQL> select name,bytes/1024/1024 from v$sgainfo;
NAME                             BYTES/1024/1024
-------------------------------- ---------------
Fixed SGA Size                        1.20853043
Redo Buffers                           6.7890625
Buffer Cache Size                            192
Shared Pool Size                              88
Large Pool Size                                4
Java Pool Size                                 4
Streams Pool Size                              8
Granule Size                                   4
Maximum SGA Size                             304
Startup overhead in Shared Pool               36
Free SGA Memory Available                      0
11 rows selected.

sga配置

SQL> show parameter sga;
NAME                                 TYPE        VALUE
------------------------------------ ----------- ---------------
lock_sga                             boolean     FALSE
pre_page_sga                         boolean     FALSE
sga_max_size                         big integer 304M
sga_target                           big integer 304M

在10g中,当采用asmm管理时,如果sga_max_size=sga_target,则Free SGA Memory Available为0

修改sga_target

SQL> alter system set sga_target=290M;
System altered.

再次查询v$sgainfo

SQL> select name,bytes/1024/1024 from v$sgainfo;
NAME                             BYTES/1024/1024
-------------------------------- ---------------
Fixed SGA Size                        1.20853043
Redo Buffers                           6.7890625
Buffer Cache Size                            180
Shared Pool Size                              88
Large Pool Size                                4
Java Pool Size                                 4
Streams Pool Size                              8
Granule Size                                   4
Maximum SGA Size                             304
Startup overhead in Shared Pool               36
Free SGA Memory Available                     12
11 rows selected.

再次查看sga_target值

SQL> show parameter sga;
NAME                                 TYPE        VALUE
------------------------------------ ----------- --------------
lock_sga                             boolean     FALSE
pre_page_sga                         boolean     FALSE
sga_max_size                         big integer 304M
sga_target                           big integer 292M

在10g中,当采用asmm管理时,如果sga_max_size>sga_target,则Free SGA Memory Available为sga_max_size-sga_target

找出sga_target修改为292M而不是290M原因

SQL> select component, granule_size from v$sga_dynamic_components;
COMPONENT                                                        GRANULE_SIZE
---------------------------------------------------------------- ------------
shared pool                                                           4194304
large pool                                                            4194304
java pool                                                             4194304
streams pool                                                          4194304
DEFAULT buffer cache                                                  4194304
KEEP buffer cache                                                     4194304
RECYCLE buffer cache                                                  4194304
DEFAULT 2K buffer cache                                               4194304
DEFAULT 4K buffer cache                                               4194304
DEFAULT 8K buffer cache                                               4194304
DEFAULT 16K buffer cache                                              4194304
DEFAULT 32K buffer cache                                              4194304
ASM Buffer Cache                                                      4194304
13 rows selected.
SQL> select 4194304/1024/1024 from dual;
4194304/1024/1024
-----------------
                4
SQL> select 292/4 from dual;
     292/4
----------
        73

因为sga的内存分配是按照GRANULE为单位进行的,而在该库中sga对应组件的GRANULE为4M,所以我们修改的290M的最近的GRANULE整数倍为292M

sga_target为0的情况

SQL> show parameter sga_target;
NAME                                 TYPE        VALUE
------------------------------------ ----------- ----------
sga_target                           big integer 0
SQL> select name,bytes/1024/1024 from v$sgainfo;
NAME                             BYTES/1024/1024
-------------------------------- ---------------
Fixed SGA Size                        1.20846176
Redo Buffers                           6.7890625
Buffer Cache Size                            180
Shared Pool Size                              88
Large Pool Size                                4
Java Pool Size                                 4
Streams Pool Size                              8
Granule Size                                   4
Maximum SGA Size                             292
Startup overhead in Shared Pool               36
Free SGA Memory Available                      0
11 rows selected.

在10g中,如果不采用asmm内存管理模式,Free SGA Memory Available为0

11g数据库版本

SQL> select * from v$version;
BANNER
-----------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - Production
PL/SQL Release 11.2.0.3.0 - Production
CORE    11.2.0.3.0      Production
TNS for Linux: Version 11.2.0.3.0 - Production
NLSRTL Version 11.2.0.3.0 - Production

查询v$sgainfo信息

SQL> select name,bytes/1024/1024 from v$sgainfo;
NAME                             BYTES/1024/1024
-------------------------------- ---------------
Fixed SGA Size                        1.28236008
Redo Buffers                          6.03515625
Buffer Cache Size                             20
Shared Pool Size                             116
Large Pool Size                                4
Java Pool Size                                 4
Streams Pool Size                              8
Shared IO Pool Size                            0
Granule Size                                   4
Maximum SGA Size                      299.320313
Startup overhead in Shared Pool       52.2684898
Free SGA Memory Available                    140

oracle内存相关参数设置

SQL> show parameter memory;
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------
hi_shared_memory_address             integer     0
memory_max_target                    big integer 300M
memory_target                        big integer 300M
shared_memory_address                integer     0
SQL> show parameter pga;
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------
pga_aggregate_target                 big integer 0

数据库动态组件内存分配

SQL> select COMPONENT,CURRENT_SIZE/1024/1024 CURRENT_SIZE from V$MEMORY_DYNAMIC_COMPONENTS;
COMPONENT                                                        CURRENT_SIZE
---------------------------------------------------------------- ------------
shared pool                                                               116
large pool                                                                  4
java pool                                                                   4
streams pool                                                                8
SGA Target                                                                160
DEFAULT buffer cache                                                       20
KEEP buffer cache                                                           0
RECYCLE buffer cache                                                        0
DEFAULT 2K buffer cache                                                     0
DEFAULT 4K buffer cache                                                     0
DEFAULT 8K buffer cache                                                     0
DEFAULT 16K buffer cache                                                    0
DEFAULT 32K buffer cache                                                    0
Shared IO Pool                                                              0
PGA Target                                                                140
ASM Buffer Cache                                                            0
16 rows selected.

初步怀疑:在11g的amm内存管理模式下Free SGA Memory Available和PGA Target相等

尝试修改pga_aggregate_target值

SQL> alter system set pga_aggregate_target=150M;
System altered.

再次查询v$sgainfo

SQL>  select name,bytes/1024/1024 from v$sgainfo;
NAME                             BYTES/1024/1024
-------------------------------- ---------------
Fixed SGA Size                        1.28236008
Redo Buffers                          6.03515625
Buffer Cache Size                              8
Shared Pool Size                             116
Large Pool Size                                4
Java Pool Size                                 4
Streams Pool Size                              8
Shared IO Pool Size                            0
Granule Size                                   4
Maximum SGA Size                      299.320313
Startup overhead in Shared Pool       52.2684898
Free SGA Memory Available                    152

查询动态组件内存分布

SQL> select COMPONENT,CURRENT_SIZE/1024/1024 CURRENT_SIZE from V$MEMORY_DYNAMIC_COMPONENTS;
COMPONENT                                                        CURRENT_SIZE
---------------------------------------------------------------- ------------
shared pool                                                               116
large pool                                                                  4
java pool                                                                   4
streams pool                                                                8
SGA Target                                                                148
DEFAULT buffer cache                                                        8
KEEP buffer cache                                                           0
RECYCLE buffer cache                                                        0
DEFAULT 2K buffer cache                                                     0
DEFAULT 4K buffer cache                                                     0
DEFAULT 8K buffer cache                                                     0
DEFAULT 16K buffer cache                                                    0
DEFAULT 32K buffer cache                                                    0
Shared IO Pool                                                              0
PGA Target                                                                152
ASM Buffer Cache                                                            0
16 rows selected.

进一步证明:在11g的amm内存管理模式下Free SGA Memory Available和PGA Target相等

设置pga_aggregate_target为150M,PGA Target为152M原因分析

SQL> select COMPONENT,GRANULE_SIZE from V$MEMORY_DYNAMIC_COMPONENTS;
COMPONENT                                                        GRANULE_SIZE
---------------------------------------------------------------- ------------
shared pool                                                           4194304
large pool                                                            4194304
java pool                                                             4194304
streams pool                                                          4194304
SGA Target                                                            4194304
DEFAULT buffer cache                                                  4194304
KEEP buffer cache                                                     4194304
RECYCLE buffer cache                                                  4194304
DEFAULT 2K buffer cache                                               4194304
DEFAULT 4K buffer cache                                               4194304
DEFAULT 8K buffer cache                                               4194304
DEFAULT 16K buffer cache                                              4194304
DEFAULT 32K buffer cache                                              4194304
Shared IO Pool                                                        4194304
PGA Target                                                            4194304
ASM Buffer Cache                                                      4194304
16 rows selected.

原理同上述10g中的sga_target分析,不再重复,主要就是:150M不能被4M整除,所以取最近的152M

整体总结
1.如果不采用asmm和amm,Free SGA Memory Available为0
2.如果采用asmm,当sga_max_size=sga_target,则Free SGA Memory Available为0
3.如果采用asmm,当sga_max_size>sga_target,则Free SGA Memory Available为sga_max_size-sga_target
4.如果采用amm,Free SGA Memory Available和PGA Target相等

WARNING: Subscription for node down event still pending

监听日志中出现很多”WARNING: Subscription for node down event still pending”警告

--监听日志中内容
……
26-NOV-2012 09:35:48 * ping * 0
WARNING: Subscription for node down event still pending
……
26-NOV-2012 09:45:49 * ping * 0
WARNING: Subscription for node down event still pending
……
--监听日志大小
$ ls -l /oracle/product/10g/network/log/listener.log
-rw-r-----    1 oracle   dba       229526148 Nov 26 14:20 /oracle/product/10g/network/log/listener.log
--总计条数
$ grep "Subscription for node down event still pending" \
> /oracle/product/10g/network/log/listener.log|wc -l
   77306

数据库版本和平台

SQL> select * from v$version;
BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bi
PL/SQL Release 10.2.0.5.0 - Production
CORE    10.2.0.5.0      Production
TNS for IBM/AIX RISC System/6000: Version 10.2.0.5.0 - Productio
NLSRTL Version 10.2.0.5.0 - Production

监听日志配置

$ more /oracle/product/10g/network/admin/listener.ora
SID_LIST_LISTENER =
  (SID_LIST =
    (SID_DESC =
      (SID_NAME = PLSExtProc)
      (ORACLE_HOME = /oracle/product/10g)
      (PROGRAM = extproc)
    )
  )
LISTENER =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
      (ADDRESS = (PROTOCOL = TCP)(HOST = xifenfei)(PORT = 1521))
      (ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC0))
    )
  )

解决方法

--监听文件中增加
SUBSCRIBE_FOR_NODE_DOWN_EVENT_<listener_name>=OFF
--reload监听
lsnrctl reload

补充说明

SUBSCRIBE_FOR_NODE_DOWN_EVENT_<listener_name>=OFF will prevent the messages
from being written to the log file and may also prevent the TNS Listener from hanging periodically.
Setting SUBSCRIBE_FOR_NODE_DOWN_<listener_name> to OFF disables
a necessary RAC functionality(Oracle Notification Service[ONS],fast application notification[FAN]).
The above workaround is recommended only for non-RAC environments.
The issue may be present in all 10g and newer installations.

具体参考:372959.1和340091.1

新删除data guard归档日志shell脚本

以前写过删除dataguard归档日志的方法(删除data guard归档日志),但是以前的方法确实不够灵活也不够简便,现在提供最新的一次在客户现场部署的dg删除归档日志的shell脚本

#!/bin/sh
source ~/.bash_profile
grep "Media Recovery Log" $ORACLE_BASE/admin/$ORACLE_SID/bdump/alert_${ORACLE_SID}.log|
\ awk '{print $4}'|sed -e 's/^/rm /' >/tmp/rmarchlog.sh
chmod +x /tmp/rmarchlog.sh
/tmp/rmarchlog.sh
cd $ORACLE_BASE/admin/$ORACLE_SID/bdump
cat alert_${ORACLE_SID}.log >>alert_${ORACLE_SID}.log.bak
echo ''>alert_${ORACLE_SID}.log
rm -f /tmp/rmarchlog.sh
$ORACLE_HOME/bin/rman target / <<XIFENFEI
crosscheck archivelog all;
delete expired archivelog all;
YES
exit;
XIFENFEI

根据alert日志中dg应用日志的信息”Media Recovery Log”信息来删除掉相关的归档日志,可以保证应用过的归档日志都被删除,而没有应用的归档日志都保留.

shell监控dataguard备库是否正常应用日志

一直在思索怎么去监控dg比较方便,又能够做到比较适用.想到了几种方法:
1.使用主备库两边的alert日志,但是这样的方法需要配置ssh,用来一个节点获取另外一个节点的alert日志
2.通过查询v$archived_log或者其他相关视图,然后主备库进行比较,但是这个需要访问另外一个库,需要另外库的登录信息
3.通过查询备库的v$archived_log视图,粗略评估dg是否工作正常.
这里我选择了3,dg的监控大部分时候是为了让人及时的发现日志应用异常,然后人工干预处理,从而减少修改gap或者重建dg的概率.而这个额监控可以在很大程度上发现dg应用归档日志异常,从而确定dg是否工作正常,如果发现工作异常,及时处理,可以减少很多工作量,甚至拯救你的数据.

#!/bin/bash
source ~/.bash_profile
#check time(M)
export CHECK_M=120
export RESULT_FILE=/tmp/dg_switch_check.log
$ORACLE_HOME/bin/sqlplus -silent "/ as sysdba" <<XFF>/tmp/check_dg.log
set pagesize 0 feedback off verify off heading off echo off
 select ceil((sysdate-next_time)*24*60) "M"
 from v\$archived_log
 where applied='YES' AND SEQUENCE#=(SELECT MAX(SEQUENCE#)
 FROM V\$ARCHIVED_LOG WHERE applied='YES');
exit;
XFF
GET_M=`cat /tmp/check_dg.log`
rm /tmp/check_dg.log
if [ ${CHECK_M} -lt ${GET_M} ];
then
    echo "check dataguard time:`date`">$RESULT_FILE
    echo "The last time application archivelog happened in $GET_M minutes ago">>$RESULT_FILE
else
 echo ''>$RESULT_FILE
fi

针对这样的脚本,根据你的dg归档切换的频率,设置监控dg的最近一次日志应用与当前时间差,然后判断dg是否工作正常.根据监控程序的特点,可以通过判断结果集文件,然后邮件/短信或者其他方式处理.

因v$archived_log视图记录异常导致dg MRP进程异常

版本信息

操作系统Linux 4.8 x86
数据库版本ORACLE 9.2.0.4

alert日志报错
MRP进程出现异常报ORA-00310/ORA-00334错误

…………
Media Recovery Log /u01/oracle/oradata/lunar/arch/1_75.dbf
Media Recovery Log /u01/oracle/oradata/lunar/arch/1_76.dbf
Media Recovery Log /u01/oracle/oradata/lunar/arch/1_77.dbf
Media Recovery Log /u01/oracle/oradata/lunar/arch/1_78.dbf
Media Recovery Log /u01/oracle/oradata/xifenfei/redo02.log
MRP0: Background Media Recovery terminated with error 310
Thu Nov  8 07:44:39 2012
Errors in file /u01/oracle/admin/lunar/bdump/lunar_mrp0_25625.trc:
ORA-00310: archived log contains sequence 85; sequence 79 required
ORA-00334: archived log: '/u01/oracle/oradata/xifenfei/redo02.log'
Recovery interrupted.
MRP0: Background Media Recovery process shutdown

trace文件

*** SESSION ID:(17.13) 2012-11-08 07:24:12.986
Background Managed Standby Recovery process started
*** 2012-11-08 07:24:18.023
Managed Recovery: Active posted.
*** 2012-11-08 07:41:03.171
Media Recovery Log /u01/oracle/oradata/lunar/arch/1_64.dbf
Media Recovery Log /u01/oracle/oradata/lunar/arch/1_65.dbf
Media Recovery Log /u01/oracle/oradata/lunar/arch/1_66.dbf
Media Recovery Log /u01/oracle/oradata/lunar/arch/1_67.dbf
Media Recovery Log /u01/oracle/oradata/lunar/arch/1_68.dbf
Media Recovery Log /u01/oracle/oradata/lunar/arch/1_69.dbf
Media Recovery Log /u01/oracle/oradata/lunar/arch/1_70.dbf
Media Recovery Log /u01/oracle/oradata/lunar/arch/1_71.dbf
Media Recovery Log /u01/oracle/oradata/lunar/arch/1_72.dbf
Media Recovery Log /u01/oracle/oradata/lunar/arch/1_73.dbf
Media Recovery Log /u01/oracle/oradata/lunar/arch/1_74.dbf
Media Recovery Log /u01/oracle/oradata/lunar/arch/1_75.dbf
Media Recovery Log /u01/oracle/oradata/lunar/arch/1_76.dbf
*** 2012-11-08 07:41:39.083
Media Recovery Log /u01/oracle/oradata/lunar/arch/1_77.dbf
*** 2012-11-08 07:44:39.171
Media Recovery Log /u01/oracle/oradata/lunar/arch/1_78.dbf
Media Recovery Log /u01/oracle/oradata/xifenfei/redo02.log
Background Media Recovery terminated with error 310
ORA-00310: archived log contains sequence 85; sequence 79 required
ORA-00334: archived log: '/u01/oracle/oradata/xifenfei/redo02.log'
----- Redo read statistics for thread 1 -----
Read rate (ASYNC): 21990Kb in 1221.38s => 0.02 Mb/sec
Longest record: 1Kb, moves: 0/92129 (0%)
Change moves: 34869/213735 (16%), moved: 2Mb
----------------------------------------------
*** 2012-11-08 07:44:39.404
Managed Recovery: Not Active posted.
Background Media Recovery process shutdown
*** 2012-11-08 07:44:39.406

猜想数据库恢复需要sequence为79的归档日志,但是该归档日志对应的为文件为redo02.log,而该redo02的seq为85所以使得MRP进程异常

使用rman尝试恢复

RMAN> recover database ;
Starting recover at 08-NOV-12
using target database controlfile instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=8 devtype=DISK
starting media recovery
unable to find archive log
archive log thread=1 sequence=79
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of recover command at 11/08/2012 08:12:48
RMAN-06054: media recovery requesting unknown log: thread 1 scn 12286829427051

测试证明rman也无法正常的恢复该异常问题

验证猜想

SQL> select name from v$archived_log where SEQUENCE#=79;
NAME
-------------------------------------------------------------------
/u01/oracle/oradata/lunar/arch/1_79.dbf
/u01/oracle/oradata/xifenfei/redo02.log
/u01/oracle/oradata/lunar/arch/1_79.dbf
SQL> select dest_id,name from v$archived_log where SEQUENCE#=79;
   DEST_ID  NAME
----------  ---------------------------------------------------
         2  /u01/oracle/oradata/lunar/arch/1_79.dbf
         1  /u01/oracle/oradata/xifenfei/redo02.log
         1  /u01/oracle/oradata/lunar/arch/1_79.dbf
SQL> select sequence#,group# from v$log;
 SEQUENCE#     GROUP#
---------- ----------
        86          1
        85          2
        87          3
SQL> select member from v$logfile where group#=2;
MEMBER
---------------------------------------------------------------
/u01/oracle/oradata/xifenfei/redo02.log

通过查询上面相关视图,证实了猜想是因为redo log被注册进入了v$archived_log导致该故障,解决该问题的思路是把redo log file从备库控制文件的v$archived_log视图中拿掉.具体方法是:
1.如果主库正常,那直接生成standby controlfile来实现
2.如果主库也是相同情况,那么先重建主库控制文件,然后重建standby controlfile来实现(该方法需要维护窗口)

解决问题思路

--查询主库,确定主库正常
SQL> select name from v$archived_log where SEQUENCE#=79;
NAME
------------------------------------------------------------
/u01/oracle/oradata/xifenfei/archive/1_79.dbf
lunar
--重新创建standby controlfile
###############################################################
注:如果主库和备库的数据文件路径不完全一致,
建议通过设置db_file_name_convert来直接实现备库数据文件路径的转换,
而不建议通过alter database rename file来实现重命名
###############################################################
SQL> ALTER DATABASE CREATE STANDBY CONTROLFILE AS
   2 '/u01/oracle/oradata/lunar/control01.ctl' reuse;
Database altered.
--重新启动备库
SQL> startup nomount;
ORACLE instance started.
Total System Global Area  353441008 bytes
Fixed Size                   451824 bytes
Variable Size             184549376 bytes
Database Buffers          167772160 bytes
Redo Buffers                 667648 bytes
SQL> alter database mount standby database;
Database altered.
--开启日志应用
SQL> alter database recover managed standby database disconnect from session;
Database altered.
--alert日志
Thu Nov  8 08:28:16 2012
Completed: alter database recover managed standby database di
Thu Nov  8 08:28:20 2012
Restarting dead background process QMN0
QMN0 started with pid=13
Thu Nov  8 08:29:45 2012
Fetching gap sequence for thread 1, gap sequence 79-87
Trying FAL server: xifenfei
Media Recovery Log /u01/oracle/oradata/lunar/arch/1_79.dbf
Media Recovery Log /u01/oracle/oradata/lunar/arch/1_80.dbf
Media Recovery Log /u01/oracle/oradata/lunar/arch/1_81.dbf
Media Recovery Log /u01/oracle/oradata/lunar/arch/1_82.dbf
Media Recovery Log /u01/oracle/oradata/lunar/arch/1_83.dbf
Media Recovery Log /u01/oracle/oradata/lunar/arch/1_84.dbf
Media Recovery Log /u01/oracle/oradata/lunar/arch/1_85.dbf
Media Recovery Log /u01/oracle/oradata/lunar/arch/1_86.dbf
Media Recovery Log /u01/oracle/oradata/lunar/arch/1_87.dbf
Media Recovery Log /u01/oracle/oradata/lunar/arch/1_88.dbf
Media Recovery Waiting for thread 1 seq# 89

使用dbms_pumpdata执行expdp操作

使用dbms_pumpdata执行expdp操作

set serverout on
declare
  h1 number; -- Datapump handle
  dir_name varchar2(30); -- Directory Name
  job_status VARCHAR2(30);
begin
  dir_name := 'DATA_PUMP_DIR';
  h1 := dbms_datapump.open(
  operation =>'EXPORT', --是export还是impport
--导出表配置
  job_mode =>'TABLE',  --job_mode可以为SCHEMA/TABLE等
--导出用户配置
  job_mode =>'SCHEMA',
  remote_link => NULL, --是否使用dblink导出(就是NETWORK_LINK)
  job_name =>'TABLE_XFF' --job_name expdpjob的名称
  );
  dbms_datapump.add_file(handle =>h1,
                         filename => 'XIFENFEI.DMP',
                         directory => dir_name,
                         filetype => DBMS_DATAPUMP.KU$_FILE_TYPE_DUMP_FILE,
                         reusefile => 1); -- value of 1 instructs to overwrite existing file
  dbms_datapump.add_file(handle =>h1,
                         filename => 'XIFENFEI.LOG',
                         directory => dir_name,
                         filetype => DBMS_DATAPUMP.KU$_FILE_TYPE_LOG_FILE,
                         reusefile => 1);
--导出表配置
  dbms_datapump.metadata_filter(handle =>h1,
                         name => 'TABLE_FILTER',
                         value => 'CHF.T_XIFENFEI');
--导出用户配置
  dbms_datapump.metadata_filter (handle => dp_handle,
                                 name => 'SCHEMA_EXPR',
                                 value => 'IN (''CHF'')');
-- Start the job.
  dbms_datapump.start_job(h1);
  dbms_datapump.wait_for_job (handle => dp_handle,
                              job_state => job_status);
  dbms_output.put_line ('DataPump Export - '||to_char(sysdate,'DD/MM/YYYY HH24:MI:SS')||' Status '||job_status);
  begin
     dbms_datapump.detach(handle => h1);
   end;
end;
/

awrload导入awr数据出现两种常见错误说明

ORA-06502/ORA-06512错误

SQL> @?/rdbms/admin/awrload.sql
~~~~~~~~~~
AWR LOAD
~~~~~~~~~~
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~  This script will load the AWR data from a dump file. The   ~
~  script will prompt users for the following information:    ~
~     (1) name of directory object                            ~
~     (2) name of dump file                                   ~
~     (3) staging schema name to load AWR data into           ~
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Specify the Directory Name
~~~~~~~~~~~~~~~~~~~~~~~~~~
Directory Name                 Directory Path
------------------------------ -------------------------------------------------
DATA_PUMP_DIR                  /u01/oracle/admin/ora11g/dpdump/
ORACLE_OCM_CONFIG_DIR          /u01/oracle/oracle/product/11.2.0/db_1/ccr/state
XMLDIR                         /u01/oracle/oracle/product/11.2.0/db_1/rdbms/xml
Choose a Directory Name from the list above (case-sensitive).
Enter value for directory_name: DATA_PUMP_DIR
Using the dump directory: DATA_PUMP_DIR
Specify the Name of the Dump File to Load
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Please specify the prefix of the dump file (.dmp) to load:
Enter value for file_name: awrdat_70441_70885-vpos-primary
Loading from the file name: awrdat_70441_70885-vpos-primary.dmp
begin
*
ERROR at line 1:
ORA-06502: PL/SQL: numeric or value error: character string buffer too small
ORA-06512: at line 2

因为导出来的awr数据库的编码是AL32UTF8,而现在的编码是ZHS16GBK,所以解决方法是设置NLS_LANG为合适编码

[oracle@xifenfei ~]$ env|grep NLS_LANG
NLS_LANG=AMERICAN_AMERICA.ZHS16GBK
[oracle@xifenfei ~]$ export NLS_LANG=AMERICAN_AMERICA.AL32UTF8

ORA-31640

SQL> @?/rdbms/admin/awrload.sql
~~~~~~~~~~
AWR LOAD
~~~~~~~~~~
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~  This script will load the AWR data from a dump file. The   ~
~  script will prompt users for the following information:    ~
~     (1) name of directory object                            ~
~     (2) name of dump file                                   ~
~     (3) staging schema name to load AWR data into           ~
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Specify the Directory Name
~~~~~~~~~~~~~~~~~~~~~~~~~~
Directory Name                 Directory Path
------------------------------ -------------------------------------------------
DATA_PUMP_DIR                  /u01/oracle/admin/ora11g/dpdump/
ORACLE_OCM_CONFIG_DIR          /u01/oracle/oracle/product/11.2.0/db_1/ccr/state
XMLDIR                         /u01/oracle/oracle/product/11.2.0/db_1/rdbms/xml
Choose a Directory Name from the list above (case-sensitive).
Enter value for directory_name: DATA_PUMP_DIR
Using the dump directory: DATA_PUMP_DIR
Specify the Name of the Dump File to Load
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Please specify the prefix of the dump file (.dmp) to load:
Enter value for file_name: awrdat_70441_70885-vpos-primary  <--注意输入
Loading from the file name: awrdat_70441_70885-vpos-primary.dmp
…………
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|  Loading the AWR data from the following
|  directory/file:
|   /u01/oracle/admin/ora11g/dpdump/
|   awrdat_70441_70885-vpos-primar.dmp    <--提示的文件名
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
|  *** AWR Load Started ...
|
|  This operation will take a few moments. The
|  progress of the AWR load operation can be
|  monitored in the following directory/file:
|   /u01/oracle/admin/ora11g/dpdump/
|   awrdat_70441_70885-vpos-primar.log
|
DBMS_DATAPUMP.ADD_FILE(dump file)
ORA-39001: invalid argument value
Exception encountered in AWR_LOAD
begin
*
ERROR at line 1:
ORA-20115: datapump import encountered error:
ORA-39001: invalid argument value
ORA-39000: bad dump file specification
ORA-31640: unable to open dump file
"/u01/oracle/admin/ora11g/dpdump/awrdat_70441_70885-vpos-primar.dmp" for read
ORA-27037: unable to obtain file status
Linux Error: 2: No such file or directory
Additional information: 3
ORA-06512: at "SYS.DBMS_SWRF_INTERNAL", line 1717
ORA-06512: at line 3
begin
*
ERROR at line 1:
ORA-20106: AWR tables do not exist for the 'AWR_STAGE' user
ORA-06512: at "SYS.DBMS_SWRF_INTERNAL", line 2920
ORA-00942: table or view does not exist
ORA-06512: at line 3
... Dropping AWR_STAGE user
End of AWR Load

这里我们可以看到,dmp文件名为31个字符,而在提示文件名的时候是30个字符,从而出现了dmp文件不存在而导致的相关错误,解决方案重命名dmp文件,名称不超过30个字符