ORA-00600[qmxtriCheckAndRewriteQb0]

数据库报ORA-00600[qmxtriCheckAndRewriteQb0]

Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP, Data Mining
and Real Application Testing options
ORACLE_HOME = /u01/oracle/product/10.2.0
System name:	AIX
Node name:	abc
Release:	3
Version:	5
Machine:	00C58A644C00
Instance name: XFF2
Redo thread mounted by this instance: 2
Oracle process number: 434
Unix process pid: 492340, image: oracle@abc
*** ACTION NAME:() 2012-11-12 08:46:47.132
*** MODULE NAME:() 2012-11-12 08:46:47.132
*** SERVICE NAME:(ORCL) 2012-11-12 08:46:47.132
*** CLIENT ID:() 2012-11-12 08:46:47.132
*** SESSION ID:(870.58602) 2012-11-12 08:46:47.132
*** 2012-11-12 08:46:47.132
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [qmxtriCheckAndRewriteQb0], [], [], [], [], [], [], []
Current SQL statement for this session:
SELECT EXTRACTVALUE(配置,'//SYSTEM[@XTH="'||:B1 ||'"]/FILE') ,
WHERE EXTRACTVALUE(配置,'//SYSTEM[@XTH="'||:B1 ||'"]/BM')=:B2  AND ROWNUM<2
----- PL/SQL Call Stack -----
  object      line  object
  handle    number  name
70000021d535f70        25  procedure ZLTOOLS.ZL_MBRUNLOG_INSERT
7000002b6819368         1  anonymous block
----- Call Stack Trace -----
calling              call     entry                argument values in hex
location             type     point                (? means dubious value)
-------------------- -------- -------------------- ----------------------------
ksedst+001c          bl       ksedst1              000000000 ? 000000000 ?
ksedmp+0290          bl       ksedst               104A2C690 ?
ksfdmp+0018          bl       03F26C3C
kgerinv+00dc         bl       _ptrgl
kgeasnmierr+004c     bl       kgerinv              7000002F735A838 ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   0FFFFBFFF ?
IPRA.$qmxtriCheckAn  bl       03F25970
dRewriteQb_rec+0194
IPRA.$qmxtriCheckAn  bl       IPRA.$qmxtriCheckAn  1000881EC ? 000000000 ?
dRewriteQb_rec+006c           dRewriteQb_rec       000000000 ?
IPRA.$qmxtriCheckAn  bl       IPRA.$qmxtriCheckAn  FFFFFFFFFFF07E0 ? 000000033 ?
dRewriteQb_rec+006c           dRewriteQb_rec       1056037F8 ?
qmxtriCheckAndRewri  bcl      dmqlKMlod+00c0       000000000 ? 110421CB0 ?
teQb+0094                                          FFFFFFFFFFE87C0 ?
qmxtrxq+0210         bl       03F252EC
qmxtrxop+00a4        bl       qmxtrxq              FFFFFFFFFFF25B8 ?
                                                   700000282F66DD0 ? 110195E98 ?
koksspend+02b0       bl       qmxtrxop             100346AB4 ?
kkmdrvend+01a8       bl       koksspend            000000001 ? 104B3A8A8 ?
                                                   000000000 ?
kkmdrv+004c          bl       kkmdrvend            FFFFFFFFFFE8BE0 ?
                                                   883843401048F2F8 ?
opiSem+13c0          bl       kkmdrv               000000000 ? 000000000 ?
                                                   000000000 ? 11022AC50 ?
opiDeferredSem+0234  bl       opiSem               FFFFFFFFFFE9CE0 ?
                                                   7000001E327CCE0 ? 000000111 ?
                                                   100000001 ?
opitca+01e8          bl       opiDeferredSem
kksFullTypeCheck+00  bl       03F25230
1c
rpiswu2+034c         bl       _ptrgl
kksSetBindType+0d28  bl       rpiswu2              70000030850C178 ?
                                                   3300000033 ?
                                                   FFFFFFFFFFF0570 ?
                                                   FFFFFFFFFFF0578 ?
                                                   7000002F6F0C700 ?
                                                   33104027D8 ?
                                                   FFFFFFFFFFF1F48 ? 000000000 ?
kksfbc+1054          bl       kksSetBindType       70000030F58F400 ? 1107CB418 ?
                                                   70000001003B800 ?
                                                   10200003000 ? 110000FF8 ?
                                                   7000000100ECAB8 ?
                                                   FFFFFFFFFFF1480 ?
                                                   481A408400003000 ?
opiexe+098c          bl       01F960BC
opipls+185c          bl       opiexe               FFFFFFFFFFF3900 ?
                                                   FFFFFFFFFFF39E8 ?
                                                   FFFFFFFFFFF38A0 ?
opiodr+0ae0          bl       _ptrgl
rpidrus+01bc         bl       opiodr               66FFFF54B0 ? 608736A20 ?
                                                   FFFFFFFFFFF67C0 ?
                                                   1510195E98 ?
skgmstack+00c8       bl       _ptrgl
rpidru+0088          bl       skgmstack            102320840 ? 000000000 ?
                                                   000000002 ? 000000000 ?
                                                   FFFFFFFFFFF5F88 ?
rpiswu2+034c         bl       _ptrgl
rpidrv+095c          bl       rpiswu2              70000030850C178 ? 110469C28 ?
                                                   11044AA58 ? 000000000 ?
                                                   FFFFFFFFFFF5D60 ?
                                                   3300000000 ? 000000000 ?
                                                   000000000 ?
psddr0+02bc          bl       03F266D4
psdnal+01d0          bl       psddr0               1500000000 ? 6600000000 ?
                                                   FFFFFFFFFFF67C0 ?
                                                   30100BACC8 ?
pevm_EXECC+01f8      bl       _ptrgl
pfrinstr_EXECC+0070  bl       pevm_EXECC           10147B2A4 ? 000000000 ?
                                                   700000262828B72 ?
pfrrun_no_tool+005c  bl       _ptrgl
pfrrun+1014          bl       pfrrun_no_tool       FFFFFFFFFFF6B20 ?
                                                   7000002B6819368 ? 3100ECBB0 ?
plsql_run+06b4       bl       pfrrun               1107D84A8 ?
peicnt+0224          bl       plsql_run            1107D84A8 ? 10001102676F8 ?
                                                   000000000 ?
kkxexe+0250          bl       peicnt               FFFFFFFFFFF7E38 ? 1107D84A8 ?
opiexe+2ef8          bl       kkxexe               11047E1C8 ?
kpoal8+0edc          bl       opiexe               FFFFFFFFFFFB454 ?
                                                   FFFFFFFFFFFB1A8 ?
                                                   FFFFFFFFFFF9628 ?
opiodr+0ae0          bl       _ptrgl
ttcpip+1020          bl       _ptrgl
opitsk+1124          bl       01F96AC8
opiino+0990          bl       opitsk               0FFFFD490 ? 000000000 ?
opiodr+0ae0          bl       _ptrgl
opidrv+0484          bl       01F95914
sou2o+0090           bl       opidrv               3C02D99B7C ? 4A076D928 ?
                                                   FFFFFFFFFFFF390 ?
opimai_real+01bc     bl       01F93294
main+0098            bl       opimai_real          000000000 ? 000000000 ?
__start+0098         bl       main                 000000000 ? 000000000 ?
--------------------- Binary Stack Dump ---------------------

通过这个trace的部分信息可以得到:
1.操作系统版本AIX x64(5.3)
2.数据库版本10.2.0.4
3.sql语句调用EXTRACTVALUE函数
4.Call Stack Trace信息

查询MOS[ID 467350.1]发现匹配信息

Cause
Bug 6030982 ORA-600 [QMXTRICHECKANDREWRITEQB0] WITH QUERY USING EXTRACTVALUE FUNCTION
Solution
This bug is going to be fixed in furture 10.2.0.5.0 and 11g
At the mean time , user can workaround by
set
event = "19027 trace name context forever, level 1"
within init.ora or spfile file then bounce database.
or
SQL> alter session set events ='19027 trace name context forever, level 1';
SQL> Alter system flush shared_pool;
-- Execute affected query

通过mos可以确定:
1.是因为数据库执行EXTRACTVALUE函数遇到该bug
2.在11g和10.2.0.5中修复该bug
3.可以通过设置event = “19027 trace name context forever, level 1″来临时解决该问题

个人处理建议
1.如果数据库方便升级,那建议升级处理
2.如果数据库不便立马升级,建议在业务低估时设置session event 19027,然后 flush shared_pool,执行报错sql,如果问题解决,在合适时间设置system event来临时屏蔽该问题.

数据文件的CREATION_TIME来源和算法

对ORACLE比较熟悉的人都知道v$datafile.CREATION_TIME和v$datafile_header.CREATION_TIME这两个列都是表示数据文件的创建时间,而根据我们的经验可以知道几点:
1.当v$datafile.CREATION_TIME与v$datafile_header.CREATION_TIME不一致时数据库不能正常启动
2.v$datafile.CREATION_TIME的值来源于v$datafile_header.CREATION_TIME
3.而v$datafile_header.CREATION_TIME的值来源于数据文件头的块中的信息

现在就出现一个问题,数据块中的kcvfhcrt是一个16进制的数,如何实现在v$datafile和v$datafile_header中转为为了数据文件创建的日期
数据文件中存储创建数据文件日期内容

ub4 kcvfhcrt                             @108      0x2c67319c

v$datafile.CREATION_TIME值

SQL> select to_char(CREATION_TIME,'yyyy-mm-dd hh24:mi:ss') xifenfei
   2 from v$datafile where file#=1;
XIFENFEI
-------------------
2011-03-05 05:26:52

如何通过kcvfhcrt值推算出来CREATION_TIME或者通过CREATION_TIME推断出来kcvfhcrt的值规则:
熟悉数据库SCN计数原理的人都知道,我们现在使用的数据库是从1988/01/01 00:00:00开始记录SCN,也就是说我们的数据库的使用最早时间只能是从1988年元旦凌晨开始,那么也就是说数据库记录的创建时间可以采用这个时间点为起点,然后每增加一秒,数据库的kcvfhcrt就增加1,但是ORACLE为了计算简便,每个月按照31天计算
通过时间推算出来kcvfhcrt值

--数据库记录时间起点
1988/01/01 00:00:00
--当前数据文件创建日志
2011/03/05 05:26:52
--两者相差时间
23年02月04日05时26分52秒
--计算相差秒
23*12*31*24*60*60+2*31*24*60*60+4*24*60*60+5*60*60+26*60+52=744960412
--kcvfhcrt值转换
2c67319c(16进制)=744960412(10进制)

通过kcvfhcrt计算CREATION_TIME值

SQL> select to_number('2c67319c','xxxxxxxxxxx') from dual;
TO_NUMBER('2C67319C','XXXXXXXXXXX')
-----------------------------------
                          744960412
SQL> select 744960412/(12*31*24*60*60) from dual;
744960412/(12*31*24*60*60)
--------------------------
                23.1780295
SQL> select mod(744960412,(12*31*24*60*60)) from dual;
MOD(744960412,(12*31*24*60*60))
-------------------------------
                        5722012
SQL> select 5722012/(31*24*60*60) from dual;
5722012/(31*24*60*60)
---------------------
           2.13635454
SQL> select mod(5722012,(31*24*60*60)) from dual;
MOD(5722012,(31*24*60*60))
--------------------------
                    365212
SQL> select 365212/(24*60*60) from dual;
365212/(24*60*60)
-----------------
       4.22699074
SQL> select mod(365212,(24*60*60)) from dual;
MOD(365212,(24*60*60))
----------------------
                 19612
SQL> select 19612/(60*60) from dual;
19612/(60*60)
-------------
   5.44777778
SQL> select mod(19612,(60*60)) from dual;
MOD(19612,(60*60))
------------------
              1612
SQL> select 1612/60 from dual;
   1612/60
----------
26.8666667
SQL> select mod(1612,60) from dual;
MOD(1612,60)
------------
          52
从这里可以得出23年2月4天5时26分52秒,与1988年01月01日00时00分00秒相加得到
2011年03月05日 5:26:52
SQL> select to_char(CREATION_TIME,'yyyy-mm-dd hh24:mi:ss') from v$datafile where file#=1;
TO_CHAR(CREATION_TI
-------------------
2011-03-05 05:26:52

bbed打开丢失部分system数据文件库

在某种情况下,数据库system表空间可能有多个数据文件,而意外的丢失了其中某个(不能为第一个),然后通过bbed来模拟一个数据文件来open库
system增加数据文件

SQL> alter tablespace system add datafile '/u01/oracle/oradata/ora11g/system02.dbf' size 10m;
Tablespace altered.
--创建表,为了使得数据库发生类此生产环境的部分操作,使得system表空间可能发生改变
SQL> create table t_xifenfei tablespace system
  2  as
  3  select * from dba_tables;
Table created.

删除system中某个文件(system02.dbf)

[oracle@xifenfei ora11g]$ mv system02.dbf system02.dbf_bak

尝试启动数据库

SQL> shutdown abort
ORACLE instance shut down.
SQL> startup
ORACLE instance started.
Total System Global Area  313860096 bytes
Fixed Size                  1344652 bytes
Variable Size             251661172 bytes
Database Buffers           54525952 bytes
Redo Buffers                6328320 bytes
Database mounted.
ORA-01157: cannot identify/lock data file 8 - see DBWR trace file
ORA-01110: data file 8: '/u01/oracle/oradata/ora11g/system02.dbf'

错误思路offline system数据文件

SQL> alter database datafile 8 offline;
Database altered.
SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-01147: SYSTEM tablespace file 8 is offline
ORA-01110: data file 8: '/u01/oracle/oradata/ora11g/system02.dbf'
SQL> alter database datafile 8 online;
alter database datafile 8 online
*
ERROR at line 1:
ORA-01157: cannot identify/lock data file 8 - see DBWR trace file
ORA-01110: data file 8: '/u01/oracle/oradata/ora11g/system02.dbf'

使用system表空间其他数据文件来模拟丢失数据文件

[oracle@xifenfei ora11g]$ cp system01.dbf system02.dbf

通过dul获取file$相关信息

FILE#
RELFILE#
CRSCNWRP
CRSCNBAS

bbed修改下面参数值

--rdba
 ub4 rdba_kcbh                         @4        0x02000001
--绝对文件号
 ub2 kccfhfno                          @52       0x0008
--scn
 ub4 kscnbas                           @100      0xc01a3581
 ub2 kscnwrp                           @104      0x0b2c
--相对文件号
 ub4 kcvfhrfn                             @368      0x00000008
--文件大小(不修改,为了重建欺骗数据库重建控制文件)
kccfhfsz
--文件创建时间(重建控制文件来实现控制文件和数据文件头一致)
kcvfhcrt

重建控制文件
1.因为复制来自同一个表空间下面的数据文件,数据文件大小和原数据文件一样, 所以不要修改kccfhfsz大小,不然会出现

CREATE CONTROLFILE REUSE DATABASE "ORA11G" NORESETLOGS       ARCHIVELOG
*
ERROR at line 1:
ORA-01503: CREATE CONTROLFILE failed
ORA-01200: actual file size of 90880 is smaller than correct size of 10485760 blocks
ORA-01110: data file 8: '/u01/oracle/oradata/ora11g/system02.dbf'

2. 数据文件创建时间是通过kcvfhcrt参数值来控制的,而这个值是通过1988年01月01日00时00分00秒开始计时,按照每月31天计算的累计值,按照这个规则可以推断出来kcvfhcrt.因为数据库在启动的时候会验证控制文件中这个值和数据文件头的该值是否一致,所以如果你不修改kcvfhcrt,可以选择重建控制文件来完成.

再次open数据库

SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-01113: file 1 needs media recovery
ORA-01110: data file 1: '/u01/oracle/oradata/ora11g/system01.dbf'
SQL> recover database;
Media recovery complete.
SQL> alter database open;
Database altered.

操作到这里,库已经可以正常的被open,如果通过这种方面屏蔽掉的异常的system数据文件中数据字典的部分表信息时,可能数据库依然不能被正常逻辑导出(例如dba_segments,dba_extents的基表等),需要进一步特殊处理,如果不能自行解决相关问题,需要恢复支持,请联系我们
Phone:17813235971    Q Q:107644445QQ咨询惜分飞    E-Mail:dba@xifenfei.com

解决Statspack报告时Snap Id为"#####"

生成Statspack报告时候发现Snap Id为”#####”

Instances in this Statspack schema
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   DB Id    Inst Num DB Name      Instance     Host
----------- -------- ------------ ------------ ------------
  631690435        1 VODAPP       vodapp       T5440-1
Enter value for dbid: 631690435
Using 631690435 for database Id
Enter value for inst_num: 1
Using 1 for instance number
Completed Snapshots
                           Snap                    Snap
Instance     DB Name         Id   Snap Started    Level Comment
------------ ------------ ----- ----------------- ----- ----------------------
vodapp       VODAPP       ##### 16 Oct 2012 17:00     5
                          ##### 16 Oct 2012 18:00     5
                          ##### 16 Oct 2012 19:00     5
                          ##### 16 Oct 2012 20:00     5
                          ##### 16 Oct 2012 21:00     5
                          ##### 16 Oct 2012 22:00     5
                          ##### 16 Oct 2012 23:00     5
                          …………
Specify the Begin and End Snapshot Ids
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Enter value for begin_snap:

因为没有办法定位到Snap Id,所以暂时无法准确的输入对应值,当然可以通过如下sql查询相应的Snap Id

select SNAP_ID,to_char(SNAP_TIME,'yyyy-mm-dd hh24:mi:ss') SNAP_TIME
from STATS$SNAPSHOT order by SNAP_ID;

虽然可以通过这个方面来曲线解决这个问题,但是还有比较完善一点的解决方法,我们阅读spcreate.sql相关脚本,修改相关程序来实现

--通过spcreate.sql发现
spcreate.sql调用@@sprepins
--编辑sprepins.sql
column instart_fmt noprint;
column inst_name   format a12  heading 'Instance';
column db_name     format a12  heading 'DB Name';
column snap_id     format 9990 heading 'Snap|Id';
-->(修改为)column snap_id     format 999990 heading 'Snap|Id';
column snapdat     format a17  heading 'Snap Started' just c;
column lvl         format 99   heading 'Snap|Level';
column commnt      format a22  heading 'Comment';

再次生成sp报告

SQL> @?/rdbms/admin/sprepins.sql
Instances in this Statspack schema
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   DB Id    Inst Num DB Name      Instance     Host
----------- -------- ------------ ------------ ------------
  631690435        1 VODAPP       vodapp       T5440-1
Enter value for dbid: 631690435
Using 631690435 for database Id
Enter value for inst_num: 1
Using 1 for instance number
Completed Snapshots
                             Snap                    Snap
Instance     DB Name           Id   Snap Started    Level Comment
------------ ------------ ------- ----------------- ----- ----------------------
vodapp       VODAPP         58916 16 Oct 2012 17:00     5
                            58917 16 Oct 2012 18:00     5
                            58918 16 Oct 2012 19:00     5
                            58919 16 Oct 2012 20:00     5
                            58920 16 Oct 2012 21:00     5
                            58921 16 Oct 2012 22:00     5
                            58922 16 Oct 2012 23:00     5
                            58923 17 Oct 2012 00:00     5

ORA-00600[3689]引起MRP进程异常

今天检查数据库发现一数据库因为添加数据文件导致ORA-00600[3689]错误,进入引起MRP进程异常,从而使得数据库可以正常接收归档日志,但是不能在备库应用归档日志
数据库版本

SQL> select * from v$version;
BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - 64bi
PL/SQL Release 10.2.0.1.0 - Production
CORE    10.2.0.1.0      Production
TNS for Solaris: Version 10.2.0.1.0 - Production
NLSRTL Version 10.2.0.1.0 – Production

检查日志

Fri Oct 19 12:00:22 2012
RFS[1]: No standby redo logfiles created
RFS[1]: Archived Log: '/baceldba/archive/1_30374_730819916.dbf'
Fri Oct 19 12:00:28 2012
Media Recovery Log /baceldba/archive/1_30374_730819916.dbf
WARNING: File being created with same name as in Primary
Existing file may be overwritten
Fri Oct 19 12:00:50 2012
Recovery created file /data/oracle/oradata/vodapp/NETTVCLPS02.dbf
Successfully added datafile 32 to media recovery
Datafile #32: '/data/oracle/oradata/vodapp/NETTVCLPS02.dbf '
WARNING: File being created with same name as in Primary
Existing file may be overwritten
Fri Oct 19 12:01:15 2012
Recovery created file /data/oracle/oradata/vodapp/UCICMS01_DATA04.dbf
Successfully added datafile 33 to media recovery
Datafile #33: '/data/oracle/oradata/vodapp/UCICMS01_DATA04.dbf'
Fri Oct 19 12:01:16 2012
Errors in file /data/oracle/admin/vodapp/bdump/vodapp_mrp0_21304.trc:
ORA-00600: internal error code, arguments: [3689], [33], [], [], [], [], [], []
Errors with log /baceldba/archive/1_30374_730819916.dbf
MRP0: Background Media Recovery terminated with error 600  <--mrp进程异常
Fri Oct 19 12:01:16 2012
Errors in file /data/oracle/admin/vodapp/bdump/vodapp_mrp0_21304.trc:
ORA-00600: internal error code, arguments: [3689], [33], [], [], [], [], [], []
Some recovered datafiles maybe left media fuzzy
Media recovery may continue but open resetlogs may fail
Fri Oct 19 12:01:18 2012
Errors in file /data/oracle/admin/vodapp/bdump/vodapp_mrp0_21304.trc:
ORA-00600: internal error code, arguments: [3689], [33], [], [], [], [], [], []
Fri Oct 19 12:01:18 2012
Errors in file /data/oracle/admin/vodapp/bdump/vodapp_mrp0_21304.trc:
ORA-00600: internal error code, arguments: [3689], [33], [], [], [], [], [], []

通过观察日志发现,数据库的mrp0进程因为ora-600的错误导致异常终止,从而使得dg备库无法正常使用归档日志

分析trace文件得出

*** 2012-10-19 12:01:16.327
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [3689], [33], [], [], [], [], [], []
----- Call Stack Trace -----
calling              call     entry                argument values in hex
location             type     point                (? means dubious value)
-------------------- -------- -------------------- ----------------------------
ksedmp()+744         CALL     ksedst()             000000840 ?
                                                   FFFFFFFF7FFFA08C ?
                                                   000000000 ?
                                                   FFFFFFFF7FFF6B80 ?
                                                   FFFFFFFF7FFF58E8 ?
                                                   FFFFFFFF7FFF62E8 ?
kgeriv()+220         PTR_CALL 0000000000000000     000106000 ? 106323304 ?
                                                   106323000 ? 000106323 ?
                                                   000106000 ? 106323304 ?
kgesiv()+112         CALL     kgeriv()             10631DCD0 ? 000000000 ?
                                                   000000E69 ? 000000001 ?
                                                   FFFFFFFF7FFFA5E8 ?
                                                   000001430 ?
ksesic1()+96         CALL     kgesiv()             10631DCD0 ? 10646AD00 ?
                                                   000000E69 ? 000000001 ?
                                                   FFFFFFFF7FFFA5E8 ?
                                                   000001420 ?
kcvadc1_10gr2()+265  CALL     ksesic1()            000000E69 ? 00010631D ?
2                                                  1063232F8 ? 000106000 ?
                                                   106323000 ? 000106323 ?
kcoapl()+5664        PTR_CALL 0000000000000000     FFFFFFFF7A5570F8 ?
                                                   FFFFFFFF7FFFB320 ?
                                                   FFFFFFFF76667E18 ?
                                                   000000000 ? 000000021 ?
                                                   000000020 ?
kcramr()+5352        CALL     kcoapl()             FFFFFFFF76667E00 ?
                                                   000000003 ? 1055136B0 ?
                                                   105513340 ? 00000001B ?
                                                   000000017 ?
krddmr()+2688        CALL     kcramr()             000000000 ?
                                                   FFFFFFFF7A55FF78 ?
                                                   000002000 ? 000106000 ?
                                                   0000001F6 ? 464897868 ?
krsmdp()+2332        CALL     krddmr()             105502000 ?
                                                   FFFFFFFF7A5570F8 ?
                                                   000000001 ? 10646C760 ?
                                                   464897868 ? 000000000 ?
ksbrdp()+896         PTR_CALL 0000000000000000     000380000 ? 000380000 ?
                                                   105517000 ? 000106328 ?
                                                   000106000 ? 00000FFFF ?
opirip()+824         CALL     ksbrdp()             000106320 ? 380007734 ?
                                                   000380000 ? 00038000E ?
                                                   000106000 ? 1005DCF80 ?
opidrv()+1200        CALL     opirip()             10632A000 ? 000106000 ?
                                                   000106332 ? 380007000 ?
                                                   00010632A ? 1063D73A0 ?
sou2o()+80           CALL     opidrv()             10632D460 ? 000000001 ?
                                                   000000032 ? 000000000 ?
                                                   000000032 ? 000106000 ?
opimai_real()+268    CALL     sou2o()              FFFFFFFF7FFFF968 ?
                                                   000000032 ? 000000004 ?
                                                   FFFFFFFF7FFFF990 ?
                                                   105C1F000 ? 000105C1F ?
main()+152           CALL     opimai_real()        000000003 ?
                                                   FFFFFFFF7FFFFA68 ?
                                                   000000000 ? 000000000 ?
                                                   0023F9764 ? 000014400 ?
_start()+380         CALL     main()               000000001 ?
                                                   FFFFFFFF7FFFFB78 ?
                                                   000000000 ?
                                                   FFFFFFFF7FFFFA70 ?
                                                   FFFFFFFF7FFFFB80 ?
                                                   FFFFFFFF7CF00200 ?
--------------------- Binary Stack Dump ---------------------

查询mos发现
Bug 5230162 : ORA-600[3689][42] ON PHYSICAL STANDBY PREVENT MEDIA RECOVERY描述相符

处理建议
1.10.2.0.1数据库不稳定,bug较多,建议升级数据库到10.2.0.4或者10.2.0.5
2.当前standby_file_management为true,建议修改为false,每次手工两边增加数据文件
3.主库添加数据文件后,观察备库的mrp进程是否正常,如果异常
1)尝试启动mrp进程,如果因为新增加的数据文件导致异常,那先使用rman或者其他方法拷贝主库新建立数据文件到备库,然后启动mrp进程
2)如果因为某种原因导致归档日志出现gap,可以从备份中找出归档日志,如果备份中没有,那可以使用基于scn的备份方式来修复该故障。
3)如果scn相差较大,建议直接重建备库

expdp遭遇ORA-39006/ORA-39213故障解决

expdp导出数据遇到ORA-39006/ORA-39213错误,通过执行执行dbms_metadata_util.load_stylesheets解决
expdp工作异常

--导出awr信息
SQL> @?/rdbms/admin/awrextr.sql
…………
Exception encountered in AWR_EXTRACT
ORA-39006: internal error
ORA-39213: Metadata processing is not available
begin
*
ERROR at line 1:
ORA-31623: a job is not attached to this session via the specified handle
ORA-06512: at "SYS.DBMS_SYS_ERROR", line 79
ORA-06512: at "SYS.DBMS_DATAPUMP", line 911
ORA-06512: at "SYS.DBMS_DATAPUMP", line 4710
ORA-06512: at "SYS.DBMS_SWRF_INTERNAL", line 656
ORA-06512: at "SYS.DBMS_SWRF_INTERNAL", line 962
ORA-06512: at line 3
--导出一个表
$ expdp "'/ as sysdba'" dumpfile=xifenfei.dmp tables=scott.t_xifenfei
Export: Release 10.2.0.1.0 - 64bit Production on Wednesday, 31 October, 2012 13:03:20
Copyright (c) 2003, 2005, Oracle.  All rights reserved.
Connected to: Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - 64bit Production
With the Partitioning, OLAP and Data Mining options
ORA-39006: internal error
ORA-39213: Metadata processing is not available

错误提示

$ oerr ora 39006
39006, 00000, "internal error"
// *Cause:  An unexpected error occurred while processing a Data Pump job.
//          Subsequent messages supplied by DBMS_DATAPUMP.GET_STATUS
//          will further describe the error.
// *Action: Contact Oracle Customer Support.
$ oerr ora 39213
39213, 00000, "Metadata processing is not available"
// *Cause:  The Data Pump could not use the Metadata API.  Typically,
//          this is caused by the XSL stylesheets not being set up properly.
// *Action: Connect AS SYSDBA and execute dbms_metadata_util.load_stylesheets
//          to reload the stylesheets.

解决ORA-39006/ORA-39213问题

--查询数据库已经安装组件
SQL> col COMP_NAME for a35
SQL> select comp_name, version, status from dba_registry;
COMP_NAME                           VERSION                        STATUS
----------------------------------- ------------------------------ ----------------------
Oracle Database Catalog Views       10.2.0.1.0                     VALID
Oracle Database Packages and Types  10.2.0.1.0                     VALID
Oracle Workspace Manager            10.2.0.1.0                     VALID
JServer JAVA Virtual Machine        10.2.0.1.0                     VALID
Oracle XDK                          10.2.0.1.0                     VALID
Oracle Database Java Packages       10.2.0.1.0                     VALID
Oracle Expression Filter            10.2.0.1.0                     VALID
Oracle Data Mining                  10.2.0.1.0                     VALID
Oracle Text                         10.2.0.1.0                     VALID
Oracle XML Database                 10.2.0.1.0                     VALID
Oracle Rules Manager                10.2.0.1.0                     VALID
Oracle interMedia                   10.2.0.1.0                     VALID
OLAP Analytic Workspace             10.2.0.1.0                     VALID
Oracle OLAP API                     10.2.0.1.0                     VALID
OLAP Catalog                        10.2.0.1.0                     VALID
Spatial                             10.2.0.1.0                     VALID
Oracle Enterprise Manager           10.2.0.1.0                     VALID
17 rows selected.
--如果缺少下面组件,使用下面对应的程序安装
Oracle Database Catalog Views
Oracle Database Packages and Types
JServer JAVA Virtual Machine
Oracle XDK
Oracle Database Java Packages
--使用下面脚本安装(根据组件选择)
SQL> connect / as sysdba
SQL> @$ORACLE_HOME/javavm/install/initjvm.sql
SQL> connect / as sysdba
SQL> @$ORACLE_HOME/xdk/admin/initxml.sql
SQL> connect / as sysdba
SQL> @$ORACLE_HOME/rdbms/admin/catjava.sql
SQL> connect / as sysdba
SQL> @$ORACLE_HOME/rdbms/admin/utlrp.sql
--执行sys.dbms_metadata_util.load_stylesheets
SQL> execute sys.dbms_metadata_util.load_stylesheets;
PL/SQL procedure successfully completed.

测试expdp导出

$ expdp "'/ as sysdba'" dumpfile=xifenfei.dmp tables=scott.t_xifenfei  Directory=AWR_DIR
Export: Release 10.2.0.1.0 - 64bit Production on Wednesday, 31 October, 2012 14:18:04
Copyright (c) 2003, 2005, Oracle.  All rights reserved.
Connected to: Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - 64bit Production
With the Partitioning, OLAP and Data Mining options
Starting "SYS"."SYS_EXPORT_TABLE_01":  '/******** AS SYSDBA' dumpfile=xifenfei.dmp
tables=scott.t_xifenfei Directory=AWR_DIR
Estimate in progress using BLOCKS method...
Processing object type TABLE_EXPORT/TABLE/TABLE_DATA
Total estimation using BLOCKS method: 7 MB
Processing object type TABLE_EXPORT/TABLE/TABLE
. . exported "SCOTT"."T_XIFENFEI"                        5.374 MB   57376 rows
Master table "SYS"."SYS_EXPORT_TABLE_01" successfully loaded/unloaded
******************************************************************************
Dump file set for SYS.SYS_EXPORT_TABLE_01 is:
  /data/enmotech/xifenfei.dmp
Job "SYS"."SYS_EXPORT_TABLE_01" successfully completed at 14:18:11

测试证明,在不缺少相关组件的情况下,使用dbms_metadata_util.load_stylesheets可以解决expdp导出报ORA-39006/ORA-39213错误;如果缺少组件,需要先安装对应组件,然后再执行dbms_metadata_util.load_stylesheets解决该问题

dual 缺少同义词故障解决

在最近的一个客户案例中,因为缺少dual同义词,导致ddl语句无法执行。这里_system_trig_enabled参数和upgrade模式两种来解决该问题,整体上来说_system_trig_enabled不用重启数据库终止业务,更加人性化.
缺少dual同义词后果

SQL> create table t_xifenfei_dual as
  2  select * from dba_objects;
select * from dba_objects
              *
ERROR at line 2:
ORA-00604: error occurred at recursive SQL level 1
ORA-00942: table or view does not exist
SQL> alter session set events '942 trace name errorstack level 3';
Session altered.
SQL> create table t_xifenfei_dual as  select * from dba_objects;
create table t_xifenfei_dual as  select * from dba_objects
                                               *
ERROR at line 1:
ORA-00604: error occurred at recursive SQL level 1
ORA-00942: table or view does not exist
--trace文件
*** 2012-09-29 12:37:05.156
ksedmp: internal or fatal error
ORA-00942: table or view does not exist
Current SQL statement for this session:
select dummy from dual where  ora_dict_obj_type = 'SYNONYM'
AND ora_dict_obj_owner = 'PUBLIC'
--dual 对象
SQL> select object_type,owner from dba_objects where object_name='DUAL';
OBJECT_TYPE         OWNER
------------------- ------------------------------
TABLE               SYS

尝试重建同义词

SQL> create public synonym dual for dual;
create public synonym dual for dual
*
ERROR at line 1:
ORA-00604: error occurred at recursive SQL level 1
ORA-00942: table or view does not exist

_system_trig_enabled参数

SQL> select a.ksppinm name,b.ksppstvl value,a.ksppdesc description
  2    from x$ksppi a,x$ksppcv b
  3   where a.inst_id = USERENV ('Instance')
  4     and b.inst_id = USERENV ('Instance')
  5     and a.indx = b.indx
  6     and upper(a.ksppinm) LIKE upper('%&param%')
  7  order by name
  8  /
Enter value for param: SYSTEM_TRIG_ENABLED
old   6:    and upper(a.ksppinm) LIKE upper('%&param%')
new   6:    and upper(a.ksppinm) LIKE upper('%SYSTEM_TRIG_ENABLED%')
NAME                             VALUE                    DESCRIPTION
-------------------------------- ------------------------ -----------------------------
_system_trig_enabled             TRUE                     are system triggers enabled

设置_SYSTEM_TRIG_ENABLED重建dual同义词

SQL> ALTER SYSTEM SET "_SYSTEM_TRIG_ENABLED"=FALSE SCOPE=MEMORY;
System altered.
SQL> create public synonym dual for dual;
Synonym created.
SQL> ALTER SYSTEM SET "_SYSTEM_TRIG_ENABLED"=true SCOPE=MEMORY;
System altered.

使用upgrade模式创建

SQL> drop PUBLIC SYNONYM dual;
Synonym dropped.
SQL> shutdown immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup
ORACLE instance started.
Total System Global Area  318767104 bytes
Fixed Size                  1267236 bytes
Variable Size             109054428 bytes
Database Buffers          201326592 bytes
Redo Buffers                7118848 bytes
Database mounted.
Database opened.
SQL> create public synonym dual for dual;
create public synonym dual for dual
*
ERROR at line 1:
ORA-00604: error occurred at recursive SQL level 1
ORA-00942: table or view does not exist
SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup upgrage;
SP2-0714: invalid combination of STARTUP options
SQL> startup upgrape;
SP2-0714: invalid combination of STARTUP options
SQL> startup upgrade
ORACLE instance started.
Total System Global Area  318767104 bytes
Fixed Size                  1267236 bytes
Variable Size             109054428 bytes
Database Buffers          201326592 bytes
Redo Buffers                7118848 bytes
Database mounted.
Database opened.
SQL> create public synonym dual for dual;
Synonym created.
SQL> startup force
ORACLE instance started.
Total System Global Area  318767104 bytes
Fixed Size                  1267236 bytes
Variable Size             109054428 bytes
Database Buffers          201326592 bytes
Redo Buffers                7118848 bytes
Database mounted.
Database opened.

类此9430ms (rw) file: kct.c line: 7800 count: 13 total: 26874ms time: 1296124原因分析

ckpt进程的trace文件中出现类似:9430ms (rw) file: kct.c line: 7800 count: 13 total: 26874ms time: 1296124
ckpt进程的trace文件中报如下错误

Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options
ORACLE_HOME = /oradb/11.2.0/db
System name:	AIX
Node name:	offondb2
Release:	1
Version:	6
Machine:	00CC83664C00
Instance name: offon2
Redo thread mounted by this instance: 2
Oracle process number: 20
Unix process pid: 19660878, image: oracle@offondb2 (CKPT)
*** 2012-10-26 13:25:00.971
  1: 9430ms (rw) file: kct.c line: 7800 count: 13 total: 26874ms time: 1296124
  2: 1777ms (rw) file: krse.c line: 1800 count: 164 total: 122140ms time: 364176
  3: 1450ms (rw) file: kct.c line: 1011 count: 1 total: 1450ms time: 1594117
  4: 1230ms (rw) file: kcrf.c line: 10012 count: 135 total: 90995ms time: 2630737
  5: 1173ms (ro) file: kcrr.c line: 3525 count: 13 total: 8842ms time: 3645916
  6: 890ms (rw) file: kcrf.c line: 9959 count: 8 total: 4812ms time: 3578222
  7: 830ms (ro) file: kcf.c line: 5306 count: 1 total: 830ms time: 1594116
  8: 677ms (rw) file: kcv.c line: 11783 count: 8 total: 4499ms time: 416869
Control file enqueue hold time tracking dump at time: 2092019
*** 2012-10-26 15:25:14.789
  1: 9430ms (rw) file: kct.c line: 7800 count: 13 total: 26874ms time: 1296124
  2: 1777ms (rw) file: krse.c line: 1800 count: 165 total: 122832ms time: 364176
  3: 1450ms (rw) file: kct.c line: 1011 count: 1 total: 1450ms time: 1594117
  4: 1230ms (rw) file: kcrf.c line: 10012 count: 135 total: 90995ms time: 2630737
  5: 1173ms (ro) file: kcrr.c line: 3525 count: 13 total: 8842ms time: 3645916
  6: 890ms (rw) file: kcrf.c line: 9959 count: 8 total: 4812ms time: 3578222
  7: 830ms (ro) file: kcf.c line: 5306 count: 1 total: 830ms time: 1594116
  8: 677ms (rw) file: kcv.c line: 11783 count: 8 total: 4499ms time: 416869

原因分析并解决

New controlfile enqueue hold time tracking statistics have been added in 11.2 to
aid diagnosis of controlfile transaction related performance related issues.
To disable the trace set _controlfile_enqueue_holding_time_tracking_size to 0:
- spfile
alter system set "_controlfile_enqueue_holding_time_tracking_size"=0 scope=spfile;
- pfile
_controlfile_enqueue_holding_time_tracking_size=0
A database restart is required.

查询_controlfile_enqueue_holding_time_tracking_size

SQL> col name for a32
SQL> col value for a24
SQL> col description for a70
SQL> set linesize 150
SQL> select a.ksppinm name,b.ksppstvl value,a.ksppdesc description
  2    from x$ksppi a,x$ksppcv b
  3   where a.inst_id = USERENV ('Instance')
  4     and b.inst_id = USERENV ('Instance')
  5     and a.indx = b.indx
  6     and upper(a.ksppinm) LIKE upper('%&param%')
  7  order by name
  8  /
Enter value for param: _controlfile_enqueue_holding_time_tracking_size
old   6:    and upper(a.ksppinm) LIKE upper('%&param%')
new   6:    and upper(a.ksppinm) LIKE upper('%_controlfile_enqueue_holding_time_tracking_size%')
</strong>
NAME                             VALUE      DESCRIPTION
-------------------------------- ---------- ------------------------------------------------
_controlfile_enqueue_holding_tim 10         control file enqueue holding time tracking size
e_tracking_size

补充MOS说明[791417.1]

New controlfile enqueue hold time tracking statistics have been added in 11.2 to aid
diagnosis of controlfile transaction related performance related issues:
Control File Enqueue AWR Statistics:
max cf enq hold time - The maximum amount of time in milliseconds a client has held the control file enqueue.
total cf enq hold time - The total amount of time in milliseconds all clients have held the control file enqueue.
total number of cf enq holders - The total number of times clients have held the control file enqueue.
Periodically, the CKPT process dumps statistics for the top N control file enqueue holders.
N defaults to 10, but can be modified with the static hidden
parameter: _controlfile_enqueue_holding_time_tracking_size.The dump looks like the following:
Preface: "Control file enqueue hold time tracking dump at time: [relative time]".
a. Time the client has held the control file enqueue.
b. Type of client's control file enqueue transaction - rw or ro.
c. File name where the client obtained control file enqueue.
d. Line number where the client obtained control file enqueue.
e. Number of times the client has held the control file enqueue since it became a member of the top N.
f. Total time the client has held the control file in all those times from [e].
g. Relative time the client obtained the control file enqueue from [a].

rman备份出现ORA-19625/ORA-27054解决

RAC环境NFS挂载归档日志使用rman备份出现ORA-19625/ORA-27054错误分析
系统运行环境

OS:AIX 6100-06
DB:11.1.0.6.0 RAC
归档:挂载NFS

rman执行archive log时候报错

sql statement: alter system archive log  current
Starting backup at 25-OCT-11
current log archived
released channel: c1
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of backup command at 02/17/2012 13:03:51
RMAN-06059: expected archived log not found, lost of archived log compromises recoverability
ORA-19625: error identifying file /rarchlogA/1_13775_764866137.dbf
ORA-27054: NFS file system where the file is created or resides is not mounted with correct options
Additional information: 6

这里由于RAC存放归档使用了NFS文件系统,在使用rman备份归档日志执行alter system archive log current的时候发生如下错误.

相关目录挂载情况

$ df -g
Filesystem    GB blocks      Free %Used    Iused %Iused Mounted on
…………
/dev/lv_arch      50.00     43.24   14%      166     1% /arch1
oradb2:/arch2     50.00     35.31   30%      463     1% /arch2
/dev/baklv        90.00     83.82    7%       10     1% /backup
$ mount:
…………
        /dev/lv_oracle   /oracle          jfs2   Oct 14 11:27 rw,log=/dev8
         /dev/lv_arch     /arch1           jfs2   Oct 14 11:27 rw,log=/dev8
     oradb2   /arch2           /arch2           nfs3   Oct 14 11:47

通过这里可以知道,这里使用默认的参数挂载NFS,从而使得NFS在rman工作时候不能正常工作。

错误原因

From  Oracle 10G R2 , Oracle checks the options with which a NFS mount is mounted on the filesystem.
and this is done to ensure that no corruption of the database can happen as incorrectly
mounted NFS volumes can result in data corruption.
There are no single set of NFS mount options that work across all Oracle platforms
Please ensure that you have the proper mount options specified by the NAS vendor /Vendor user guide
The exact checks used for an NFS mounted disk vary between platforms but in general
the basic checks will include the following checks
a) The mount table (eg; /etc/mnttab) can be read to check the mount options
b) The NFS mount is mounted with the &quot;hard&quot; option
c) The mount options include rsize&gt;=32768 and wsize&gt;=32768
d) For RAC environments, where NFS disks are supported, the &quot;noac&quot; mount option is used.

解决方案
1.临时解决方案
As suggested in the bug the workaround recommended is to use the Event 10298.
alter system set events ‘10298 trace name context forever, level 32’;

2.永久解决方案
具体见:http://www.xifenfei.com/3269.html

ORACLE 11.2.0.3 生成awr html文件报SYS.DBMS_WORKLOAD_REPOSITORY异常

在想分析数据库性能的关键时刻,突然发现awr不能正常的工作,那就和你上了战场突然发现枪没有子弹一样的郁闷,今天就遇到了11.2.0.3在win的环境中awr生成html不能正常工作.通过查询mos发现该问题出现在各种平台中(win,linux,aix等),提醒大家注意该问题.
数据库版本

SQL> SELECT * FROM V$VERSION;
BANNER
-------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - Production
PL/SQL Release 11.2.0.3.0 - Production
CORE    11.2.0.3.0      Production
TNS for 32-bit Windows: Version 11.2.0.3.0 - Production
NLSRTL Version 11.2.0.3.0 - Production

awr报错(html)

SQL> @?/rdbms/admin/awrrpt.sql
ORA-06502: PL/SQL: 数字或值错误 :  字符串缓冲区太小
ORA-06512: 在 "SYS.DBMS_WORKLOAD_REPOSITORY", line 919
ORA-06512: 在 line 1

设置errorstack

SQL> alter session set events '6502 trace name errorstack level 12';
会话已更改。

分析错误

----- Error Stack Dump -----
ORA-06502: PL/SQL: 数字或值错误 :  字符串缓冲区太小
----- Current SQL Statement for this session (sql_id=572fbaj0fdw2b) -----
select output from table(dbms_workload_repository.awr_report_html( :dbid,
                                                            :inst_num,
                                                            :bid, :eid,
                                                            :rpt_options ))
----- PL/SQL Call Stack -----
  object      line  object
  handle    number  name
94348684       919  package body SYS.DBMS_WORKLOAD_REPOSITORY
983BAD54         1  anonymous block
----- Call Stack Trace -----
_skdstdst()+121      CALLrel  _kgdsdst()           19D99520 2
_ksedst1()+93        CALLrel  _skdstdst()          19D99520 0 1 485816 4863B2
                                                   485816
_ksedst()+49         CALLrel  _ksedst1()           0 1
_dbkedDefDump()+368  CALLrel  _ksedst()            0
6
_ksedmp()+44         CALLrel  _dbkedDefDump()      C 0
_dbkdaKsdActDriver(  CALLreg  00000000             C
)+4209
…………

通过查询mos发现Bug 13575143一致,可以确定是该bug,但是通过进一步测试证明不光是awrrpt会出现该错误,awr的相关报告中,只要是展示html结果的都有可能出现类此错误(比如awrrpti.sql/awrddrpt.sql/awrddrpi.sql等等).同时这里通过进一步分析发现其实该bug的起源是Bug 6458801(REPLACE on a CLOB can corrupt multibyte data ID 6458801.8),不过该bug说明已经在11.2.0.1中修复,其实通过这里的分析发现并没有真正的在11.2.0.3中修复该bug,针对该问题没有官方没有提供较好解决方法,只能是用过WORKAROUND来临时解决

They are able to generate the AWR report in the .txt format