模拟普通ORA-08103并解决

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:模拟普通ORA-08103并解决

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

在上一篇中说到:模拟极端ORA-08103并解决,不能通过修改成坏块来解决,这里演示了是一个普通的数据块出现异常,然后通过bbed修改为坏块通过dbms_repair来解决该故障,补充说明:在11.2.0.3.3的库中,使用该方法不能重现该错误,而是直接提示ORA-01578,证明ORACLE的新版本在这一方面进行了改进
创建测试表

SQL> SELECT * FROM V$VERSION;
BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - Prod
PL/SQL Release 10.2.0.4.0 - Production
CORE    10.2.0.4.0      Production
TNS for Linux: Version 10.2.0.4.0 - Production
NLSRTL Version 10.2.0.4.0 - Production
SQL> create table t_xifenfei as
  2   select * from dba_objects where rownum<3000;
Table created.
SQL> SELECT owner, segment_name, EXTENT_ID, FILE_ID, BLOCK_ID, BLOCKS
  2  FROM dba_extents
  3  WHERE segment_name='T_XIFENFEI' AND owner='CHF';
OWNER                          SEGMENT_NAME     EXTENT_ID    FILE_ID   BLOCK_ID     BLOCKS
------------------------------ --------------- ---------- ---------- ---------- ----------
CHF                            T_XIFENFEI               0          4         57          8
CHF                            T_XIFENFEI               1          4         65          8
CHF                            T_XIFENFEI               2          4         73          8
CHF                            T_XIFENFEI               3          4         81          8
CHF                            T_XIFENFEI               4          4         89          8
CHF                            T_XIFENFEI               5          4         97          8
6 rows selected.
SQL> SELECT DISTINCT dbms_rowid.rowid_block_number(rowid) blk#,
  2  dbms_rowid.rowid_relative_fno(rowid) file#
  3   FROM t_xifenfei
  4   ORDER BY 2,1;
      BLK#      FILE#
---------- ----------
        60          4
        61          4
        62          4
        63          4
        64          4
        65          4
        66          4
        67          4
        68          4
        69          4
        70          4
      BLK#      FILE#
---------- ----------
        71          4
        72          4
        74          4
        75          4
        76          4
        77          4
        78          4
        79          4
        80          4
        81          4
        82          4
      BLK#      FILE#
---------- ----------
        83          4
        84          4
        85          4
        86          4
        87          4
        88          4
        90          4
        91          4
        92          4
        93          4
        94          4
      BLK#      FILE#
---------- ----------
        95          4
        96          4
        97          4
        98          4
37 rows selected.

模拟ORA-08103

SQL> CONN / AS SYSDBA
Connected.
SQL> SHUTDOWN IMMEDIATE
Database closed.
Database dismounted.
ORACLE instance shut down.
--破坏数据快(其实就是清空一个数据块block 95,注意dd和实际数据的block对应关系相差1)
[oracle@xifenfei ~]$ dd if=/dev/zero of=/u01/oracle/oradata/XFF/users01.dbf bs=8192 seek=95 count=1 conv=notrunc
1+0 records in
1+0 records out
8192 bytes (8.2 kB) copied, 0.000187113 seconds, 43.8 MB/s
SQL> STARTUP
ORACLE instance started.
Total System Global Area  318767104 bytes
Fixed Size                  1267236 bytes
Variable Size             104860124 bytes
Database Buffers          205520896 bytes
Redo Buffers                7118848 bytes
Database mounted.
Database opened.
SQL> SELECT COUNT(*) FROM CHF.T_XIFENFEI;
SELECT COUNT(*) FROM CHF.T_XIFENFEI
                         *
ERROR at line 1:
ORA-08103: object no longer exists
[oracle@xifenfei ~]$ exp chf/xifenfei tables=t_xifenfei file=/tmp/t_xifenfei.dmp
Export: Release 10.2.0.4.0 - Production on Fri Jan 13 22:09:43 2012
Copyright (c) 1982, 2007, Oracle.  All rights reserved.
Connected to: Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Export done in ZHS16GBK character set and AL16UTF16 NCHAR character set
About to export specified tables via Conventional Path ...
. . exporting table                     T_XIFENFEI
EXP-00056: ORACLE error 8103 encountered
ORA-08103: object no longer exists
Export terminated successfully with warnings.
[oracle@xifenfei ~]$ expdp chf/xifenfei tables=t_xifenfei dumpfile=t_xifenfei.dmp
Export: Release 10.2.0.4.0 - Production on Friday, 13 January, 2012 22:10:26
Copyright (c) 2003, 2007, Oracle.  All rights reserved.
Connected to: Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Starting "CHF"."SYS_EXPORT_TABLE_01":  chf/******** tables=t_xifenfei dumpfile=t_xifenfei.dmp
Estimate in progress using BLOCKS method...
Processing object type TABLE_EXPORT/TABLE/TABLE_DATA
Total estimation using BLOCKS method: 384 KB
Processing object type TABLE_EXPORT/TABLE/TABLE
ORA-31693: Table data object "CHF"."T_XIFENFEI" failed to load/unload and is being skipped due to error:
ORA-02354: error in exporting/importing data
ORA-08103: object no longer exists
Master table "CHF"."SYS_EXPORT_TABLE_01" successfully loaded/unloaded
******************************************************************************
Dump file set for CHF.SYS_EXPORT_TABLE_01 is:
  /u01/oracle/oracle/product/10.2.0/db_1/rdbms/log/t_xifenfei.dmp
Job "CHF"."SYS_EXPORT_TABLE_01" completed with 1 error(s) at 22:10:50

测试证明当出现ORA-08103的时候,全表扫描,exp,expdp均不能正常工作

找到出现ORA-08103数据块

SQL> alter session set max_dump_file_size=unlimited;
Session altered.
SQL> alter session set db_file_multiblock_read_count=1;
Session altered.
SQL> alter session set events 'immediate trace name trace_buffer_on level 1048576';
Session altered.
SQL>  alter session set events '10200 trace name context forever, level 1';
Session altered.
SQL> alter session set events '8103 trace name errorstack level 3';
Session altered.
SQL> alter session set events '10236 trace name context forever, level 1';
Session altered.
SQL>  alter session set tracefile_identifier='ORA8103';
Session altered.
SQL> select * from chf.t_xifenfei;
……………………
ERROR:
ORA-08103: object no longer exists
2700 rows selected.
--在trace文件结尾发现如下记录,表示读到这个数据块时发生错误
KTRVAC: path typ=0, rdba=100005f
SQL> select to_number('100005f','xxxxxxxxxxxxx') from dual;
TO_NUMBER('100005F','XXXXXXXXXXXXX')
------------------------------------
                            16777311
SQL> select
  2  dbms_utility.data_block_address_file(16777311) FILE_NO,
  3  dbms_utility.data_block_address_block(16777311) BLOCK_NO
  4  from dual;
   FILE_NO   BLOCK_NO
---------- ----------
         4         95

bbed继续破坏异常块

BBED> set filename '/u01/oracle/oradata/XFF/users01.dbf'
        FILENAME        /u01/oracle/oradata/XFF/users01.dbf
BBED> set block 95
        BLOCK#          95
BBED> map
 File: /u01/oracle/oradata/XFF/users01.dbf (0)
 Block: 95                                    Dba:0x00000000
------------------------------------------------------------
BBED-00400: invalid blocktype (00)
BBED> set count 32
        COUNT           32
BBED> d
 File: /u01/oracle/oradata/XFF/users01.dbf (0)
 Block: 95               Offsets:    0 to   31           Dba:0x00000000
------------------------------------------------------------------------
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
 <32 bytes per line>
--这个就是和11gr2的区别,在11g中使用该方法来模拟ORA-08103,直接提示坏块,从而不会出现ORA-08103
BBED> verify
DBVERIFY - Verification starting
FILE = /u01/oracle/oradata/XFF/users01.dbf
BLOCK = 95
DBVERIFY - Verification complete
Total Blocks Examined         : 1
Total Blocks Processed (Data) : 0
Total Blocks Failing   (Data) : 0
Total Blocks Processed (Index): 0
Total Blocks Failing   (Index): 0
Total Blocks Empty            : 1
Total Blocks Marked Corrupt   : 0
Total Blocks Influx           : 0
BBED> d offset 8180
 File: /u01/oracle/oradata/XFF/users01.dbf (0)
 Block: 95               Offsets: 8180 to 8191           Dba:0x00000000
------------------------------------------------------------------------
 00000000 00000000 00000000
 <32 bytes per line>
BBED> m /x 01010101 offset 8188
BBED-00215: editing not allowed in BROWSE mode
BBED> set mode edit
        MODE            Edit
--修改sumcheck
BBED> m /x 01010101 offset 8188
Warning: contents of previous BIFILE will be lost. Proceed? (Y/N) y
 File: /u01/oracle/oradata/XFF/users01.dbf (0)
 Block: 95               Offsets: 8188 to 8191           Dba:0x00000000
------------------------------------------------------------------------
 01010101
 <32 bytes per line>
BBED> sum
Check value for File 0, Block 95:
current = 0x0000, required = 0x0000

测试修改为坏块效果

SQL> select count(*) from chf.t_xifenfei;
select count(*) from chf.t_xifenfei
                         *
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 4, block # 95)
ORA-01110: data file 4: '/u01/oracle/oradata/XFF/users01.dbf'
SQL> exec dbms_repair.skip_corrupt_blocks('CHF','T_XIFENFEI');
PL/SQL procedure successfully completed.
SQL> select count(*) from chf.t_xifenfei;
  COUNT(*)
----------
      2918

通过让ORA-08103对应的块变为真正的坏块,然后使用dbms_repair或者event来跳过坏块,达到拯救数据的目的

模拟极端ORA-08103并解决

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:模拟极端ORA-08103并解决

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

ORA-08103错误在数据库日程运维和异常恢复中都可能遇到,出现该错误的原因很多,有很多情况下(模拟普通ORA-08103并解决),直接通过修改块使其在数据库查询的时候表标志为坏块,然后使用event或者dbms_repair包来标志该块,然后跳过就可以解决该问题,但是有些时候,遇到极端情况,该方法会失效,需要借助极端工具来处理该极端问题.
分析表相关EXTENT

SQL> select * from v$version;
BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - Production
PL/SQL Release 11.2.0.3.0 - Production
CORE    11.2.0.3.0      Production
TNS for Linux: Version 11.2.0.3.0 - Production
NLSRTL Version 11.2.0.3.0 - Production
SQL> create table t_xifenfei  as
  2   select * from dba_objects;
Table created.
SQL> select count(*) from t_xifenfei;
  COUNT(*)
----------
     74504
SQL> select data_object_id,object_id from dba_objects where object_name='T_XIFENFEI';
DATA_OBJECT_ID  OBJECT_ID
-------------- ----------
         75592      75592
SQL> select file_id,block_id,block_id+blocks-1
  2      from dba_extents
  3   where segment_name ='T_XIFENFEI' AND owner='CHF';
   FILE_ID   BLOCK_ID BLOCK_ID+BLOCKS-1
---------- ---------- -----------------
         4        680               687
         4        688               695
         4        696               703
         4        704               711  <---注意
         4        712               719
         4        720               727
         4        728               735
         …………
24 rows selected.
SQL> Select segment_name,header_file,header_blocK
  2  from dba_segments where
  3  segment_name in ('T_XIFENFEI') and owner='CHF'
  4  ;
SEGMENT_NAME                   HEADER_FILE HEADER_BLOCK
------------------------------ ----------- ------------
T_XIFENFEI                               4          682
SQL> alter system checkpoint;
System altered.
SQL> alter system dump datafile 4 block 682;
System altered.
SQL> select value from v$diag_info where name='Default Trace File';
VALUE
--------------------------------------------------------------------------------
/u01/oracle/diag/rdbms/ora11g/ora11g/trace/ora11g_ora_28933.trc
--trace文件
Dump of memory from 0xB6CFD600 to 0xB6CFF600
B6CFD600 0000A223 010002AA 000DB4EA 04010000  [#...............]
B6CFD610 00008ECF 00000000 00000000 00000000  [................]
B6CFD620 00000000 00000018 00000480 00000A9C  [................]
B6CFD630 00000017 0000003F 00000080 010007BF  [....?...........]
B6CFD640 00000000 00000017 00000000 0000043F  [............?...]
B6CFD650 00000000 00000000 00000000 00000017  [................]
B6CFD660 0000003F 00000080 010007BF 00000000  [?...............]
B6CFD670 00000017 00000000 0000043F 01000780  [........?.......]
B6CFD680 01000780 00000000 00000000 00000000  [................]
B6CFD690 00000000 00000000 00000000 00000000  [................]
        Repeat 3 times
B6CFD6D0 00000001 00002000 00000000 00001434  [..... ......4...]
B6CFD6E0 00000000 010002A9 00000001 01000781  [................]
B6CFD6F0 010002A9 00000000 00000000 00000000  [................]
B6CFD700 00000000 00000000 00000018 00000000  [................]
B6CFD710 00012748 10000000 010002A8 00000008  [H'..............]
B6CFD720 010002B0 00000008 010002B8 00000008  [................]
B6CFD730 010002C0 00000008 010002C8 00000008  [................]   <----dump中找到下面值
……
  Extent Map
  -----------------------------------------------------------------
   0x010002a8  length: 8
   0x010002b0  length: 8
   0x010002b8  length: 8
   0x010002c0  length: 8     <-----选择第四个exent
   0x010002c8  length: 8
   …………
  Auxillary Map
  --------------------------------------------------------
   Extent 0     :  L1 dba:  0x010002a8 Data dba:  0x010002ab
   Extent 1     :  L1 dba:  0x010002a8 Data dba:  0x010002b0
   Extent 2     :  L1 dba:  0x010002b8 Data dba:  0x010002b9
   Extent 3     :  L1 dba:  0x010002b8 Data dba:  0x010002c0  <---同上
   Extent 4     :  L1 dba:  0x010002c8 Data dba:  0x010002c9
   Extent 5     :  L1 dba:  0x010002c8 Data dba:  0x010002d0
   …………
  --------------------------------------------------------
--确定Extent 3的记录在file 4 block 682 的偏移量为304
------省略了相同部分B6CFD
SQL> SELECT TO_NUMBER('730','XXXXX') FROM DUAL;
TO_NUMBER('730','XXXXX')
------------------------
                    1840
SQL> SELECT TO_NUMBER('600','XXX') FROM DUAL;
TO_NUMBER('600','XXX')
----------------------
                  1536
SQL> SELECT 1840-1536 FROM DUAL;
 1840-1536
----------
       304
SQL> SHUTDOWN IMMEDIATE;
Database closed.
Database dismounted.
ORACLE instance shut down.

这里我们假设修改SEGMENT HEADER中关于EXTENT 3(从0开始计数)的映射地址,从而来使得该表在查询的时候出现ORA-08103错误

bbed修改相关值

[oracle@xifenfei ~]$ bbed parfile=bbed.par
BBED> SET MODE EDIT
        MODE            Edit
BBED> INFO
 File#  Name                                                        Size(blks)
 -----  ----                                                        ----------
     4  /u01/oracle/oradata/ora11g/users01.dbf                               0
BBED> SET FILE 4 BLOCK 682
        FILE#           4
        BLOCK#          682
BBED> D
 File: /u01/oracle/oradata/ora11g/users01.dbf (4)
 Block: 682              Offsets:    0 to  511           Dba:0x010002aa
------------------------------------------------------------------------
 23a20000 aa020001 eab40d00 00000104 cf8e0000 …………
 <32 bytes per line>
BBED> m /x 1100
 File: /u01/oracle/oradata/ora11g/users01.dbf (4)
 Block: 682              Offsets:    0 to  511           Dba:0x010002aa
------------------------------------------------------------------------
 11000000 aa020001 eab40d00 00000104 cf8e0000 …………
 <32 bytes per line>
BBED> d
 File: /u01/oracle/oradata/ora11g/users01.dbf (4)
 Block: 682              Offsets:    0 to  511           Dba:0x010002aa
------------------------------------------------------------------------
 11000000 aa020001 eab40d00 00000104 cf8e0000 …………
 <32 bytes per line>
BBED> sum apply
Check value for File 4, Block 682:
current = 0x2cfd, required = 0x2cfd

重现ORA-08103

SQL> startup
ORACLE instance started.
Total System Global Area  313860096 bytes
Fixed Size                  1344652 bytes
Variable Size             239078260 bytes
Database Buffers           67108864 bytes
Redo Buffers                6328320 bytes
Database mounted.
Database opened.
SQL> conn chf/xifenfei
Connected.
SQL> select count(*) from t_xifenfei;
select count(*) from t_xifenfei
                     *
ERROR at line 1:
ORA-08103: object no longer exists

定位坏块位置

SQL> alter session set max_dump_file_size=unlimited;
Session altered.
SQL> alter session set db_file_multiblock_read_count=1;
Session altered.
SQL> alter session set events 'immediate trace name trace_buffer_on level 1048576';
Session altered.
SQL> alter session set events '10200 trace name context forever, level 1';
Session altered.
SQL> alter session set events '8103 trace name errorstack level 3';
Session altered.
SQL> alter session set events '10236 trace name context forever, level 1';
Session altered.
SQL> alter session set tracefile_identifier='ORA8103';
Session altered.
SQL> select * from chf.t_xifenfei;
select * from chf.t_xifenfei
                  *
ERROR at line 1:
ORA-08103: object no longer exists
--trace文件关键内容
block_row_dump:
tab 0, row 0, @0x1f70
tl: 48 fb: --H-FL-- lb: 0x0  cc: 3
col  0: [ 9]  44 49 43 54 2e 42 41 53 45
col  1: [ 1]  32
col  2: [32]
…………
end_of_block_dump
The buffer with tsn: 0 rdba: 0x00400321 has already been dumped
The buffer with tsn: 4 rdba: 0x010002aa was pinned,
but could not be dumped
SQL> Select to_number('010002aa','xxxxxxxxxxxxxxxxxx') from dual;
TO_NUMBER('010002AA','XXXXXXXXXXXXXXXXXX')
------------------------------------------
                                  16777898
SQL> select
  2  dbms_utility.data_block_address_file(16777898) FILE_NO,
  3  dbms_utility.data_block_address_block(16777898) BLOCK_NO
  4  from dual;
   FILE_NO   BLOCK_NO
---------- ----------
         4        682

检查坏块

[oracle@xifenfei ~]$ dbv file=/u01/oracle/oradata/ora11g/users01.dbf
DBVERIFY: Release 11.2.0.3.0 - Production on Fri Jan 13 18:03:13 2012
Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
DBVERIFY - Verification starting : FILE = /u01/oracle/oradata/ora11g/users01.dbf
Block Checking: DBA = 16777898, Block Type = Unlimited data segment header with flg blks
Incorrect total map count: 24
Page 682 failed with check code 17006
DBVERIFY - Verification complete
Total Pages Examined         : 2240
Total Pages Processed (Data) : 1421
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 199
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 229
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 391
Total Pages Marked Corrupt   : 1
Total Pages Influx           : 0
Total Pages Encrypted        : 0
Highest block SCN            : 898278 (0.898278)
--对应块
SQL> select
  2  dbms_utility.data_block_address_file(16777898) FILE_NO,
  3  dbms_utility.data_block_address_block(16777898) BLOCK_NO
  4  from dual;
   FILE_NO   BLOCK_NO
---------- ----------
         4        682

通过这里dbv检查发现,该数据库已经是坏块了,也就是说,网上流传的人工标志为坏块的方法在此处行不通,这里我们考虑使用dul等类此工具挖取数据,来拯救数据

dul和odu常规操作

DUL> unload table chf.t_xifenfei;
. unloading table                T_XIFENFEI
DUL: Warning: Using data objno from segment header( 16779199) instead of expected id (75592) from dictionary
DUL: Error: No entry in control file for block: ts# = 4 rfile# = 0 block# = 1087
DUL: Error: While processing unknown file block# 1087
DUL: Error: Could not read/parse data block
ODU> unload table chf.t_xifenfei
Unloading table: T_XIFENFEI,object ID: 75592
Unloading segment,storage(Obj#=75592 DataObj#=75592 TS#=4 File#=4 Block#=682 Cluster=0)
corrupted block
0 rows unloaded

通过上面的测试证明,在该中情况下(SEGMENT HEADER)出现异常时,dul/odu均不能使用最常规的方法挖的数据

扫描数据文件方式挖
因为odu在这个方面的操作人性化于dul,所以只用odu进行相关测试,dul肯定能过实现相同功能

ODU> SCAN EXTENT TABLESPACE 4 PARALLEL 2
scan extent start: 2012-9-23 2:47:51
scanning extent...
scanning extent finished.
scan extent completed: 2012-9-23 2:47:51
ODU> unload table chf.t_xifenfei object  75592
Unloading table: T_XIFENFEI,object ID: 75592
Unloading segment,storage(Obj#=75592 DataObj#=75592 TS#=4 File#=4 Block#=682 Cluster=0)
74504 rows unloaded

试验证明通过odu扫描表空间/数据文件找回来所有的数据,然后truncate掉问题表,导入该数据库,问题可以得意顺利解决

ORA-30013导致RAC 节点down掉

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:ORA-30013导致RAC 节点down掉

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

今天一朋友让我帮忙分析他们的9.2.0.2 rac 节点2异常down掉原因,相关信息如下:
前提信息

OS:HP-UX B.11.31
DB:9.2.0.2.0 RAC

节点2alert日志信息

Fri Sep  7 13:13:49 2012
ARC0: Completed archiving  log 11 thread 2 sequence 11651
Fri Sep  7 13:31:56 2012
Errors in file /oracle/admin/agent/udump/agent2_ora_797.trc:
ORA-00600: internal error code, arguments: [kgavsd_3], [0], [], [], [], [], [], []
ORA-00028: your session has been killed
Fri Sep  7 13:31:58 2012
Errors in file /oracle/admin/agent/bdump/agent2_pmon_5938.trc:
ORA-30013: undo tablespace 'UNDOTBS1' is currently in use
Fri Sep  7 13:31:58 2012
PMON: terminating instance due to error 30013
Fri Sep  7 13:31:58 2012
Errors in file /oracle/admin/agent/bdump/agent2_lms7_6033.trc:
ORA-30013: undo tablespace '' is currently in use
Fri Sep  7 13:31:58 2012
…………
Errors in file /oracle/admin/agent/bdump/agent2_lms0_6027.trc:
ORA-30013: undo tablespace '' is currently in use
Fri Sep  7 13:31:58 2012
System state dump is made for local instance
Fri Sep  7 13:32:03 2012
Instance terminated by PMON, pid = 5938
Fri Sep  7 14:34:35 2012

这里可以看到因为ORA-30013的错误使得pmon进程异常,从而使得该rac的节点2 down掉.同时这里还发现了ORA-00600[kgavsd_3]错误,是否是因为该ORA-600导致了数据库异常down还是一个偶然机会,我们继续分析

查看ORA-600[kgavsd_3]相关trace文件

*********START PLSQL RUNTIME DUMP************
***Got ORA-28 while running PLSQL***
***********END PLSQL RUNTIME DUMP************
*** 2012-09-07 13:31:56.740
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [kgavsd_3], [0], [], [], [], [], [], []
ORA-00028: your session has been killed
Current SQL statement for this session:
--用户补档
DECLARE
  OUT_ERR_CODE NUMBER;
  OUT_ERR_MSG  VARCHAR2(1000);
  V_COUNT NUMBER;
BEGIN
    WHILE TRUE LOOP
      SELECT COUNT(*) INTO V_COUNT FROM amc_stat_log  where proc_name in('pRunOdsChannelWareData')
       and run_param=201208 AND STATE='A';
      IF V_COUNT>0 THEN
         dbms_output.put_line('exit loop '|| sysdate);
         EXIT;
      END IF;
      sys.Dbms_Lock.sleep(600);
      dbms_output.put_line('wake up '|| sysdate);
    END LOOP;
   PKG_AME_ODS_DATA.P_Add_TO_AgentServ(201208,OUT_ERR_CODE,OUT_ERR_MSG);
   PKG_AME_ODS_DATA.P_Update_Serv_Ware_ID(201208, OUT_ERR_CODE, OUT_ERR_MSG);
   PKG_AMS_SETTLE.P_COMMISION_51PRE_FLAG(201208,OUT_ERR_CODE, OUT_ERR_MSG);
END;
----- PL/SQL Call Stack -----
  object      line  object
  handle    number  name
c000000e84ec59d0        14  anonymous block
----- Call Stack Trace -----
calling              call     entry                argument values in hex
location             type     point                (? means dubious value)
-------------------- -------- -------------------- ----------------------------
Cannot find symbol in .
Cannot find symbol in .
Cannot find symbol in .
ksedmp()+512         call     9fffffffffff3940     000000000 ?
                                                   C000000000000A17 ?
                                                   40000000025B6540 ?
ksfdmp()+64          call     9fffffffffff3940     000000003 ?
kgerinv()+352        call     9fffffffffff3940     60000000000466B0 ?
                                                   000000003 ?
                                                   C000000000000714 ?
                                                   4000000004EBA6A0 ?
                                                   00001821B ?
                                                   6000000000468EA8 ?
kgesinv()+48         call     9fffffffffff3940     60000000000466B0 ?
                                                   600000000059BD98 ?
                                                   600000000046B070 ?
                                                   60000000000179C0 ?
                                                   6000000000017950 ?
kgesin()+112         call     9fffffffffff3940     60000000000466B0 ?
                                                   600000000059BD98 ?
                                                   4000000000B44C10 ?
                                                   000000001 ?
                                                   9FFFFFFFFFFF4310 ?
$cold_kgavsd_stackl  call     9fffffffffff3940     60000000000466B0 ?
et_done()+1184                                     600000000059BD98 ?
                                                   4000000000B44C10 ?
                                                   000000001 ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   00001FE19 ?
pbesd_stacklet_done  call     9fffffffffff3940     60000000000466B0 ?
()+96                                              000000002 ? 000000000 ?
                                                   9FFFFFFFBEC6AE68 ?
pfrrun()+3328        call     9fffffffffff3940     9FFFFFFFBEC6AE68 ?
                                                   C000000000001D42 ?
                                                   4000000001ABBDA0 ?
                                                   9FFFFFFFBEC6B06C ?
                                                   9FFFFFFFFFFF64F0 ?
                                                   6000000000468EA8 ?
                                                   000000000 ? 000000000 ?
peicnt()+480         call     0000000000000000     9FFFFFFFBEC6AEE8 ?
                                                   C000000000000996 ?
                                                   40000000019F3680 ?
kkxexe()+816         call     0000000000000000     9FFFFFFFFFFF64F0 ?
                                                   9FFFFFFFBEC6AE68 ?
                                                   C00000000000099B ?
                                                   4000000001DD90F0 ?
                                                   00000FE4F ?
                                                   9FFFFFFFFFFF5F00 ?
                                                   60000000000467F0 ?
                                                   4000000000B603F0 ?
opiexe()+11168       call     0000000000000000     000000000 ?
                                                   C000000000002C60 ?
                                                   4000000001BEF980 ?
                                                   00000800F ?
                                                   9FFFFFFFFFFF6470 ?
                                                   9FFFFFFFBEC6AEB2 ?
                                                   6000000000040900 ?
                                                   9FFFFFFFBEC6B534 ?
opiall0()+3184       call     _etext_f()+23058430  000000004 ? 0000000C4 ?
                              09110686928          9FFFFFFFFFFF7B40 ?
                                                   C000000000002BDF ?
                                                   4000000001B26CD0 ?
                                                   000000000 ? 00000C893 ?
                                                   9FFFFFFFFFFF6690 ?
Cannot find symbol in .
kpoal8()+2064        call     9fffffffffff7ad0     000000001 ?
                                                   9FFFFFFFFFFF8304 ?
                                                   FFFFFFFFBFFFFFFF ?
                                                   9FFFFFFFFFFF83E4 ?
                                                   FFFFFFFFFFE7FBDF ?
                                                   9FFFFFFFFFFF7B88 ?
                                                   000000000 ?
                                                   6000000000474528 ?
opiodr()+3584        call     9fffffffffff81fc     6000000000040950 ?
                                                   000000000 ? 000000000 ?
                                                   C000000000002C60 ?
                                                   4000000001C09FE0 ?
                                                   00000C50B ?
                                                   9FFFFFFFFFFF81F0 ?
                                                   9FFFFFFFFFFF81D8 ?
ttcpip()+3776        call     _etext_f()+23058430  00000005E ? 000000014 ?
                              09114957288          9FFFFFFFFFFFA5F0 ?
                                                   6000000000040918 ?
                                                   C000000000001ABD ?
                                                   4000000001AB3BA0 ?
                                                   000000000 ? 00000C59B ?
opitsk()+1872        call     9fffffffffffa200     6000000000049FC0 ?
                                                   000000001 ?
                                                   9FFFFFFFFFFFA5F0 ?
                                                   000000001 ?
                                                   9FFFFFFFFFFFA740 ?
                                                   9FFFFFFFFFFFA564 ?
                                                   9FFFFFFFBF780058 ?
                                                   000000000 ?
opiino()+3184        call     000000000000057b     000000000 ? 000000000 ?
                                                   C00000000000132B ?
                                                   4000000001F78730 ?
                                                   000008001 ?
opiodr()+3584        call     0000000000000000     6000000000548A38 ?
                                                   4000000000B606F0 ?
                                                   6000000000548A38 ?
                                                   C000000000002C60 ?
                                                   4000000001C09FE0 ?
                                                   00000A201 ?
                                                   9FFFFFFFFFFFBC90 ?
                                                   4000000000B606F0 ?
opidrv()+976         call     _etext_f()+23058430  00000003C ? 000000004 ?
                              09114957288          9FFFFFFFFFFFEFB0 ?
                                                   6000000000040918 ?
sou2o()+80           call     _etext_f()+23058430  000000004 ? 000000004 ?
                              09114957288          9FFFFFFFFFFFEFB0 ?
main()+352           call     _etext_f()+23058430  9FFFFFFFFFFFEFD0 ?
                              09114957288          9FFFFFFFFFFFEFD4 ?
                                                   60000000004744F0 ?
                                                   9FFFFFFFFFFFEFB0 ?
main_opd_entry()+80  call     _etext_f()+23058430  000000000 ?
                              09114957288          9FFFFFFFFFFFF498 ?
                                                   C000000000000004 ?
                                                   C00000000002BE30 ?
--------------------- Binary Stack Dump ---------------------
Process global information:
     process: c000000d6428c0c0, call: c000000e46e772a8, xact: 0000000000000000,
     curses: c000000d6437d020, usrses: c000000d6437d020
  ----------------------------------------
  SO: c000000d6428c0c0, type: 2, owner: 0000000000000000, flag: INIT/-/-/0x00
  (process) Oracle pid=282, calls cur/top: c000000e46e772a8/c000000e46e772a8, flag: (0) -
            int error: 28, call error: 0, sess error: 0, txn error 0
  (post info) last post received: 0 0 0
              last post received-location: No post
              last process to post me: c000000d64234f18 1 6
              last post sent: 0 0 104
              last post sent-location: kglpsl: in loop
              last process posted by me: c000000d6428e900 23 0
    (latch info) wait_event=0 bits=0
    Process Group: DEFAULT, pseudo proc: c000000d62234ee0
    O/S info: user: oracle, term: UNKNOWN, ospid: 797
    OSD pid info: Unix process pid: 797, image: oracle@gzagent2 (TNS V1-V3)
    ----------------------------------------
    SO: c000000d6437d020, type: 4, owner: c000000d6428c0c0, flag: INIT/-/-/0x00
    (session) trans: 0000000000000000, creator: c000000d6428c0c0, flag: (41) USR/- BSY/-/-/-/KIL/-
              DID: 0000-0000-00000000, short-term DID: 0000-0000-00000000
              txn branch: 0000000000000000
              oct: 0, prv: 0, sql: c000000e76b03f10, psql: 0000000000000000, user: 31/CUSTOM
    O/S info: user: huangqianhai_lc, term: SVCTAG-D1MLV2X, ospid: 11124:11796, machine: WORKGROUP\SVCTAG-D1MLV2X
              program: plsqldev.exe
    application name: PL/SQL Developer, hash value=1190136663
    action name: 测试窗口 - 新建, hash value=3604520210
    last wait for 'null event' blocking sess=0x0 seq=142 wait_time=567341620
                =ea60, =0, =0
    temporary object counter: 0
      ----------------------------------------

通过这里可以看出来是因为pl/sql dev进行一个plsql的操作导致该错误发生,查询MOS[ID 403575.1]发现

Applies to:
Oracle Server - Enterprise Edition - Version: 9.2.0.7 and later   [Release: 9.2 and later ]
Information in this document applies to any platform.
***Checked for relevance on 27-Oct-2010***
Symptoms
The following errors appears in the alert log file :
Probe:read_pipe: receive failed, status 3
Probe:S:debug_loop: timeout. Action 1
*********START PLSQL RUNTIME DUMP************
***Got ORA-604 while running PLSQL***
***********END PLSQL RUNTIME DUMP************
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [kgavsd_3], [0], [], [], [], [], [], []
ORA-00604: error occurred at recursive SQL level 2
Current SQL statement for this session:
begin :id := sys.dbms_transaction.local_transaction_id; end;
.
Cause
The ora-600 kgavsd_3 appears when calling kgavsd_stacklet_done function which is related to PLSQL DEBUG.
From traces, dbms_debug package is being used during trace generation.
The return code of 3 further indicates that the dbms_pipe message was interrupted.
Probably user cancelled a plsql program, so the appeared while trying to dump the stack
Solution
There is no data corruption over here.
The error appears to be due to abnormal termination of aPL/SQL Developer application while executing a PL/SQL block.
Changing the PL/SQL and/or the procedure code could help in avoiding this error message.
Hence, this error can be safely ignored.

查找trace文件确实发现有name=SYS.DBMS_DEBUG,进一步表明该错误是由于plsql dev工具使用debug模式运行上面的plsql而引起该错误的发生,但是因为mos中记录和错误不是完全的一致,所以不能十分确定是该错误导致数据库down掉

继续分析ORA-30013

Error:      ORA-30013  (ORA-30013)
Text:      undo tablespace '%s' is currently in use
---------------------------------------------------------------------------
Cause:    the specified undo tablespace is currently used by another instance.
Action:    Wait for the undo tablespace to become available or change to another name and reissue the statement.

这个说明是没有疑问的:因为2节点配置的当前undo是UNDOTBS2,而UNDOTBS1是1节点使用的,证明这里的undo确实发生了错误,继续查询mos发现Bug 3368552

Hdr: 3368552 9.2.0.3 RDBMS 9.2.0.3 RAC PRODID-5 PORTID-23
Abstract: RAC:  ORA-30013 WHEN INSTANCE 2 ATTEMPTS TO ACCESS UNDO TABLESPACE OF INSTANCE 1
*** 01/12/04 06:21 am ***
TAR:
----
3554549.995
PROBLEM:
--------
The RAC database has been stable, but experienced an instance termination due
to ORA-30031 error in the alert log (instance 2):
...
Tue Dec 23 03:01:46 2003
ARC1: Evaluating archive  log 4 thread 2 sequence 1116
ARC1: Beginning to archive log 4 thread 2 sequence 1116
Creating archive destination LOG_ARCHIVE_DEST_1:
'/oracle/oradata/VLDB/logs/archives/VLDBN2/VLDB_0000001116_0002.arc'
ARC1: Completed archiving  log 4 thread 2 sequence 1116
Tue Dec 23 08:14:09 2003
Errors in file /oracle/admin/VLDB/bdump/vldbn2_pmon_22860.trc:
ORA-30013: undo tablespace 'UNDOTBS1' is currently in use
Tue Dec 23 08:14:09 2003
PMON: terminating instance due to error 30013
Tue Dec 23 08:14:10 2003
System state dump is made for local instance
Tue Dec 23 08:14:12 2003
Trace dumping is performing id=[cdmp_20031223081410]
Tue Dec 23 08:14:14 2003
Instance terminated by PMON, pid = 22860
<eof>
Instance 1 alert log shows only the reconfiguration and the cdump info:
..
Tue Dec 23 03:54:13 2003
Errors in file /oracle/admin/VLDB/udump/vldbn1_ora_13564.trc:
ORA-27037: unable to obtain file status
SVR4 Error: 2: No such file or directory
Additional information: 4
Tue Dec 23 08:14:10 2003
Trace dumping is performing id=[cdmp_20031223081410]
Tue Dec 23 08:14:12 2003
Reconfiguration started
List of nodes: 0,
Global Resource Directory frozen
one node partition
Communication channels reestablished
...

因为在9.2.0.3的RAC中有着该bug,那么我们可以大胆猜测在9.2.0.2中应该存在该bug,那么结合上面的ORA-00600[kgavsd_3]错误,我们大概还原该事故的全部:
1.节点1 dml操作了程序中报错的plsql中要范围的部分表对象,但是未提交(或者正在执行)
2.节点2 有用户使用pl/sql dev去执行程序中的plsql,因为是debug模式执行,需要UNDOTBS1的块来构建cr,从而使得节点2去访问UNDOTBS1,引发了Bug 3368552 从而使得数据库直接kill掉该plsql dev会话,进而出现ORA-00600[kgavsd_3]错误和pmon进程异常使得节点2 down掉

dbca创建数据库报ORA-00443

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:dbca创建数据库报ORA-00443

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

今天早上一个朋友和我说他们RAC dbca创建库不成功提示ORA-00443错误,让我帮他们分析下是什么原因导致
提示错误如图

环境状况

OS:LINUX REDHAT x86_64  5.7
kernel:2.6.18-194.el5
memory:100G
CPU:ntel(R) Xeon(R) CPU E7- 8837  @ 2.67GHz   *  64
ORACLE:10.2.0.4

查看alert日志错误

Wed Sep  5 01:32:33 2012
cluster interconnect IPC version:Oracle UDP/IP (generic)
IPC Vendor 1 proto 2
PMON started with pid=2, OS id=17859
DIAG started with pid=7, OS id=17861
PSP0 started with pid=12, OS id=17863
LMON started with pid=17, OS id=17865
LMD0 started with pid=22, OS id=17867
MMAN started with pid=27, OS id=17869
DBW0 started with pid=32, OS id=17871
Wed Sep  5 01:32:33 2012
Errors in file /u01/app/oracle/admin/dtjcdb/bdump/dtjcdb1_ora_17873.trc:
ORA-00600: internal error code, arguments: [ksbmoveme4], [], [], [], [], [], [], []
ORA-27300: OS system dependent operation:run on node failed with status: 2
ORA-27301: OS failure message: No such file or directory
ORA-27302: failure occurred at: skgpmoveme:1
Wed Sep  5 01:32:34 2012
Trace dumping is performing id=[cdmp_20120905013234]
Wed Sep  5 01:32:34 2012
Process DBW1 died, see its trace file
USER: terminating instance due to error 443
Instance terminated by USER, pid = 17857

trace文件中内容

*** 2012-09-05 01:32:33.996
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [ksbmoveme4], [], [], [], [], [], [], []
ORA-27300: OS system dependent operation:run on node failed with status: 2
ORA-27301: OS failure message: No such file or directory
ORA-27302: failure occurred at: skgpmoveme:1
Current SQL information unavailable - no session.
----- Call Stack Trace -----
calling              call     entry                argument values in hex
location             type     point                (? means dubious value)
-------------------- -------- -------------------- ----------------------------
ksedst()+31          call     ksedst1()            000000000 ? 000000001 ?
                                                   7FFFC4ABA4C0 ? 7FFFC4ABA520 ?
                                                   7FFFC4ABA460 ? 000000000 ?
ksedmp()+610         call     ksedst()             000000000 ? 000000001 ?
                                                   7FFFC4ABA4C0 ? 7FFFC4ABA520 ?
                                                   7FFFC4ABA460 ? 000000000 ?
ksfdmp()+21          call     ksedmp()             000000003 ? 000000001 ?
                                                   7FFFC4ABA4C0 ? 7FFFC4ABA520 ?
                                                   7FFFC4ABA460 ? 000000000 ?
kgerinv()+161        call     ksfdmp()             000000003 ? 000000001 ?
                                                   7FFFC4ABA4C0 ? 7FFFC4ABA520 ?
                                                   7FFFC4ABA460 ? 000000000 ?
kgesinv()+33         call     kgerinv()            0068966E0 ? 000000000 ?
                                                   7FFFC4ABA520 ? 7FFFC4ABA460 ?
                                                   000000000 ? 000000000 ?
ksesin()+211         call     kgesinv()            0068966E0 ? 000000000 ?
                                                   7FFFC4ABA520 ? 7FFFC4ABA460 ?
                                                   000000000 ? 000000000 ?
ksbmoveme()+350      call     ksesin()             00533D5C8 ? 000000000 ?
                                                   006896FA3 ? 000000001 ?
                                                   000000001 ? 000000000 ?
ksosp_set_current()  call     ksbmoveme()          000000001 ? 000000000 ?
+117                                               006896FA3 ? 000000001 ?
                                                   000000001 ? 000000000 ?
kso_init()+161       call     ksosp_set_current()  151056D10 ? 000000000 ?
                                                   006896FA3 ? 000000001 ?
                                                   000000001 ? 000000000 ?
opirip()+523         call     kso_init()           151056D10 ? 000000000 ?
                                                   006896FA3 ? 000000001 ?
                                                   000000001 ? 000000000 ?
opidrv()+582         call     opirip()             000000032 ? 000000004 ?
                                                   7FFFC4ABC128 ? 000000001 ?
                                                   000000001 ? 000000000 ?
sou2o()+114          call     opidrv()             000000032 ? 000000004 ?
                                                   7FFFC4ABC128 ? 000000001 ?
                                                   000000001 ? 000000000 ?
opimai_real()+317    call     sou2o()              7FFFC4ABC100 ? 000000032 ?
                                                   000000004 ? 7FFFC4ABC128 ?
                                                   000000001 ? 000000000 ?
main()+116           call     opimai_real()        000000003 ? 7FFFC4ABC190 ?
                                                   000000004 ? 7FFFC4ABC128 ?
                                                   000000001 ? 000000000 ?
__libc_start_main()  call     main()               000000003 ? 7FFFC4ABC190 ?
+244                                               000000004 ? 7FFFC4ABC128 ?
                                                   000000001 ? 000000000 ?
_start()+41          call     __libc_start_main()  000723088 ? 000000001 ?
                                                   7FFFC4ABC2E8 ? 000000000 ?
                                                   000000001 ? 000000003 ?
--------------------- Binary Stack Dump ---------------------

通过查询MOS发现[ID 422908.1]有类此的错误提示,但是该提示说是因为系统重新增加了过多CPU导致数据库crashed掉并且出现 ORA-27300 ORA-27301 ORA-27302 错误.在该案例中,起始就是64c,根据经验在win的10.2.0.4中如果cpu超过32c也是在dbcd创建数据库2%的地方hang住,所以怀疑该错误也是由于cpu太多导致.

处理方法

To solve the problem:
1) apply patch:6471079
- or -
2) apply the 10.2.0.5 (when available)
- or -
3) upgrade to 11g

朋友打上patch:6471079,dbca正常建库

双机mount数据库出现ORA-00600[kccsbck_first]

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:双机mount数据库出现ORA-00600[kccsbck_first]

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

一朋友的数据库在做数据库恢复的时候,数据库不能启动到mount状态,检查发现
alert日志错误如下

Mon Aug 27 10:00:18 2012
ALTER DATABASE   MOUNT
Mon Aug 27 10:00:23 2012
Errors in file /oracle/admin/wf2009/udump/orcl_ora_7042.trc:
ORA-00600: internal error code, arguments: [kccsbck_first], [1], [1208656276], [], [], [], [], []
Mon Aug 27 10:00:23 2012
ORA-600 signalled during: ALTER DATABASE   MOUNT...

查询mos发现解释

The ORA-600 [kccsbck_first] error occurs when Oracle detects that another instance
has this database already mounted. For some reason, Oracle already sees a thread
with a heartbeat. This could be the expected behaviour if running OPS. In such a
case the parallel_server parameter needs to be set. In cases where Parallel Server
is not linked in, this is not the expected behaviour.

在非集群环境中,当该数据库已经在一个节点启动,然后另外一个节点再尝试启动数据库可能出现该错误.
检查环境发现是使用roseha的双机环境,当关闭当前节点的数据库时候,另外一个节点认为oracle down掉了,然后自动在该节点去尝试启动数据库,而当前操作节点去尝试mount数据库的时候发生该错误,因为该数据库的另外一个节点已经mount了.出现这样的情况,和朋友的存储资源的配置也有不合理之处,在接管资源之前,应该先分析和处理存储的挂载情况,而不是不卸载这边,另外一遍直接挂载(也就是存储两边都挂载)

解决办法
这个问题的本质就是因为ha的两边都启动了数据库导致,关闭一边的roseha或者关闭主机就可以了
在做数据库恢复的时候,尽量排查掉其他因素的影响,比如:rac在一个节点上操作,ha关闭双机软件等

ORA-609 TNS-12537 and TNS-12547 in 11g Alert.log

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:ORA-609 TNS-12537 and TNS-12547 in 11g Alert.log

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

数据库alert日志出现如下错误

Fatal NI connect error 12537, connecting to:
 (LOCAL=NO)
  VERSION INFORMATION:
        TNS for IBM/AIX RISC System/6000: Version 11.2.0.2.0 - Production
        TCP/IP NT Protocol Adapter for IBM/AIX RISC System/6000: Version 11.2.0.2.0 - Production
        Oracle Bequeath NT Protocol Adapter for IBM/AIX RISC System/6000: Version 11.2.0.2.0 - Production
  Time: 21-AUG-2012 09:50:15
  Tracing not turned on.
  Tns error struct:
    ns main err code: 12537
TNS-12537: TNS:connection closed
    ns secondary err code: 12560
    nt main err code: 0
    nt secondary err code: 0
    nt OS err code: 0
opiodr aborting process unknown ospid (15204768) as a result of ORA-609

错误的原因

The ORA-609 error is thrown when a client connection of any kind failed to complete or
aborted the connection process before the connection/authentication process was complete.
Very often, this connection abort is due to a timeout.  Beginning with 10gR2, a default value for
inbound connect timeout has been set at 60 seconds.
This time limit is often inadequate for the entire connection process to complete.
We have also discovered that the ORA-609 occurs frequently in installations
where the database is monitored by DB Console and the Enterprise Manager agent (emagent).
After the DB Console is started and as a matter of routine, the emagent will repeatedly try to
connect to the target instances.
We can see frequent emagent connections in the listener.log without error.
However, on occasion it may have failed to complete the connection process at the database so an ORA-609 is thrown.
The emagent will simply retry the connection and may be successful on the subsequent try.
(Provided there is no real fault occurring at the listener or database).
This temporary failure to connect will not be reported back to DB Console and there will be no indication,
 except for the ORA-609, that a fault occurred.

出现这个问题的主要原因是因为从10.2开始inbound connect timeout默认为60 seconds,而在很多建立连接过程可能超过这个时间从而出现类此错误,常见的诱因是DB Console 和 Enterprise Manager agent (emagent). EM会重复的尝试连接到数据库。其过程中会偶尔的出现连接超时的问题,但是接下来会继续尝试,并获得成功。这种临时的失败不会导致EM报错而只会以ora-609的形式记录在alert log中.

处理方法

For that reason, we often recommend increasing the values for INBOUND_CONNECT_TIMEOUT at both listener
and server side sqlnet.ora file as a preventive measure.
If the problem  is due to connection timeouts, an increase in the following parameters should eliminate
or reduce the occurrence of the ORA-609s.
e.g.
Sqlnet.ora: SQLNET.INBOUND_CONNECT_TIMEOUT=180
Listener.ora: INBOUND_CONNECT_TIMEOUT_listener_name=120
These settings are in seconds.  Again, the default is 60.

问题跟踪方法

If the issue persists and inbound connect does not have any effect, the following steps are intended to
help locate  the client that may be causing the errors.
1)  Suppress the TNS errors in the alert.log by setting the following listener.ora file parameter:
DIAG_ADR_ENABLED_listener_name=OFF
This will cause the TNS errors to be posted to the ORACLE_HOME/network/log/sqlnet.log file that is
local to the database and
may yield useful information about the client's address.
For example, here's a snippet from a server side sqlnet.log where client address info was posted:
Production Time: 15-FEB-2010 07:15:01
Fatal NI connect error 12537, connecting to:
(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(Host=yourhost)(Port=1521))(CONNECT_DATA=(SID=PROD1DR)
(CID=(PROGRAM=sqlplus)(HOST=client_host)(USER=client))))
Observe the PROGRAM and HOST fields on the last line.  This is where the connection originated.
Be sure to match timestamps in the sqlnet.log with the timestamps of the alert.log errors.
Once you've located the offending client,
you can enable client tracing to try and determine the cause:
TRACE_LEVEL_CLIENT=16
TRACE_DIRECTORY_CLIENT=<dir location>
TRACE_TIMESTAMP_CLIENT=TRUE
DIAG_ADR_ENABLED=off   <<<<<11g or newer client requirement
If you need assistance with client or server tracing, please open an SR with Global Customer Support.
2)  Check the listener.log for client connections that were logged at timestamps that match the ORA-609
timestamps as they appear in the alert.log.
The client information is recorded in each listener.log entry.
Since this error occurs AFTER the listener has handled the connection,
do not expect to see errors in the listener.log.
Here's an example snippet of an incoming client connection that was posted to the listener.log:
20-JAN-2009 17:08:45 (CONNECT_DATA=(SID=orcl)(CID=
(PROGRAM=D:\oracle\product\10.1.0\Db_1\perl\5.6.1\bin\MSWin32-x86\perl.exe)(HOST=myclient)
Note that the exact timestamp, program name and client host will often be recorded.  Again,
once you've located the offending client,
enable tracing (see above) to try to capture the connection failure.
3)  Enable server side Oracle Net tracing and capture the TNS error along with the incoming connection.
Match the PID that accompanies the ORA-609 to the server trace label.  e.g.
ORA-609 : opiodr aborting process unknown ospid (4799_1)  *Note the PID
This PID would correspond to server trace labeled:  svr_4799.trc.  Check the server trace for either
TNS error (the 609 will not appear)
and try to locate the originating client address.  If assistance is needed for this investigation,
please open an SR with Oracle Support.
See below for instuctions on enabling Oracle Net server tracing.
The following details the discovery of the source of an ORA-609 for a real case:
The alert.log reports the following messages intermittently but frequently:
Mon Nov 16 22:39:22 2009
ORA-609 : opiodr aborting process unknown ospid (nnnn)
Enabled Oracle Net server tracing:
TRACE_LEVEL_SERVER=16
TRACE_DIRECTORY_SERVER=<dir location>
TRACE_TIMESTAMP_SERVER=TRUE
DIAG_ADR_ENABLED=off
Reloaded listener and wait for error to appear again.:
ORA-609 : opiodr aborting process unknown ospid (5233_1)
Note that the server trace file set that corresponded to this event was named svr_5233*.trc.
Of course the timestamps of the alert.log event and the server trace creation matched as well.
A review of the server trace showed only an EOF failure and the  TNS-12537 error:
Read unexpected EOF ERROR
nserror: nsres: id=0, op=68, ns=12537
In this particular case, there was no information about the client in the trace.
This is atypical for a server trace.
It may be that the client aborted before all the client information was posted to the file.
However, there was post in the listener.log f
or an emagent connection that was established at the same point in time.
Here's an excerpt from a listener.log entry where an emagent establishes a connection:
PROGRAM=D:\oracle\product\10.1.0\Db_1\bin\emagent.exe)
Checked the EM Agent traces and logs and discovered the following entry:
Fatal NI connect error 12547, connecting to:
(LOCAL=NO)
VERSION INFORMATION:
TNS for Solaris: Version 11.1.0.7.0 - Production
Oracle Bequeath NT Protocol Adapter for Solaris: Version 11.1.0.7.0 - Production
TCP/IP NT Protocol Adapter for Solaris: Version 11.1.0.7.0 - Production
Time: 16-NOV-2009 22:39:22
****Tracing to file: /backup/sid_traces/sqlnetlog/svr_5233.trc
Tns error struct:
ns main err code: 12547
TNS-12547: TNS:lost contact
ns secondary err code: 12560
nt main err code: 0
nt secondary err code: 0
nt OS err code: 0
****Note the name of the server trace which contains the PID:  svr_5233.trc
Also, the timestamp of the agent event matches the timestamp of the alert.log error.
Check the following locations for EM Agent traces. If working with support on this issue and
the EM Agent is suspected, upload ALL files under:
$ORACLE_HOME/sysman/log/emagent.trc < Single node agent trace location
$ORACLE_HOME/host/sysman/log/emagent.trc < RAC agent trace location
It was determined that in this case, the emagent was aborting the connection
before it was complete and then simply reconnecting
and succeeding on the subsequent try.  No errors were reported in the listener log or listener trace.
No errors were returned to the DB Console.
There was no apparent outage of any kind.  No action was taken to correct the ORA-609 in this case.
It was decided that the message was informational and completely benign.

参考文档:
ORA-609 TNS-12537 and TNS-12547 in 11g Alert.log (Doc ID 1116960.1)
Troubleshooting Guide ORA-609 : Opiodr aborting process unknown ospid (Doc ID 1121357.1)

创建控制文件遭遇ORA-00600[3753]故障解决

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:创建控制文件遭遇ORA-00600[3753]故障解决

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

一位网友的数据库正常关闭,然后控制文件意外丢失,需要通过trace中的信息重建控制文件,但是在重建的过程中,出现ORA-00600[3753]错误,远程帮忙处理,记录处理过程如下
1.启动数据库至nomount状态,然后尝试noresetlogs模式重建控制文件

SQL>@XFF_NORESETLOGS_CTL.sql
CREATE CONTROLFILE REUSE DATABASE "ORA10G" NORESETLOGS  ARCHIVELOG
*
第 1 行出现错误:
ORA-01503: CREATE CONTROLFILE 失败
ORA-00600: 内部错误代码, 参数: [3753], [3], [2], [], [], [], [], []

2.检查alert日志

Tue Aug 07 20:40:47 2012
WARNING: Default Temporary Tablespace not specified in CREATE DATABASE command
Default Temporary Tablespace will be necessary for a locally managed database in future release
Tue Aug 07 20:40:48 2012
Errors in file d:\oracle\product\10.2.0\db_1\admin\ora10g\udump\ora10g_ora_11596.trc:
ORA-00600: 内部错误代码, 参数: [3753], [3], [2], [], [], [], [], []
Tue Aug 07 20:40:53 2012
Errors in file d:\oracle\product\10.2.0\db_1\admin\ora10g\udump\ora10g_ora_11596.trc:
ORA-00600: 内部错误代码, 参数: [3753], [3], [2], [], [], [], [], []
ORA-1503 signalled during: CREATE CONTROLFILE REUSE DATABASE "ORA10G" NORESETLOGS  ARCHIVELOG

3.分析trace文件

Tue Aug 07 20:40:48 2012
ORACLE V10.2.0.1.0 - Production vsnsta=0
vsnsql=14 vsnxtr=3
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production
With the Partitioning, OLAP and Data Mining options
Windows NT Version V6.1 Service Pack 1
CPU                 : 2 - type 586, 2 Physical Cores
Process Affinity    : 0x00000000
Memory (Avail/Total): Ph:166M/1901M, Ph+PgF:619M/5536M, VA:812M/2047M
Instance name: ora10g
Redo thread mounted by this instance: 0 <none>
Oracle process number: 16
Windows thread id: 11596, image: ORACLE.EXE (SHAD)
*** SERVICE NAME:() 2012-08-07 20:40:48.413
*** SESSION ID:(158.7) 2012-08-07 20:40:48.413
*** 2012-08-07 20:40:48.413
ksedmp: internal or fatal error
ORA-00600: 内部错误代码, 参数: [3753], [3], [2], [], [], [], [], []
Current SQL statement for this session:
CREATE CONTROLFILE REUSE DATABASE "ORA10G" NORESETLOGS  ARCHIVELOG
…………
----- Call Stack Trace -----
calling              call     entry                argument values in hex
location             type     point                (? means dubious value)
-------------------- -------- -------------------- ----------------------------
_ksedst+38           CALLrel  _ksedst1+0           0 1
_ksedmp+898          CALLrel  _ksedst+0            0
_ksfdmp+14           CALLrel  _ksedmp+0            3
603A816A             CALLreg  00000000             87CF110 3
603A83FF             CALLrel  603A80D8             87CF110 8191090 EA9 2 8CCC438
_ksesic2+59          CALLrel  _kgesiv+0            87CF110 8191090 EA9 2 8CCC438
                                                   EA9 2 8CCC438
__VInfreq__kctbce+1  CALLrel  _ksesic2+0           EA9 0 3 0 0 2 0
63
_kcfccfl+356         CALLrel  _kctbce+0            543414C 81DB8A8
_cdbdrv+1037         CALLrel  _kcfccfl+0           543414C 1 8CCD060 8CCD04C
                                                   19000 3
_opiexe+11999        CALLrel  _cdbdrv+0            1
_opiosq0+6088        CALLrel  _opiexe+0            4 0 8CCD894
_kpooprx+232         CALLrel  _opiosq0+0           3 E 8CCD9AC A4
_kpoal8+775          CALLrel  _kpooprx+0           8CCF6CC 8196414 A16 1 0 A4
_opiodr+1099         CALLreg  00000000             5E 17 8CCF6C8
60FEFF8D             CALLreg  00000000             5E 17 8CCF6C8 0
_opitsk+1017         CALL???  00000000
_opiino+1087         CALLrel  _opitsk+0            0 0
_opiodr+1099         CALLreg  00000000             3C 4 8CCFC60
_opidrv+819          CALLrel  _opiodr+0            3C 4 8CCFC60 0
_sou2o+45            CALLrel  _opidrv+0            3C 4 8CCFC60
_opimai_real+112     CALLrel  _sou2o+0             8CCFC54 3C 4 8CCFC60
_opimai+92           CALLrel  _opimai_real+0       2 8CCFC8C
_OracleThreadStart@  CALLrel  _opimai+0
4+708
__pRawDllMain+10931  CALLptr  00000000
2903
__pRawDllMain+12925  CALLreg  00000000
4809
__pRawDllMain+12925  CALLrel  __pRawDllMain+12925
4761                          4772
--------------------- Binary Stack Dump ---------------------
    ----------------------------------------
    SO: 4FB3DF5C, type: 4, owner: 4FA4CBFC, flag: INIT/-/-/0x00
    (session) sid: 158 trans: 4EBB8954, creator: 4FA4CBFC, flag: (41) USR/- BSY/-/-/-/-/-
              DID: 0000-0010-0000000A, short-term DID: 0000-0000-00000000
              txn branch: 00000000
              oct: 0, prv: 0, sql: 00000000, psql: 4F707298, user: 0/SYS
    O/S info: user: superv06-PC\superv06, term: SUPERV06-PC, ospid: 7788:11636, machine: WORKGROUP\SUPERV06-PC
              program: sqlplus.exe
    application name: sqlplus.exe, hash value=0
    last wait for 'log file sequential read' blocking sess=0x00000000 seq=31
    wait_time=159 seconds since wait started=0
                log#=0, block#=1, blocks=1
    Dumping Session Wait History
     for 'log file sequential read' count=1 wait_time=159
                log#=0, block#=1, blocks=1
     for 'log file sequential read' count=1 wait_time=502
                log#=0, block#=1, blocks=1
     for 'log file sequential read' count=1 wait_time=163
                log#=0, block#=1, blocks=1
     for 'db file sequential read' count=1 wait_time=18840
                file#=ffffffff, block#=1, blocks=1
     for 'db file sequential read' count=1 wait_time=254
                file#=ffffffff, block#=1, blocks=1
     for 'db file sequential read' count=1 wait_time=7654
                file#=ffffffff, block#=1, blocks=1
     for 'db file sequential read' count=1 wait_time=150
                file#=ffffffff, block#=1, blocks=1
     for 'db file sequential read' count=1 wait_time=102
                file#=ffffffff, block#=1, blocks=1
     for 'db file sequential read' count=1 wait_time=123
                file#=ffffffff, block#=1, blocks=1
     for 'db file sequential read' count=1 wait_time=14010
                file#=ffffffff, block#=1, blocks=1

通过这里我们发现创建控制文件的进程在读取redo log的时候出现了等待比较多而且时间比较长,而对于ORA-00600[3753]错误互联网上没有任何更多的信息.通过对于创建控制文件时候因为使用noresetlogs的分析:这种模式下需要读取redo log,所以导致等待较多,从而出现ORA-00600[3753]错误使得创建控制文件失败.因为本库是shutdown immediate关闭,所以我们完全可以通过resetlogs模式来创建控制文件,从而避免读取redo log.

4.创建resetlogs控制文件

SQL>@XFF_RESETLOGS_CTL.sql
Control file created.

5.然后不完全恢复使用resetlogs open数据库

这次的处理我也没有什么经验可以借鉴,MOS和互联网上没有该错误的任何信息,解决这个问题关键凭的是自己对于noresetlogs和resetlogs的理解.对于数据库原理的理解,对解决一些陌生问题帮助很大;在学习ORACLE过程中注重对原理的理解和消化

重现ORA-600 4000异常

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:重现ORA-600 4000异常

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

对于数据库恢复感兴趣的人,一定对于ORA-600[4000]这个著名的错误记忆犹新,这里通过试验重现ORA-600[4000]
查询数据库obj$.con$记录

[oracle@xifenfei ~]$ sqlplus / as sysdba
SQL*Plus: Release 10.2.0.1.0 - Production on Fri Nov 4 06:32:36 2011
Copyright (c) 1982, 2005, Oracle.  All rights reserved.
Connected to an idle instance.
SQL> startup
ORACLE instance started.
Total System Global Area  318767104 bytes
Fixed Size                  1219160 bytes
Variable Size              92276136 bytes
Database Buffers          218103808 bytes
Redo Buffers                7168000 bytes
Database mounted.
Database opened.
SQL> select OBJ#,dbms_rowid.rowid_relative_fno(rowid) rel_fno,
  2  dbms_rowid.rowid_block_number(rowid) block_num from obj$
  3  where name='CON$';
      OBJ#    REL_FNO  BLOCK_NUM
---------- ---------- ----------
        28          1        122
SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.

bbed修改数据块
根据催华的研究,当OBJ$中部分数据块中的csc和itl的scn都大于header scn的时候,数据库启动就会报ORA-600[4000]

[oracle@xifenfei ~]$ bbed listfile=list mode=edit password=blockedit
BBED: Release 2.0.0.0.0 - Limited Production on Fri Nov 4 06:47:09 2011
Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
************* !!! For Oracle Internal Use only !!! ***************
BBED> info
 File#  Name                                                        Size(blks)
 -----  ----                                                        ----------
     1  /u01/oracle/oradata/XFF/system01.dbf                                 0
     2  /u01/oracle/oradata/XFF/undotbs01.dbf                                0
     3  /u01/oracle/oradata/XFF/sysaux01.dbf                                 0
     4  /u01/oracle/oradata/XFF/users01.dbf                                  0
     5  /u01/oracle/oradata/XFF/datfttuser.dbf                               0
BBED> set block 1
        BLOCK#          1
BBED> set file 1 block 1
        FILE#           1
        BLOCK#          1
BBED>  p kcvfhckp.kcvcpscn
struct kcvcpscn, 8 bytes                    @484
   ub4 kscnbas                              @484      0x00210f97
   ub2 kscnwrp                              @488      0x0000
BBED> set block 122
        BLOCK#          122
BBED> p ktbbh
struct ktbbh, 48 bytes                      @20
   ub1 ktbbhtyp                             @20       0x01 (KDDBTDATA)
   union ktbbhsid, 4 bytes                  @24
      ub4 ktbbhsg1                          @24       0x00000012
      ub4 ktbbhod1                          @24       0x00000012
   struct ktbbhcsc, 8 bytes                 @28         <==csc(SCN of the last block cleanout)
      ub4 kscnbas                           @28       0x0020770d
      ub2 kscnwrp                           @32       0x0000
   sb2 ktbbhict                             @36       1
   ub1 ktbbhflg                             @38       0x02 (NONE)
   ub1 ktbbhfsl                             @39       0x00
   ub4 ktbbhfnx                             @40       0x00000000
   struct ktbbhitl[0], 24 bytes             @44
      struct ktbitxid, 8 bytes              @44
         ub2 kxidusn                        @44       0x0005  <==回滚段序号
         ub2 kxidslt                        @46       0x0029
         ub4 kxidsqn                        @48       0x0000029a
      struct ktbituba, 8 bytes              @52
         ub4 kubadba                        @52       0x00802381
         ub2 kubaseq                        @56       0x01f9
         ub1 kubarec                        @58       0x03
      ub2 ktbitflg                          @60       0x2001 (KTBFUPB)
      union _ktbitun, 2 bytes               @62
         sb2 _ktbitfsc                      @62       0
         ub2 _ktbitwrp                      @62       0x0000
      ub4 ktbitbas                          @64       0x0020770e <==itl commit scn
BBED> set count 16
        COUNT           16
BBED> m /x 0d772010 offset 28
Warning: contents of previous BIFILE will be lost. Proceed? (Y/N) y
 File: /u01/oracle/oradata/XFF/system01.dbf (1)
 Block: 122              Offsets:   28 to  60           Dba:0x0040007a
------------------------------------------------------------------------
 0d772010 00000000 01000200 00000000
 <32 bytes per line>
BBED> m /x 0e772010 offset 64
 File: /u01/oracle/oradata/XFF/system01.dbf (1)
 Block: 122              Offsets:   64 to  96           Dba:0x0040007a
------------------------------------------------------------------------
 0e772010 00016c00 ffffea00 53046903
 <32 bytes per line>
BBED> p ktbbh
struct ktbbh, 48 bytes                      @20
   ub1 ktbbhtyp                             @20       0x01 (KDDBTDATA)
   union ktbbhsid, 4 bytes                  @24
      ub4 ktbbhsg1                          @24       0x00000012
      ub4 ktbbhod1                          @24       0x00000012
   struct ktbbhcsc, 8 bytes                 @28
      ub4 kscnbas                           @28       0x1020770d
      ub2 kscnwrp                           @32       0x0000
   sb2 ktbbhict                             @36       1
   ub1 ktbbhflg                             @38       0x02 (NONE)
   ub1 ktbbhfsl                             @39       0x00
   ub4 ktbbhfnx                             @40       0x00000000
   struct ktbbhitl[0], 24 bytes             @44
      struct ktbitxid, 8 bytes              @44
         ub2 kxidusn                        @44       0x0005
         ub2 kxidslt                        @46       0x0029
         ub4 kxidsqn                        @48       0x0000029a
      struct ktbituba, 8 bytes              @52
         ub4 kubadba                        @52       0x00802381
         ub2 kubaseq                        @56       0x01f9
         ub1 kubarec                        @58       0x03
      ub2 ktbitflg                          @60       0x2001 (KTBFUPB)
      union _ktbitun, 2 bytes               @62
         sb2 _ktbitfsc                      @62       0
         ub2 _ktbitwrp                      @62       0x0000
      ub4 ktbitbas                          @64       0x1020770e
BBED> sum apply
Check value for File 1, Block 122:
current = 0xc902, required = 0xc902

启动数据库

SQL> startup
ORACLE instance started.
Total System Global Area  318767104 bytes
Fixed Size                  1219160 bytes
Variable Size              92276136 bytes
Database Buffers          218103808 bytes
Redo Buffers                7168000 bytes
Database mounted.
ORA-01092: ORACLE instance terminated. Disconnection forced

查看日志

Fri Nov  4 06:50:38 2011
Database mounted in Exclusive Mode
Completed: ALTER DATABASE   MOUNT
Fri Nov  4 06:50:38 2011
ALTER DATABASE OPEN
Fri Nov  4 06:50:38 2011
LGWR: STARTING ARCH PROCESSES
ARC0 started with pid=16, OS id=7048
Fri Nov  4 06:50:38 2011
ARC0: Archival started
ARC1: Archival started
LGWR: STARTING ARCH PROCESSES COMPLETE
ARC1 started with pid=17, OS id=7050
ARC1: STARTING ARCH PROCESSES
Fri Nov  4 06:50:38 2011
ARC0: Becoming the 'no FAL' ARCH
ARC0: Becoming the 'no SRL' ARCH
Fri Nov  4 06:50:38 2011
Thread 1 opened at log sequence 38
  Current log# 3 seq# 38 mem# 0: /u01/oracle/oradata/XFF/redo03.log
Successful open of redo thread 1
Fri Nov  4 06:50:38 2011
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Fri Nov  4 06:50:38 2011
SMON: enabling cache recovery
Fri Nov  4 06:50:38 2011
ARC2: Archival started
ARC1: STARTING ARCH PROCESSES COMPLETE
ARC1: Becoming the heartbeat ARCH
ARC2 started with pid=18, OS id=7052
Fri Nov  4 06:50:38 2011
Errors in file /u01/oracle/admin/XFF/udump/xff_ora_7046.trc:
ORA-00600: internal error code, arguments: [4000], [5], [], [], [], [], [], []
Fri Nov  4 06:50:40 2011
Errors in file /u01/oracle/admin/XFF/udump/xff_ora_7046.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00600: internal error code, arguments: [4000], [5], [], [], [], [], [], []
Fri Nov  4 06:50:40 2011
Error 704 happened during db open, shutting down database
USER: terminating instance due to error 704
Instance terminated by USER, pid = 7046
ORA-1092 signalled during: ALTER DATABASE OPEN...

在ORACL 11.2的版本中,同样的方法无法重新该错误,数据库依然能够正常打开,所以如果要测试的朋友请选择11G以下版本进行.

TOAD导致ORA-00600[17281]

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:TOAD导致ORA-00600[17281]

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

alert日志报ORA-00600[17281]

Thu Aug  2 01:49:39 2012
Errors in file /oracle9/app/admin/acc/udump/acc2_ora_647350.trc:
ORA-00600: internal error code, arguments: [17281], [1001], [0x7000007AB9F0A30], [], [], [], [], []
ORA-01001: invalid cursor
Thu Aug  2 01:49:39 2012
Errors in file /oracle9/app/admin/acc/udump/acc2_ora_647350.trc:
ORA-00600: internal error code, arguments: [17281], [1001], [0x7000007AB9F0A30], [], [], [], [], []
ORA-01001: invalid cursor
ORA-00600: internal error code, arguments: [17281], [1001], [0x7000007AB9F0A30], [], [], [], [], []
ORA-01001: invalid cursor

分析trace文件

Oracle9i Enterprise Edition Release 9.2.0.8.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP and Oracle Data Mining options
JServer Release 9.2.0.8.0 - Production
ORACLE_HOME = /oracle9/app/product/9.2.0
System name:    AIX
Node name:      zwq_acc2
Release:        3
Version:        5
Machine:        00CFFB554C00
Instance name: acc2
Redo thread mounted by this instance: 2
Oracle process number: 1598
Unix process pid: 647350, image: oracle@zwq_acc2 (TNS V1-V3)
*** SESSION ID:(1643.29233) 2012-08-02 00:01:44.423
*** 2012-08-02 00:01:44.423
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [17281], [1001], [0x7000007AB9F0A30], [], [], [], [], []
ORA-01001: invalid cursor
Current SQL statement for this session:
DECLARE
   CURSOR NlsParamsCursor
   IS
      SELECT * FROM nls_session_parameters;
BEGIN
   SELECT NVL (LENGTHB (CHR (65536)), NVL (LENGTHB (CHR (65536)), 1))
     INTO :CharLength
     FROM DUAL;
   FOR NlsRecord IN NlsParamsCursor
   LOOP
      IF NlsRecord.parameter = 'NLS_DATE_LANGUAGE'
      THEN
         :NlsDateLanguage := NlsRecord.VALUE;
      ELSIF NlsRecord.parameter = 'NLS_DATE_FORMAT'
      THEN
         :NlsDateFormat := NlsRecord.VALUE;
      ELSIF NlsRecord.parameter = 'NLS_NUMERIC_CHARACTERS'
      THEN
         :NlsNumericCharacters := NlsRecord.VALUE;
      ELSIF NlsRecord.parameter = 'NLS_TIMESTAMP_FORMAT'
      THEN
         :NlsTimeStampFormat := NlsRecord.VALUE;
      ELSIF NlsRecord.parameter = 'NLS_TIMESTAMP_TZ_FORMAT'
      THEN
         :NlsTimeStampTZFormat := NlsRecord.VALUE;
      END IF;
   END LOOP;
END;
----- Call Stack Trace -----
calling              call     entry                argument values in hex
location             type     point                (? means dubious value)
-------------------- -------- -------------------- ----------------------------
ksedmp+0148          bl       ksedst               1029746FC ?
ksfdmp+0018          bl       01FD4014
kgeriv+0118          bl       _ptrgl
kgeasi+00cc          bl       kgeriv               000000000 ? 110347AE0 ?
                                                   1101A1738 ? 110380110 ?
                                                   110380420 ?
kgicli+0164          bl       kgeasi               110006308 ? 110386088 ?
                                                   438100004381 ? 200000002 ?
                                                   200000002 ? 000000000 ?
                                                   0000003E9 ? 000000002 ?
kgidlt+03a0          bl       kgicli               110287AA0 ? 000000168 ?
kgidel+0018          bl       kgidlt               11003DD08 ? 110062138 ?
                                                   000000069 ? 000000000 ?
                                                   000000000 ?
perabo+00cc          bl       kgidel               000000000 ? 000000000 ?
perdcs+0038          bl       perabo               000000120 ? 000000012 ?
                                                   70000000005C390 ?
peidcs+00ac          bl       perdcs               110002A20 ? 70000000007A450 ?
kkxcls+00bc          bl       peidcs               FFFFFFFFFFFA298 ? 110349708 ?
kxscln+0048          bl       kkxcls               1031628A8 ?
kkscls+0268          bl       kxscln               110349F60 ?
opicca+00f0          bl       kkscls               1009E8034 ? FFFFFFFFFFFA4C0 ?
opiclo+0020          bl       opicca               FFFFFFFFFFFA810 ?
kpoclsa+004c         bl       opiclo               000000002 ?
opiodr+08cc          bl       _ptrgl
ttcpip+0cc4          bl       _ptrgl
opitsk+0d60          bl       ttcpip               11000CF90 ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   000000000 ? 000000000 ?
opiino+0758          bl       opitsk               000000000 ? 000000000 ?
opiodr+08cc          bl       _ptrgl
opidrv+032c          bl       opiodr               3C00000018 ? 4101FAF78 ?
                                                   FFFFFFFFFFFF7A0 ? 0A000F000 ?
sou2o+0028           bl       opidrv               3C0C000000 ? 4A02C3B50 ?
                                                   FFFFFFFFFFFF7A0 ?
main+0138            bl       01FD3A28
__start+0098         bl       main                 000000000 ? 000000000 ?
--------------------- Binary Stack Dump ---------------------
SO: 70000067d802e90, type: 2, owner: 0, flag: INIT/-/-/0x00
  (process) Oracle pid=1598, calls cur/top: 7000006f50c45f0/7000006f50c45f0, flag: (0) -
            int error: 0, call error: 0, sess error: 0, txn error 0
  (post info) last post received: 0 0 0
              last post received-location: No post
              last process to post me: 70000067d5eedf8 2 6
              last post sent: 0 0 0
              last post sent-location: No post
              last process posted by me: none
    (latch info) wait_event=0 bits=0
    Process Group: DEFAULT, pseudo proc: 7000006815fe518
    O/S info: user: oraacc, term: UNKNOWN, ospid: 647350
    OSD pid info: Unix process pid: 647350, image: oracle@zwq_acc2 (TNS V1-V3)
    ----------------------------------------
    SO: 70000068169e7b8, type: 4, owner: 70000067d802e90, flag: INIT/-/-/0x00
    (session) trans: 0, creator: 70000067d802e90, flag: (41) USR/- BSY/-/-/-/-/-
              DID: 0000-0000-00000000, short-term DID: 0000-0000-00000000
              txn branch: 0
              oct: 0, prv: 0, sql: 700000980c32508, psql: 700000980c32508, user: 44/DBRPT
    O/S info: user: wusp9, term: AHMWUSP9, ospid: 2812:1900, machine: AH\AHMWUSP9
              program:
    last wait for 'SQL*Net message from client' blocking sess=0x0 seq=13 wait_time=3647540
                driver id=54435000, #bytes=1, =0
    temporary object counter: 0

这里可以看出当前执行的sql是一种框架中发生,但是从PROCESS STATE未找到登录客户端名称,但是从machine中大概可以估算出来是windows机器.通过查询V$sql找出MODULE

SQL> select module,last_load_time  from V$sql where sql_text like 'declare  cursor NlsParamsCursor is%';
MODULE          LAST_LOAD_TIME
--------------- ----------------------------------
Toad.exe        2012-08-02/00:01:44

从这里可以确定,是因为TOAD登录数据库出现该错误,查询MOS发现相关内容ORA-600 [17281] using TOAD [ID 329531.1]

解决方法

 You can also overcome this by patching the Oracle Client software as follows.:
1. Fix unpublished Bug 4359111, if the Oracle software is 10.1.0.4 or below.
   Patches for this bug can be found on My Oracle Support under Patch:4359111.
2. Fix unpublished bug 5910901 by applying Patch:5910901 if the Oracle software is 10.1.0.5 or above,
3. Apply the 10.2.0.4 patch set or above on the client.
4. You can change the non-Oracle client software being used,
5. You can just ignore the error as it is non-corruptive.
6. Run the queries directly through SQLPLUS instead of using TOAD.
**Ensure that your version of TOAD is compatible with the version of the database as well.
If the errors continue when using TOAD, but not with SQL*Plus, contact TOAD support to request a fix.**

ORA-00600[kgscLogOff-notempty]

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:ORA-00600[kgscLogOff-notempty]

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

在查看一网友alert日志时发现很多ORA-00600[kgscLogOff-notempty]记录,通过查询MOS分析记录结果如下:
alert日志出现ORA-00600[kgscLogOff-notempty]

Wed Aug  1 10:28:35 2012
Errors in file /vgdata/oracle/admin/orcl1/udump/orcl1_ora_18430.trc:
ORA-00600: internal error code, arguments: [kgscLogOff-notempty], [1], [], [], [], [], [], []
Wed Aug  1 10:28:38 2012
Errors in file /vgdata/oracle/admin/orcl1/udump/orcl1_ora_18508.trc:
ORA-00600: internal error code, arguments: [kgscLogOff-notempty], [3], [], [], [], [], [], []

trace结合MOS分析原因

Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - 64bit Production
With the Partitioning, OLAP and Data Mining options
ORACLE_HOME = /vgdata/oracle/product
System name:	HP-UX
Node name:	hpdc1
Release:	B.11.31
Version:	U
Machine:	ia64
Instance name: orcl1
Redo thread mounted by this instance: 1
Oracle process number: 230
Unix process pid: 18430, image: oracleorcl1@hpdc1
*** SERVICE NAME:(orcl1) 2012-08-01 10:28:35.937
*** SESSION ID:(191.141) 2012-08-01 10:28:35.937
*** 2012-08-01 10:28:35.937
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [kgscLogOff-notempty], [1], [], [], [], [], [], []
----- Call Stack Trace -----
calling              call     entry                argument values in hex
location             type     point                (? means dubious value)
-------------------- -------- -------------------- ----------------------------
ksedst()+64          call     _etext_f()+23058430  000000000 ? 000000001 ?
                              09017233648
ksedmp()+1680        call     _etext_f()+23058430  000000000 ?
                              09017233648          C000000000000D20 ?
                                                   40000000052C8C90 ?
                                                   000000000 ? 000000000 ?
                                                   000000000 ?
ksfdmp()+48          call     _etext_f()+23058430  000000003 ?
                              09017233648
kgerinv()+400        call     _etext_f()+23058430  40000000093F4370 ?
                              09017233648          000000003 ?
                                                   C000000000000612 ?
                                                   000008F07 ? 000000000 ?
                                                   000000000 ?
kgeasnmierr()+144    call     _etext_f()+23058430  6000000000015190 ?
                              09017233648          6000000000016248 ?
                                                   6000000000013770 ?
                                                   600000000009C808 ?
                                                   60000000000165B0 ?
$cold_kgscLogOff()+  call     _etext_f()+23058430  6000000000015190 ?
144                           09017233648          9FFFFFFFBF562160 ?
                                                   9FFFFFFFBF562170 ?
                                                   60000000000165C0 ?
                                                   000000000 ? 000000001 ?
kkslof()+320         call     _etext_f()+23058430  6000000000015190 ?
                              09017233648
opifcs()+592         call     _etext_f()+23058430  C000000360365908 ?
                              09017233648          C000000000000E21 ?
                                                   4000000002E1BF20 ?
                                                   000000000 ? 000000000 ?
ksuxds()+1568        call     _etext_f()+23058430  C000000360365908 ?
                              09017233648          4000000002EBA7A0 ?
                                                   000008F9F ? 000004810 ?
                                                   9FFFFFFFBF56BEF6 ?
                                                   C000000360365908 ?
                                                   4000000001354140 ?
                                                   00000003F ?
ksudel()+128         call     _etext_f()+23058430  60000000000A7B54 ?
                              09016643752          60000000000A7CF8 ?
                                                   9FFFFFFFFFFF6C20 ?
                                                   600000000009D8A0 ?
opilof()+3664        call     _etext_f()+23058430  60000000000A7B54 ?
                              09016643752          60000000000A7CF8 ?
                                                   4000000003D681E0 ?
                                                   C00000000000122A ?
                                                   00000814F ?
                                                   60000000000A7CE0 ?
opiodr()+2016        call     opilof()+3664        60000000002BE7B0 ?
                                                   60000000002BE7B8 ?
                                                   000000001 ?
                                                   9FFFFFFFFFFF71B0 ?
                                                   600000000009D8A0 ?
                                                   60000000002BE7E0 ?
ttcpip()+1824        call     _etext_f()+23058430  60000000000AA0BC ?
                              09017243504          000000001 ?
                                                   6000000000015310 ?
                                                   6000000000015310 ?
                                                   9FFFFFFFFFFF8250 ?
                                                   600000000009D8A0 ?
                                                   9FFFFFFFFFFF8250 ?
                                                   6000000000020D88 ?
opitsk()+2224        call     _etext_f()+23058430  6000000000020D80 ?
                              09017192128          9FFFFFFFFFFFA3D4 ?
                                                   6000000000020EE8 ?
                                                   000000001 ?
                                                   9FFFFFFFFFFFAA90 ?
                                                   9FFFFFFFFFFFA884 ?
                                                   6000000000020E88 ?
                                                   000000000 ?
opiino()+2144        call     _etext_f()+23058430  000000000 ? 000000000 ?
                              09016681384          600000000009D8A0 ?
                                                   4000000002AE7E10 ?
                                                   000008001 ?
                                                   9FFFFFFFFFFFA874 ?
opiodr()+2016        call     _etext_f()+23058430  00000003C ?
                              09016681384          9FFFFFFFFFFFF260 ?
                                                   9FFFFFFFFFFFF250 ?
                                                   9FFFFFFFFFFFBF90 ?
                                                   0000000B0 ?
                                                   600000000008F560 ?
opidrv()+1136        call     _etext_f()+23058430  60000000000AA0BC ?
                              09017243504          000000001 ?
                                                   6000000000015310 ?
                                                   6000000000015310 ?
                                                   9FFFFFFFFFFFCAE0 ?
                                                   600000000009D8A0 ?
sou2o()+240          call     9fffffffffffcae0     00000003C ? 000000004 ?
                                                   9FFFFFFFFFFFF250 ?
opimai_real()+480    call     9fffffffffffcae0     9FFFFFFFFFFFF270 ?
                                                   00000003C ? 000000004 ?
                                                   9FFFFFFFFFFFF250 ?
main()+352           call     9fffffffffffcae0     000000000 ?
                                                   9FFFFFFFFFFFF2A0 ?
main_opd_entry()+80  call     9fffffffffffcae0     000000002 ?
                                                   9FFFFFFFFFFFF750 ?
                                                   C000000000033910 ?
                                                   000000000 ?
--------------------- Binary Stack Dump ---------------------

通过查询MOS发现ORA-600 [kgscLogOff-notempty] On Session Logoff [ID 413120.1]中记录有

The following error may occur on session logoff
ORA-600 [kgscLogOff-notempty], [1]
or ORA-600 [kgscLogOff-notempty], [2]
with a Call Stack like:
... kgscLogOff  kkslof opifcs ksuxds ksudel ...

和该数据库中的Stack Dump记录一致,数据库版本也在该bug范围内,从而可以确定是unpublished Bug 4887675

解决方案
该错误是在logoff时产生,对数据库影响不大,如果偶尔出现可以忽略,出现频率过高,建议按照下列方法解决

There is No Data Corruption (or) Data Loss due to this Error.
This error occurs during the logoff operation, hence the impact should be negligible.
This is fixed in 10.2.0.4 and above releases.
One-off patches may be available depending on your current release and operating system.
To obtain a patch from MetaLink:
1) Click on Patches.
2) Click on Simple Search
3) Enter patch number: 4887675
4) Select your O/S
5) Click Go.
As workaround, avoid  Killing/Terminating the Session Abnormally.