ORA-600 12807 (CON$.CON# reached its maximum value): handling

Contact: phone/WeChat (+86 17813235971), QQ (107644445), QQ consultation with 惜分飞

Author: 惜分飞 © All rights reserved [No reproduction in any form without my consent; otherwise I reserve the right to pursue legal liability.]

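I caught COVID fairly badly this time and could not go on site for a customer's patching, so I stood by remotely from home. With spare time at the computer, I looked into and reproduced the ORA-600 12807 failure that a friend recently told me one of their customers had hit. A search of MOS makes it fairly clear that the error is raised because CON$.CON# has reached its theoretical maximum and cannot be incremented any further. Reference documents: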
Mechanism to Recycle Database Constraint Identifiers (Doc ID 2925056.1)
Bug 13781691 – ORA-600 [12807] if CON$.CON# very high due to bug 13784384 (Doc ID 13781691.8)
Bug 25343563 – Mechanism to Implement Constraint Identifier (con#) Recycling (Doc ID 25343563.8)
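In 12c and later, Oracle released patch 25343563, which is enabled by setting an event, to resolve the problem. For versions before 12c, Oracle provides no direct fix. The basic remedies are a logical migration of the data and avoiding frequent constraint creation, which consumes con$.con# values too quickly.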

Reproducing the error in a test

SQL> create table t_xff  (id number not null,name varchar2(100) not null);
create table t_xff  (id number not null,name varchar2(100) not null)
*
ERROR at line 1:
ORA-00600: internal error code, arguments: [12807], [], [], [], [], [], [], [],
[], [], [], []

SQL>  create table t_xff  (id number,name varchar2(100));

Table created.

SQL> alter table t_xff add primary key(id);
alter table t_xff add primary key(id)
*
ERROR at line 1:
ORA-00600: internal error code, arguments: [12807], [], [], [], [], [], [], [],
[], [], [], []


SQL> select con# from sys.con$ where name='_NEXT_CONSTRAINT';

      CON#
----------
4294967294

After some low-level analysis, patching a few of the underlying base tables allowed the database to create constraints again

SQL>  alter table t_xff add primary key(id);

Table altered.

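The temporary patching approach restores the business quickly; a migration can then be scheduled at a suitable time.
A reminder: on systems that frequently create intermediate or temporary objects (especially with many primary keys, NOT NULL constraints, and the like), check how far this value is from the ceiling. If it is getting close, schedule a logical migration and track down the cause (an Oracle bug or something triggered by the application).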
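
The headroom check suggested above is simple arithmetic on the value returned by the `_NEXT_CONSTRAINT` query. The following is a minimal sketch, not an Oracle utility: the function name `con_headroom` is hypothetical, and the ceiling is assumed to be the largest unsigned 32-bit value (4294967295), since the post only says the "theoretical maximum" was reached.

```python
# Assumption: the ceiling is the largest unsigned 32-bit value. The post shows
# _NEXT_CONSTRAINT at 4294967294 when ORA-600 [12807] appeared, but does not
# state the exact limit.
CON_MAX = 2**32 - 1  # 4294967295


def con_headroom(next_con, ceiling=CON_MAX):
    """Return (identifiers remaining, percentage of the range already used)."""
    remaining = ceiling - next_con
    used_pct = 100.0 * next_con / ceiling
    return remaining, used_pct


# Value taken from the query output shown in the post.
remaining, used_pct = con_headroom(4294967294)
print(remaining, round(used_pct, 2))
```

For the value in the post, only one identifier remains under the assumed ceiling. In practice the input would come from running the `sys.con$` query periodically and alerting well before the remaining count gets small.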

ORA-600 ksuloget2 recovery

A customer raised the SGA above 2G on a 32-bit Windows operating system, and the database reported ORA-600 ksuloget2 while running

Thread 1 cannot allocate new log, sequence 43586
Checkpoint not complete
  Current log# 1 seq# 43585 mem# 0: D:\ORACLE\ORADATA\ORCL\REDO01.LOG
Fri Aug 04 14:57:02 2023
Errors in file d:\oracle\diag\rdbms\orcl\orcl\trace\orcl_ora_42996.trc  (incident=67481):
ORA-00600: 内部错误代码, 参数: [ksuloget2], [0xFEBA6208], [0xFEBA3B08], [500], [0xFEBA622C], [], [], [], [], []
Thread 1 advanced to log sequence 43586 (LGWR switch)
  Current log# 2 seq# 43586 mem# 0: D:\ORACLE\ORADATA\ORCL\REDO02.LOG

After the database was restarted, the recovery attempt kept reporting ORA-600 ksuloget2

Thu Aug 17 17:38:27 2023
ALTER DATABASE RECOVER  database using backup controlfile  
Media Recovery Start
 started logmerger process
Parallel Media Recovery started with 24 slaves
ORA-279 signalled during: ALTER DATABASE RECOVER  database using backup controlfile  ...
Thu Aug 17 17:39:01 2023
ALTER DATABASE RECOVER LOGFILE 'D:\oracle\flash_recovery_area\orcl\ARCHIVELOG\2023_08_04\REDO03.LOG'  
Media Recovery Log D:\oracle\flash_recovery_area\orcl\ARCHIVELOG\2023_08_04\REDO03.LOG
Thu Aug 17 17:39:01 2023
Errors in file d:\oracle\diag\rdbms\orcl\orcl\trace\orcl_pr00_5528.trc  (incident=110724):
ORA-00600: internal error code, arguments: [ksuloget2], [0xFEBA6E38], [0xFEBA3B08], [500], [0xFEBA6E5C], [], [], [], []
ORA-00342: archived log does not have expected resetlogs SCN 685171428
ORA-00334: archived log: 'D:\ORACLE\FLASH_RECOVERY_AREA\ORCL\ARCHIVELOG\2023_08_04\REDO03.LOG'
Errors in file d:\oracle\diag\rdbms\orcl\orcl\trace\orcl_pr00_5528.trc:
ORA-00600: internal error code, arguments: [ksuloget2], [0xFEBA6E38], [0xFEBA3B08], [500], [0xFEBA6E5C], [], [], [], []
ORA-00342: archived log does not have expected resetlogs SCN 685171428
ORA-00334: archived log: 'D:\ORACLE\FLASH_RECOVERY_AREA\ORCL\ARCHIVELOG\2023_08_04\REDO03.LOG'
Errors in file d:\oracle\diag\rdbms\orcl\orcl\trace\orcl_ora_5604.trc  (incident=110709):
ORA-00600: internal error code, arguments: [ksuloget2], [0xFEBA6E38], [0xFEBA3B08], [500], [0xFEBA6E5C], [], [], [], []
ORA-00342: archived log does not have expected resetlogs SCN 685171428
ORA-00334: archived log: 'D:\ORACLE\FLASH_RECOVERY_AREA\ORCL\ARCHIVELOG\2023_08_04\REDO03.LOG'
Incident details in: d:\oracle\diag\rdbms\orcl\orcl\incident\incdir_110709\orcl_ora_5604_i110709.trc
ORA-600 signalled during:ALTER DATABASE RECOVER LOGFILE 'D:\oracle\flash_recovery_area\orcl\2023_08_04\REDO03.LOG'
ALTER DATABASE RECOVER CANCEL 
Errors in file d:\oracle\diag\rdbms\orcl\orcl\trace\orcl_pr00_5528.trc  (incident=110725):
ORA-00600: internal error code, arguments: [ksuloget2], [0xFEBA6E38], [0xFEBA3B08], [500], [0xFEBA6E5C], [], [], [], []
Errors in file d:\oracle\diag\rdbms\orcl\orcl\trace\orcl_pr00_5528.trc  (incident=110726):
ORA-00600: internal error code, arguments: [ksuloget2], [0xFEBA6E38], [0xFEBA3B08], [500], [0xFEBA6E5C], [], [], [], []
ORA-00600: internal error code, arguments: [ksuloget2], [0xFEBA6E38], [0xFEBA3B08], [500], [0xFEBA6E5C], [], [], [], []
Errors in file d:\oracle\diag\rdbms\orcl\orcl\trace\orcl_pr00_5528.trc:
ORA-00600: internal error code, arguments: [ksuloget2], [0xFEBA6E38], [0xFEBA3B08], [500], [0xFEBA6E5C], [], [], [], []
ORA-00600: internal error code, arguments: [ksuloget2], [0xFEBA6E38], [0xFEBA3B08], [500], [0xFEBA6E5C], [], [], [], []
Errors in file d:\oracle\diag\rdbms\orcl\orcl\trace\orcl_pr00_5528.trc  (incident=110727):
ORA-00600: internal error code, arguments: [ksuloget2], [0xFEBA6E38], [0xFEBA3B08], [500], [0xFEBA6E5C], [], [], [], []
ORA-00600: internal error code, arguments: [ksuloget2], [0xFEBA6E38], [0xFEBA3B08], [500], [0xFEBA6E5C], [], [], [], []
ORA-00600: internal error code, arguments: [ksuloget2], [0xFEBA6E38], [0xFEBA3B08], [500], [0xFEBA6E5C], [], [], [], []
Errors in file d:\oracle\diag\rdbms\orcl\orcl\trace\orcl_pr00_5528.trc  (incident=110728):
ORA-00600: internal error code, arguments: [ksuloget2], [0xFEBA6E38], [0xFEBA3B08], [500], [0xFEBA6E5C], [], [], [], []
ORA-00600: internal error code, arguments: [ksuloget2], [0xFEBA6E38], [0xFEBA3B08], [500], [0xFEBA6E5C], [], [], [], []
ORA-00600: internal error code, arguments: [ksuloget2], [0xFEBA6E38], [0xFEBA3B08], [500], [0xFEBA6E5C], [], [], [], []
ORA-00600: internal error code, arguments: [ksuloget2], [0xFEBA6E38], [0xFEBA3B08], [500], [0xFEBA6E5C], [], [], [], []
Errors in file d:\oracle\diag\rdbms\orcl\orcl\trace\orcl_ora_5604.trc  (incident=110710):
ORA-00600: internal error code, arguments: [ksuloget2], [0xFEBA6E38], [0xFEBA3B08], [500], [0xFEBA6E5C], [], [], [], []
ORA-00600: internal error code, arguments: [ksuloget2], [0xFEBA6E38], [0xFEBA3B08], [500], [0xFEBA6E5C], [], [], [], []
ORA-00600: internal error code, arguments: [ksuloget2], [0xFEBA6E38], [0xFEBA3B08], [500], [0xFEBA6E5C], [], [], [], []
ORA-00600: internal error code, arguments: [ksuloget2], [0xFEBA6E38], [0xFEBA3B08], [500], [0xFEBA6E5C], [], [], [
Incident details in: d:\oracle\diag\rdbms\orcl\orcl\incident\incdir_110710\orcl_ora_5604_i110710.trc

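Because the failure occurred while applying redo, I bypassed the redo consistency check and force-opened the database. The data checked out fine and the business could use the database directly. For this class of problem, Oracle's recommendation is in: ORA-600: [Ksuloget2] Hit on Windows When SGA Greater Than 1G (Doc ID 836109.1)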
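
The alert log excerpts above repeat the same ORA-00600 line many times, which makes it hard to see at a glance which internal error is involved. The following is a small sketch for collapsing such an excerpt into counts per first argument. The helper name `ora600_first_args` is hypothetical, and the regular expression only assumes the line shape visible in the logs above (`ORA-00600: ... [first_arg], ...`).

```python
import re
from collections import Counter

# Matches the first bracketed argument on an ORA-00600 line, regardless of
# the language of the message text that precedes it.
ORA600_RE = re.compile(r"ORA-00600: .*?\[([^\]]*)\]")


def ora600_first_args(log_text):
    """Count ORA-00600 lines in log_text, keyed by their first argument."""
    counts = Counter()
    for line in log_text.splitlines():
        match = ORA600_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Calling `ora600_first_args(open(path).read())` on an alert log (where `path` is whatever file you are inspecting) returns a `Counter`; `counts.most_common()` then lists the internal error codes by frequency.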

ORA-10485: Real-Time Query cannot be enabled while applying migration redo.

After a PSU and the JVM patch are applied to a database, Data Guard standby synchronization can fail with an error like ORA-10485: Real-Time Query cannot be enabled while applying migration redo.

Tue Aug 15 18:48:18 2023
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT  LOGFILE DISCONNECT FROM SESSION
Attempt to start background Managed Standby Recovery process (orcl)
Tue Aug 15 18:48:18 2023
MRP0 started with pid=33, OS id=15486
MRP0: Background Managed Standby Recovery process started (orcl)
 started logmerger process
Tue Aug 15 18:48:23 2023
Managed Standby Recovery starting Real Time Apply
Parallel Media Recovery started with 80 slaves
Waiting for all non-current ORLs to be archived...
All non-current ORLs have been archived.
Recovery of Online Redo Log: Thread 1 Group 10 Seq 106115 Reading mem 0
  Mem# 0: /u01/app/oracle/oradata/orcl/std_redo10.log
Completed: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT  LOGFILE DISCONNECT FROM SESSION
Tue Aug 15 18:49:50 2023
RFS[1]: Assigned to RFS process 16049
RFS[1]: Opened log for thread 1 sequence 106117 dbid 1490144467 branch 962363734
Tue Aug 15 18:49:50 2023
RFS[2]: Assigned to RFS process 16051
RFS[2]: Selected log 10 for thread 1 sequence 106115 dbid 1490144467 branch 962363734
Tue Aug 15 18:49:50 2023
RFS[3]: Assigned to RFS process 16053
RFS[3]: Opened log for thread 1 sequence 106116 dbid 1490144467 branch 962363734
Archived Log entry 106102 added for thread 1 sequence 106116 rlc 962363734 ID 0x58d223d3 dest 2:
RFS[3]: Opened log for thread 1 sequence 106118 dbid 1490144467 branch 962363734
RFS[2]: Opened log for thread 1 sequence 106119 dbid 1490144467 branch 962363734
Tue Aug 15 18:49:50 2023
Archived Log entry 106103 added for thread 1 sequence 106115 ID 0x58d223d3 dest 1:
Tue Aug 15 18:49:50 2023
Media Recovery Log /u01/app/oracle/fast_recovery_area/ORCLDG/archivelog/2023_08_15/o1_mf_1_106116_lfpp2ghc_.arc
Errors with log /u01/app/oracle/fast_recovery_area/ORCLDG/archivelog/2023_08_15/o1_mf_1_106116_lfpp2ghc_.arc
MRP0: Background Media Recovery terminated with error 10485
Errors in file /u01/app/oracle/diag/rdbms/orcldg/orcl/trace/orcl_pr00_15488.trc:
ORA-10485: Real-Time Query cannot be enabled while applying migration redo.
Managed Standby Recovery not using Real Time Apply
Recovery interrupted!
Tue Aug 15 18:49:50 2023
MRP0: Background Media Recovery process shutdown (orcl)

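This happens because redo generated while the database is in upgrade mode cannot be applied in real time by ADG. I usually work around it by temporarily applying the standby's logs with RMAN. The official solution: MRP process getting terminated with error ORA-10485 (Doc ID 1618485.1)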

11g CASE

---------

1> Stop DG broker (if used), i.e., on primary and standby.

SQL> alter system set dg_broker_start = false scope = both sid = '*' ;

2> Stop managed recovery in the standby, shutdown the standby and startup mount.
   Start managed recovery without real time apply.

SQL> alter database recover managed standby database disconnect ;

3> Wait until all the redo is applied to the standby and the standby is in sync.
   Do a couple of log switches on the primary, all instances if RAC, and let them apply to the standby.

4> Shutdown the standby and startup mount.
   Start managed recovery with real time apply.

SQL> alter database recover managed standby database using current logfile disconnect ;

5> Restart broker (if used) on primary and standby.

SQL> alter system set dg_broker_start = true scope = both sid = '*' ;


12c CASE

----------

In 12c and later, start Archived log apply using below command with ARCHIVED LOGFILE option:

SQL> alter database recover managed standby database using archived logfile disconnect;

Wait until all the redo is applied to the standby and the standby is in sync.
Do a couple of log switches on the primary, all instances if RAC, and let them apply to the standby.

To stop Redo Apply, issue the following SQL statement:

SQL> alter database recover managed standby database cancel;

To start managed recovery with real time apply, issue the following SQL statement:

SQL> alter database recover managed standby database disconnect;

ORA-600 kghstack_underflow_internal_2

On an 11.2.0.4 RAC running on AIX, one node crashed suddenly; the lms2 process reported ORA-600 kghstack_underflow_internal_2

Thu Aug 03 18:43:16 2023
Errors in file /u01/oracle/app/oracle/diag/rdbms/xff/xff2/trace/xff2_lms2_2884404.trc  (incident=761244):
ORA-00600: internal error code, arguments: [kghstack_underflow_internal_2], [0x11074D658], [], [], [], [], [], [], [], [], [], []
Incident details in: /u01/oracle/app/oracle/diag/rdbms/xff/xff2/incident/incdir_761244/xff2_lms2_2884404_i761244.trc
Errors in file /u01/oracle/app/oracle/diag/rdbms/xff/xff2/trace/xff2_lms2_2884404.trc  (incident=761245):
ORA-00600: internal error code, arguments: [kghstack_underflow_internal_2], [0x11AB5BBF0], [], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [kghstack_underflow_internal_2], [0x11074D658], [], [], [], [], [], [], [], [], [], []
Incident details in: /u01/oracle/app/oracle/diag/rdbms/xff/xff2/incident/incdir_761245/xff2_lms2_2884404_i761245.trc
Thu Aug 03 18:43:19 2023
Dumping diagnostic data in directory=[cdmp_20230803184319], requested by (instance=2, osid=2884404 (LMS2)), summary=[incident=761245].
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Thu Aug 03 18:43:23 2023
Sweep [inc][761245]: completed
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file /u01/oracle/app/oracle/diag/rdbms/xff/xff2/trace/xff2_lms2_2884404.trc:
ORA-00600: internal error code, arguments: [kghstack_underflow_internal_2], [0x11074D658], [], [], [], [], [], [], [], [], [], []
Sweep [inc][761244]: completed
Sweep [inc2][761245]: completed
Sweep [inc2][761244]: completed
Thu Aug 03 18:43:29 2023
Errors in file /u01/oracle/app/oracle/diag/rdbms/xff/xff2/trace/xff2_lms2_2884404.trc:
ORA-00600: internal error code, arguments: [kghstack_underflow_internal_2], [0x11074D658], [], [], [], [], [], [], [], [], [], []
LMS2 (ospid: 2884404): terminating the instance due to error 484

Analyzing the Call Stack Trace section of the trace file

----- Call Stack Trace -----
calling              call     entry                argument values in hex      
location             type     point                (? means dubious value)     
-------------------- -------- -------------------- ----------------------------
skdstdst()+40        bl       0000000109B3EE38     000000000 ? 000000001 ?
                                                   000000003 ? 000000000 ?
                                                   000000000 ? 000000001 ?
                                                   000000003 ? 000000000 ?
ksedst1()+112        call     skdstdst()           1777D9901C4FD34D ?
                                                   4840284100000000 ?
                                                   FFFFFFFFFFECE20 ?
                                                   2A501377F67A7 ? 10A742204 ?
                                                   000000000 ? 1107486C0 ?
                                                   2050033FFFECE28 ?
ksedst()+40          call     ksedst1()            FFFFFFFFFFFE0002 ?
                                                   0000060F1 ? 000000001 ?
                                                   10A46AD18 ? 000000000 ?
                                                   000000000 ? 000002004 ?
                                                   000000001 ?
dbkedDefDump()+1516  call     ksedst()             000000000 ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   000000000 ? 300000003 ?
ksedmp()+72          call     dbkedDefDump()       3107486C0 ? 110000A28 ?
                                                   FFFFFFFFFFED630 ? 1106ABC70 ?
                                                   100125778 ? FFFFFFFFFFED5B0 ?
                                                   FFFFFFFFFFEDA30 ? 1106ABC70 ?
ksfdmp()+100         call     ksedmp()             000000002 ? 000000000 ?
                                                   000000002 ? 10AF71A68 ?
                                                   10A0720F8 ? 000000000 ?
                                                   1108EC608 ? 1107486C0 ?
dbgexPhaseII()+1904  call     ksfdmp()             FFFFFFFFFFFE0002 ?
                                                   0000060F1 ? 000000002 ?
                                                   000000000 ? 000000002 ?
                                                   10A0720F0 ? 000000000 ?
                                                   001050005 ?
dbgexProcessError()  call     dbgexPhaseII()       1107486C0 ? 1108EFB28 ?
+1556                                              0000B9D9D ? 200000000 ?
                                                   FFFFFFFFFFEE548 ? 000000104 ?
                                                   FFFFFFFFFFEDBB0 ?
                                                   FB400000000 ?
dbgeExecuteForError  call     dbgexProcessError()  1107486C0 ? 1108EC608 ?
()+72                                              100000000 ? 000000000 ?
                                                   FFFFFFFFFFF29E0 ?
                                                   2840288000000012 ?
                                                   10013DA4C ? 1108EE350 ?
dbgePostErrorKGE()+  call     dbgeExecuteForError  000000002 ? 000000128 ?
2044                          ()                   FFFFFFFFFFFE0002 ?
                                                   215265335E5162 ?
                                                   3726000000000001 ?
                                                   10A46AD18 ? 10A46CB00 ?
                                                   FFFFFFFFFFF1D30 ?
dbkePostKGE_kgsf()+  call     dbgePostErrorKGE()   000000001 ? 10A46AD18 ?
68                                                 25800000000 ? 109E7A740 ?
                                                   000000000 ? 000000038 ?
                                                   FFFFFFFFFFF2800 ? 11AB1AC50 ?
kgeadse()+380        call     dbkePostKGE_kgsf()   900000000512C74 ?
                                                   9001000A008DAD0 ? 000000000 ?
                                                   9001000A008DAD0 ?
                                                   8000000FFFF2C40 ?
                                                   7000147E8F28C98 ? 400000008 ?
                                                   1100054A0 ?
kgerinv_internal()+  call     kgeadse()            7FFFFFFFFFFFFFFF ?
48                                                 FFFFFFFFFFFEF8FF ?
                                                   000000019 ? 110476528 ?
                                                   000000001 ? 000000017 ?
                                                   00000000B ? 000000000 ?
kgerinv()+48         call     kgerinv_internal()   FFFFFFFFFFFEF8FF ?
                                                   FFFFFFFFFFFFFFFF ?
                                                   FFFFFFFFFFFFFFFF ?
                                                   7FFFFFFFFFFFFFFF ?
                                                   1001648E0 ? FFFFFFFFFFF25E0 ?
                                                   1106ABC70 ? 11073B3C0 ?
kgeasnmierr()+72     call     kgerinv()            000000000 ? 215265335E5162 ?
                                                   372600383A0F5000 ?
                                                   000000004 ? 10A328F7C ?
                                                   FFFFFFFFFFF2898 ? 000000002 ?
                                                   0FFFFFFFF ?
kghstack_underflow_  call     kgeasnmierr()        11AB967A0 ? 000000000 ?
internal()+280                                     FFFFFFFFFFF2860 ? 100000001 ?
                                                   000000002 ? 11AB5BBF0 ?
                                                   000000000 ? 11AB96778 ?
kghstack_free()+716  call     kghstack_underflow_  10A328F7C ? 110A2FEC0 ?
                              internal()           000000004 ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   000000080 ? 80000000000000 ?
ktudda()+912         call     kghstack_free()      11AB5BBF0 ? 7215265335E5162 ?
                                                   3726000000000008 ?
                                                   000000102 ? 109E747E0 ?
                                                   FFFFFFFFFFF2A90 ? 000000048 ?
                                                   28408880FFFFFFFF ?
kcbtdu()+1636        call     ktudda()             70001383A0F4014 ? 000000000 ?
                                                   1FE800000000 ? 07F7F7F7F ?
                                                   FFFFFFFF80808080 ?
                                                   000000000 ? 000000030 ?
                                                   FFFFFFFFFFF2B30 ?
kcbzdh()+3200        call     kcbtdu()             35900000359 ? 100000001 ?
                                                   000000001 ? 200000001 ?
                                                   000000001 ? 00000005D ?
                                                   200066665D20 ? 000000000 ?
kcbzpnd()+504        call     kcbzdh()             70001383F6D64B8 ? 000002004 ?
                                                   2107486C0 ? 10A74269E ?
                                                   1107486C0 ? FFFFFFFFFFF3B30 ?
                                                   FFFFFFFFFFF38E0 ? 000000000 ?
kcbdnb()+724         call     kcbzpnd()            10A74267C ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   000000000 ? 0001CE860 ?
                                                   000000000 ? 000000000 ?
dbkedDefDump()+5528  call     kcbdnb()             200000000 ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   1100224D0 ? 000000018 ?
                                                   110001366 ? 000000000 ?
ksedmp()+72          call     dbkedDefDump()       3107486C0 ? 110000A28 ?
                                                   FFFFFFFFFFF3FC0 ? 1106ABC70 ?
                                                   100125778 ? 000000000 ?
                                                   FFFFFFFFFFF3FB0 ? 1106ABC70 ?
ksfdmp()+100         call     ksedmp()             000000002 ? 000000000 ?
                                                   000000002 ? 10AF71A68 ?
                                                   10A0720F8 ? 000000000 ?
                                                   1109DE650 ? 1107486C0 ?
dbgexPhaseII()+1904  call     ksfdmp()             11074B65C ? 000000001 ?
                                                   000000002 ? 000000000 ?
                                                   000000002 ? 10A0720F0 ?
                                                   000000000 ? 001050005 ?
dbgexProcessError()  call     dbgexPhaseII()       1107486C0 ? 1109DC860 ?
+1556                                              0000B9D9C ? 200000000 ?
                                                   FFFFFFFFFFF4ED8 ? 000000082 ?
                                                   FFFFFFFFFFF4560 ?
                                                   88A4422A00000000 ?
dbgeExecuteForError  call     dbgexProcessError()  1107486C0 ? 1109DE650 ?
()+72                                              100000000 ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   0DFFFFFFF ? 1109E0398 ?
dbgePostErrorKGE()+  call     dbgeExecuteForError  00000000A ? 000000000 ?
2044                          ()                   000000001 ? 000000001 ?
                                                   000000000 ? 000000000 ?
                                                   FFFFFFFFFFFB4E0 ? 000000000 ?
dbkePostKGE_kgsf()+  call     dbgePostErrorKGE()   000000000 ? FFFFFFFFFFF96B0 ?
68                                                 2580000000A ? 109E7A740 ?
                                                   000000000 ? 000000000 ?
                                                   FFFFFFFFFFF9190 ? 11AB1AC50 ?
kgeadse()+380        call     dbkePostKGE_kgsf()   000000001 ? 000000008 ?
                                                   000000000 ? 10A30EA38 ?
                                                   110000C20 ? 700014771160D68 ?
                                                   700014772ADB3A8 ? 000000001 ?
kgerinv_internal()+  call     kgeadse()            000000003 ? 000000000 ?
48                                                 11074B65C ? 000000001 ?
                                                   000000000 ? FFFFFFFFFFF96B0 ?
                                                   00000000A ? 000000001 ?
kgerinv()+48         call     kgerinv_internal()   000000000 ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   000000000 ? 000000000 ?
kgeasnmierr()+72     call     kgerinv()            000000000 ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   FFFFFFFFFFF92B0 ?
                                                   48102840FFFFA5B0 ?
                                                   11AB5BBB8 ? 11074D658 ?
kghstack_underflow_  call     kgeasnmierr()        022028200 ? 022202820 ?
internal()+280                                     11AB5BBB8 ? 100000001 ?
                                                   000000002 ? 11074D658 ?
                                                   0442C2394 ? 000002000 ?
kghstack_free()+716  call     kghstack_underflow_  FFFFFFFFFFF92B0 ?
                              internal()           FFFFFFFFFFF95B8 ?
                                                   FFFFFFFFFFF92B0 ? 000000001 ?
                                                   FFFFFFFFFFF92B0 ?
                                                   FFFFFFFFFFF95E8 ?
                                                   FFFFFFFFFFF95B8 ? 11074B650 ?
ktundo()+924         call     kghstack_free()      0DEADBEEF ? 11074D668 ?
                                                   11074B654 ? 300000000 ?
                                                   1FFFFB4E0 ? FFFFFFFFFFFB4E0 ?
                                                   FFFFFFFFFFF94C0 ?
                                                   FFFFFFFFFFF9470 ?
kturCRBackoutOneChg  call     ktundo()             19FFFFB5E0 ?
()+848                                             494CEDB3FFFF9E50 ?
                                                   FFFFFFFFFFF9E48 ? 000000000 ?
                                                   000000000 ? FFFFFFFFFFFA5B0 ?
                                                   100000000 ? FFFFFFFFFFFB4E0 ?
ktrgcm()+5816        call     kturCRBackoutOneChg  FFFFFFFFFFFA5B0 ?
                              ()                   19FFFFA440 ?
                                                   FFFFFFFFFFFA5B8 ? 000000000 ?
                                                   1FFFFA478 ? FFFFFFFFFFFB4E0 ?
                                                   000000000 ? 000000000 ?
ktrget3()+832        call     ktrgcm()             FFFFFFFFFFFAC80 ? 000000000 ?
                                                   000000000 ? 000000003 ?
                                                   058F7501F ? 000000001 ?
                                                   000000004 ? 000000003 ?
ktrget2()+104        call     ktrget3()            000000002 ? 700000000014488 ?
                                                   7000147E9C41A50 ? 000000022 ?
                                                   110A123A0 ? 000000000 ?
                                                   FFFFFFFFFFFB080 ? 110A123B8 ?
kclgeneratecr()+654  call     ktrget2()            FFFFFFFFFFFB4D0 ? 110AA1610 ?
0                                                  14F11E4E00 ? 0F11E4E00 ?
                                                   357FED028 ? 000030000 ?
                                                   7000147E9C41A50 ?
                                                   700000000014488 ?
kclgcr()+812         call     kclgeneratecr()      11A209508 ? FFFFFFFFFFFBFC0 ?
                                                   FFFFFFFFFFFBC18 ? 000000000 ?
                                                   0FFFFBB10 ? 01A275AC8 ?
                                                   1761D7F302ED25AC ?
                                                   20000011A275AC8 ?
kclcrrf()+536        call     kclgcr()             FFFFFFFFFFFBC20 ?
                                                   FFFFFFFFFFFBD00 ? 101F5080C ?
                                                   000000000 ? 0000003E8 ?
                                                   000000028 ? 0000000C8 ?
                                                   FFFFFFFFFFFBF88 ?
kjblcrcbk()+896      call     kclcrrf()            000000001 ? 000000000 ?
                                                   7000147EB0F07B8 ?
                                                   7000147576C4471 ?
                                                   401472C30C7F0 ?
                                                   7000147576C4408 ?
                                                   7000147576C3190 ?
                                                   7000147576C7170 ?
kjblpcr()+304        call     kjblcrcbk()          FFFFFFFFFFFBDA8 ? 000000038 ?
                                                   7000147FABBDB48 ? 600000006 ?
                                                   000000016 ? 11A209468 ?
                                                   000000013 ? 0001C2153 ?
kjbmpbast()+1792     call     kjblpcr()            000000012 ? 000000168 ?
                                                   000000002 ? 70001109FDB8148 ?
                                                   357000000000357 ?
                                                   7000144F31F7750 ?
                                                   895000000000895 ? 000000000 ?
kjmxmpm()+760        call     kjbmpbast()          1000000000000 ? 80000001E ?
                                                   000000000 ? 11A2951C8 ?
                                                   C000000000 ? 000000000 ?
                                                   1000000000000 ? 000000000 ?
kjmpbmsg()+3508      call     kjmxmpm()            000000000 ? 11A3769E0 ?
                                                   FFFFFFFFFFFC380 ? 06DBFBAEF ?
                                                   101E13820 ? 11A3769E0 ?
                                                   7000147E339AE08 ?
                                                   FFFFFFFFFFFC210 ?
kjmsm()+13416        call     kjmpbmsg()           11A209448 ? 7000147E339AE08 ?
                                                   100000019 ? 100000000 ?
                                                   000000000 ? 000000000 ?
                                                   000000000 ? 7000000000168FD ?
ksbrdp()+2216        call     kjmsm()              7000000000168E0 ?
                                                   7000000000168FC ? 048244028 ?
                                                   000000E00 ? 1108B69F0 ?
                                                   100637768 ? 000000001 ?
                                                   700000007 ?
opirip()+1620        call     ksbrdp()             FFFFFFFFFFFFE22 ? 10AFA5FC8 ?
                                                   FFFFFFFFFFFDC10 ? 000000000 ?
                                                   000000001 ? 000000000 ?
                                                   01380038F ? 000000001 ?
opidrv()+608         call     opirip()             10AFA23B0 ? 410134118 ?
                                                   FFFFFFFFFFFED80 ?
                                                   2F7530312F ? 108A7E8C4 ?
                                                   1106ABC70 ?
                                                   652F70726F647563 ?
                                                   1106ABC70 ?
sou2o()+136          call     opidrv()             3208A885B0 ? 400000000 ?
                                                   FFFFFFFFFFFED80 ?
                                                   23001801CD0000 ? 000000010 ?
                                                   1106ABC70 ? 000000000 ?
                                                   000000000 ?
opimai_real()+188    call     sou2o()              FFFFFFFFFFFEDF0 ?
                                                   4424444B00000001 ?
                                                   9000000000D73CC ?
                                                   BADC0FFEE0DDF00D ?
                                                   000000003 ? 9001000A008DAD0 ?
                                                   A0000000A000000 ? 10B6A8F30 ?
ssthrdmain()+276     call     opimai_real()        9001000A0011A60 ?
                                                   FFFFFFFFFFFF148 ?
                                                   FFFFFFFFFFFEEF0 ? 10B6E9280 ?
                                                   90000000008582C ?
                                                   9001000A008DAD0 ?
                                                   FFFFFFFFFFFEED0 ?
                                                   9001000A008DAD0 ?
main()+204           call     ssthrdmain()         3F0003660 ? FFFFFFFFFFFF238 ?
                                                   FFFFFFFFFFFF2A0 ?
                                                   9FFFFFFF000D658 ?
                                                   9FFFFFFF00009A0 ? 000000000 ?
                                                   000000000 ? 9FFFFFFF000D658 ?
__start()+112        call     main()               000000000 ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   000000000 ? 000000000 ?
 

--------------------- Binary Stack Dump ---------------------

Searching MOS for matching information turned up: LMON or LMS Process Crashes Instance With ORA-600 [kghstack_underflow_internal_2] (Doc ID 2003278.1)

The LMON or LMS process crash the instance with an error like:

ORA-00600: internal error code, arguments: [kghstack_underflow_internal_2], [0x110A10838], [], [], [], [], [], [], [], [], [], []

ORA-1092 : opitsk aborting process
Instance terminated by LMS1, pid = 14024818
 

Review of the generated tracefiles reveals a call stack similar to:

... kghstack_underflow_internal kghstack_free kccgrd kjxgrf_rr_read kjxgrDD_rr_read kjxgrimember kjxggpoll kjfmact kjfdact kjfcln ksbrdp ...
  
- OR -
  
... kghstack_underflow_internal kghstack_free ktundo kturcrbackoutonechg ktrgcm ktrget3 ktrget2 kclgcr ...

This confirms Bug 18687067 - ORA-600 [KGHSTACK_UNDERFLOW_INTERNAL_2], closed as a duplicate of Bug 20675347 - ORA-07445 [KGHSTACK_OVERFLOW_INTERNAL()+644] (The bug is caused by an AIX compiler issue causing volatile variables in the Oracle kernel not to be handled properly.). The fix is to upgrade the database to 12.1 or later, or to apply patch 20675347.

ORA-07445 opiaba: more than 65535 bind variables crash the instance

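Author: 惜分飞 © All rights reserved [Reproduction in any form without the author's consent is prohibited; the author reserves the right to pursue legal liability.]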

The database unexpectedly reported ORA-07445 opiaba and ORA-00600 17147, and the instance crashed.

Wed Mar 15 09:48:06 2023
Exception [type: SIGSEGV, SI_KERNEL(general_protection)] [ADDR:0x0] [PC:0x187B953, opiaba()+639] [flags: 0x0, count: 1]
Errors in file /u01/app/oracle/diag/rdbms/xifenfei/xifenfei3/trace/xifenfei3_ora_169909.trc  (incident=446531):
ORA-07445: 出现异常错误: 核心转储 [opiaba()+639] [SIGSEGV] [ADDR:0x0] [PC:0x187B953] [SI_KERNEL(general_protection)] []
Incident details in: /u01/app/oracle/diag/rdbms/xifenfei/xifenfei3/incident/incdir_446531/xifenfei3_ora_169909_i446531.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Wed Mar 15 09:48:07 2023
Dumping diagnostic data in directory=[cdmp_20230315094807], requested by (instance=3, osid=169909), summary=[incident=446531].
Wed Mar 15 09:48:08 2023
Sweep [inc][446531]: completed
Sweep [inc2][446531]: completed
Wed Mar 15 09:48:33 2023
Errors in file /u01/app/oracle/diag/rdbms/xifenfei/xifenfei3/trace/xifenfei3_pmon_161557.trc  (incident=440035):
ORA-00600: internal error code, arguments: [17147], [0x4AFC25D0C8], [], [], [], [], [], [], [], [], [], []
Incident details in: /u01/app/oracle/diag/rdbms/xifenfei/xifenfei3/incident/incdir_440035/xifenfei3_pmon_161557_i440035.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Wed Mar 15 09:48:33 2023
Dumping diagnostic data in directory=[cdmp_20230315094833], requested by (instance=3, osid=161557 (PMON)), summary=[incident=440035].
Errors in file /u01/app/oracle/diag/rdbms/xifenfei/xifenfei3/trace/xifenfei3_pmon_161557.trc:
ORA-00600: internal error code, arguments: [17147], [0x4AFC25D0C8], [], [], [], [], [], [], [], [], [], []
PMON (ospid: 161557): terminating the instance due to error 472
Wed Mar 15 09:48:34 2023
opiodr aborting process unknown ospid (170089) as a result of ORA-1092

The corresponding trace file:

*** 2023-03-15 09:09:34.862
*** SESSION ID:(1858.63187) 2023-03-15 09:09:34.862
*** CLIENT ID:() 2023-03-15 09:09:34.862
*** SERVICE NAME:(xifenfei) 2023-03-15 09:09:34.862
*** MODULE NAME:(JDBC Thin Client) 2023-03-15 09:09:34.862
*** ACTION NAME:() 2023-03-15 09:09:34.862
 
Dump continued from file: /u01/app/oracle/diag/rdbms/xifenfei/xifenfei3/trace/xifenfei3_ora_116886.trc
ORA-07445: 出现异常错误: 核心转储 [opiaba()+639] [SIGSEGV] [ADDR:0x0] [PC:0x187B953] [SI_KERNEL(general_protection)] []

========= Dump for incident 326049 (ORA 7445 [opiaba()+639]) ========
----- Beginning of Customized Incident Dump(s) -----
Exception [type: SIGSEGV, SI_KERNEL(general_protection)] [ADDR:0x0] [PC:0x187B953, opiaba()+639] [flags: 0x0, count: 1]
Registers:
%rax: 0x00000046fa212588 %rbx: 0x00000046762ee798 %rcx: 0x0000000000003200
%rdx: 0x0000000000000000 %rdi: 0x0000000000007340 %rsi: 0x0000000000007340
%rsp: 0x00007fffffff41d0 %rbp: 0x00007fffffff4240  %r8: 0x00000044a8599f00
 %r9: 0x0000000000000099 %r10: 0x0000000000000b34 %r11: 0x00000046fa211720
%r12: 0x00000046fa5c7488 %r13: 0x0000000000000000 %r14: 0x0200000046fa2125
%r15: 0x0000000000000005 %rip: 0x000000000187b953 %efl: 0x0000000000010202
  opiaba()+625 (0x187b945) jmp 0x187b982
  opiaba()+627 (0x187b947) movzbl 0x2c(%rbx),%r15d
  opiaba()+632 (0x187b94c) mov %r14,-0x60(%rbp)
  opiaba()+636 (0x187b950) mov %rax,%r14
> opiaba()+639 (0x187b953) movswq 0xa(%r14),%rcx
  opiaba()+644 (0x187b958) cmp %ecx,%r15d
  opiaba()+647 (0x187b95b) jne 0x187b976
  opiaba()+649 (0x187b95d) mov 0x18(%rbx),%rdi
  opiaba()+653 (0x187b961) lea 0xc(%r14),%rsi

*** 2023-03-15 09:09:34.863
dbkedDefDump(): Starting a non-incident diagnostic dump (flags=0x3, level=3, mask=0x0)
----- Current SQL Statement for this session (sql_id=5cvg8vqsbr8j6) -----
BEGIN  
            UPDATE
             t_xifenfei 
             SET valid_param=:1 , comm_addr1 = :2 ,
                password=:3 , customer_type = :4 ,
                comm_no=:5 , date_grade_no = :6 ,
                date_grade_flag=:7 , curve_config_no = :8 ,
                curve_config_flag=:9 , baudrate = :10 ,
                meas_order=:11 , meter_no = :12 ,
                kind_no=:13 , tariff_count = :14 ,
                meter_digits = :15 , import_user=:16 ,
                readmeter_flag = :17 ,status_code=:18  
            WHERE meter_id = :19 
         ; 
…………
         ; 
            UPDATE
             t_xifenfei 
             SET valid_param=:94982 , comm_addr1 = :94983 ,
                password=:94984 , customer_type = :94985 ,
                comm_no=:94986 , date_grade_no = :94987 ,
                date_grade_flag=:94988 , curve_config_no = :94989 ,
                curve_config_flag=:94990 , baudrate = :94991 ,
                meas_order=:94992 , meter_no = :94993 ,
                kind_no=:94994 , tariff_count = :94995 ,
                meter_digits = :94996 , import_user=:94997 ,
                readmeter_flag = :94998 ,status_code=:94999  
            WHERE meter_id = :95000 
         ;END;

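The application wrote a single begin ... end block that repeats the same update statement many times, each copy with its own bind variables, as a way of submitting a batch. The bind variable count reached 95000, far beyond Oracle's documented limit of 65535, which triggers behavior similar to Bug 12578873 ORA-7445 [opiaba] when using more than 65535 bind variables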

The fix is to reduce the number of bind variables so that a single statement never uses more than 65535.
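
One way to stay under the limit is to split the batch so that each anonymous block carries at most 65535 bind variables. The sketch below is only an illustration in Python, not the application's actual code: `MAX_BINDS` and both function names are assumptions introduced here, and 19 is the number of binds per UPDATE in the statement shown above (95000 binds correspond to 5000 copies of it).

```python
MAX_BINDS = 65535  # limit quoted in this post


def max_statements_per_block(binds_per_statement, max_binds=MAX_BINDS):
    """Return how many copies of one statement fit into a single block."""
    if binds_per_statement <= 0:
        raise ValueError("binds_per_statement must be positive")
    return max_binds // binds_per_statement


def split_into_blocks(rows, binds_per_statement, max_binds=MAX_BINDS):
    """Yield slices of rows so that each slice needs at most max_binds binds."""
    per_block = max_statements_per_block(binds_per_statement, max_binds)
    if per_block == 0:
        raise ValueError("a single statement already exceeds the bind limit")
    for start in range(0, len(rows), per_block):
        yield rows[start:start + per_block]


if __name__ == "__main__":
    rows = list(range(5000))  # stand-in for the 5000 rows to update
    sizes = [len(block) for block in split_into_blocks(rows, 19)]
    print(max_statements_per_block(19), sizes)
```

With 19 binds per UPDATE, at most 3449 statements fit into one block, so the 5000 updates from this incident would need two blocks instead of one.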

ORA-00800: soft external error, arguments: [Set Priority Failed]

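Author: 惜分飞 © All rights reserved [Reproduction in any form without the author's consent is prohibited; the author reserves the right to pursue legal liability.]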

On a 19.14 two-node Linux RAC database, starting the database with sqlplus succeeded, but the alert log reported ORA-00800: soft external error, arguments: [Set Priority Failed].

2022-09-21T22:20:35.924251+08:00
Starting background process VKTM
2022-09-21T22:20:35.977936+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl1/trace/orcl1_vktm_22653.trc  (incident=880052):
ORA-00800: soft external error, arguments: [Set Priority Failed], [VKTM],
 [Check traces and OS configuration], [Check Oracle document and MOS notes], []
Incident details in: /u01/app/oracle/diag/rdbms/orcl/orcl1/incident/incdir_880052/orcl1_vktm_22653_i880052.trc
2022-09-21T22:20:35.980555+08:00
Error attempting to elevate VKTM's priority: no further priority changes will be attempted for this process
VKTM started with pid=6, OS id=22653

Starting background process LMHB
2022-09-21T22:20:36.467831+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl1/trace/orcl1_lms0_22703_22708.trc  (incident=920005):
ORA-00800: soft external error, arguments: [Set Priority Failed], [LMS0], 
[Check traces and OS configuration], [Check Oracle document and MOS notes], []
Incident details in: /u01/app/oracle/diag/rdbms/orcl/orcl1/incident/incdir_920005/orcl1_lms0_22703_22708_i920005.trc
2022-09-21T22:20:36.470535+08:00
Error attempting to elevate LMS0's priority: no further priority changes will be attempted for this process

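The message is fairly clear: raising the process priority failed. Checking with operating system commands confirms that the process priority was indeed not raised.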

[root@oradb01 ~]# ps -eo pid,class,pri,nice,time,args|grep vktm|grep -v grep 
 5656 TS   19   0 00:00:00 ora_vktm_orcl1
30838 RR   41   - 13:08:36 ora_vktm_+ASM1

After the database is restarted with srvctl, the priority is raised normally and the alert log no longer shows the warning.

[root@oradb01 ~]# ps -eo pid,class,pri,nice,time,args|grep vktm|grep -v grep 
 5716 RR   41   0 00:00:00 ora_vktm_orcl1
30838 RR   41   - 13:18:46 ora_vktm_+ASM1

This problem puzzled me for a long time. Today I happened to find the relevant MOS note: (DB50) Clusterware Fails to Start Because CSSD Cannot Get Real-Time Priority (Doc ID 2903663.1). The cause is bug 34286265 and bug 34318125 (Bug 34649727 Linux: ORA-800 / Set Priority / DB Performance Merge Patch for 19.17 - 34286265 34318125).

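In a RAC environment, avoid starting the database with sqlplus where possible and use srvctl instead, so that key background processes do not fail to get their priority raised.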
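
The manual `ps` check above can be turned into a small filter. The following Python sketch is an illustration under stated assumptions: it takes the text printed by `ps -eo pid,class,pri,nice,time,args`, and it treats VKTM and LMS processes as expected to run in the `RR` scheduling class, as the healthy output in this post shows. The function name and the list of process names are hypothetical.

```python
def non_realtime_processes(ps_output, names=("vktm", "lms")):
    """Return (pid, class, args) for matching background processes not in RR."""
    flagged = []
    for line in ps_output.splitlines():
        fields = line.split(None, 5)  # pid, class, pri, nice, time, args
        if len(fields) < 6 or not fields[0].isdigit():
            continue  # skip the header and malformed lines
        pid, sched_class, _pri, _nice, _time, args = fields
        if not args.startswith(("ora_", "asm_")):
            continue  # only look at Oracle background processes
        if any(name in args for name in names) and sched_class != "RR":
            flagged.append((int(pid), sched_class, args))
    return flagged


if __name__ == "__main__":
    sample = (
        " 5656 TS   19   0 00:00:00 ora_vktm_orcl1\n"
        "30838 RR   41   - 13:08:36 ora_vktm_+ASM1\n"
    )
    for pid, sched_class, args in non_realtime_processes(sample):
        print(pid, sched_class, args)
```

Fed the first `ps` listing from this post, it flags only pid 5656 (`ora_vktm_orcl1`, class `TS`); the listing taken after the srvctl restart yields nothing.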

Another cause of ORA-00257: archiver error

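Author: 惜分飞 © All rights reserved [Reproduction in any form without the author's consent is prohibited; the author reserves the right to pursue legal liability.]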

Today I ran into a somewhat unusual case worth sharing: the database reported ORA-00257 and could not archive.

SQL> conn system/xxxxxx
ERROR:
ORA-00257: archiver error. Connect internal only, until freed.

SQL> archive log list;
Database log mode              Archive Mode
Automatic archival             Enabled
Archive destination            /oracle/PRD/oraarch
Oldest online log sequence     11479
Current log sequence           11481

Errors in the alert log:

sapprddb1:oraprd 73> tail -f alert_prd.log
Master archival failure: 19502
Master archival failure: 19502
Master archival failure: 19502
Master archival failure: 19502
Wed Oct 12 09:27:41 2022
Master archival failure: 19502
Master archival failure: 19502
Master archival failure: 19502
Wed Oct 12 09:27:41 2022
Master archival failure: 19502

What the error means:

sapprddb1:oraprd 74>oerr ora 19502
19502, 00000, "write error on file \"%s\", block number %s (block size=%s)"
// *Cause:  write error on output file
// *Action: check the file

At first glance the error suggests that the archive destination has run out of space, but a check shows plenty of free space there.

sapprddb1:/oracle/PRD/sapdata1 # df -h
Filesystem                          Size  Used Avail Use% Mounted on
/dev/sda1                            99G   25G   70G  26% /
udev                                253G  240K  253G   1% /dev
tmpfs                               426G   72K  426G   1% /dev/shm
/dev/sda3                           388G   12G  357G   4% /backup
/dev/mapper/SAPVG-oraclelv          197G   22G  165G  12% /oracle
/dev/mapper/SAPVG-mirrlogA           20G  773M   18G   5% /oracle/PRD/mirrlogA
/dev/mapper/SAPVG-mirrlogBlv         20G  773M   18G   5% /oracle/PRD/mirrlogB
/dev/mapper/SAPVG-oraarchlv         180G  2.2G  169G   2% /oracle/PRD/oraarch    <---- plenty of space left
/dev/mapper/SAPVG-origlogAlv         20G  894M   18G   5% /oracle/PRD/origlogA
/dev/mapper/SAPVG-origlogBlv         20G  894M   18G   5% /oracle/PRD/origlogB
/dev/mapper/SAPVG-sapdata1lv        591G  561G     0 100% /oracle/PRD/sapdata1
/dev/mapper/SAPVG-sapdata2lv        1.4T  1.4T     0 100% /oracle/PRD/sapdata2
/dev/mapper/SAPVG-sapdata3lv        788G  748G     0 100% /oracle/PRD/sapdata3
/dev/mapper/SAPVG-sapdata4lv        788G  748G     0 100% /oracle/PRD/sapdata4
sapprd:/sapmnt/PRD                   30G   12G   17G  42% /sapmnt/PRD
sapprd:/usr/sap/trans               105G 1005M   99G   1% /usr/sap/trans
/dev/mapper/VGSAP2-lvsapdata5       1.5T  340G  1.1T  25% /oracle/PRD/sapdata5
/dev/mapper/VGSAP2-lvsapdata6       1.5T  706G  696G  51% /oracle/PRD/sapdata6

Try archiving manually:

SQL> alter system archive log current
  2  /
alter system archive log current
*
ERROR at line 1:
ORA-19502: write error on file "/oracle/PRD/sapdata1/cntrl/cntrlPRD.dbf", block
number 4837 (block size=16384)
ORA-27061: waiting for async I/Os failed
Linux-x86_64 Error: 28: No space left on device
Additional information: -1
Additional information: 442368

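This makes the cause clear: the partition that holds the control file is 100% full, so archiving cannot write its record into the control file, and the database reports ORA-00257. After one of the files under /oracle/PRD/sapdata1 was shrunk slightly, the database returned to normal.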
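
The lesson is to check the filesystem that holds the file named in ORA-19502, not only the archive destination. As a minimal sketch in Python, assuming `df -h` output in the column layout shown above, the helpers below map a file path to its mount point and usage. Both function names are hypothetical.

```python
def parse_df(df_output):
    """Parse `df -h` text into (mount_point, use_percent) pairs."""
    mounts = []
    for line in df_output.splitlines():
        fields = line.split()
        # Expect: filesystem, size, used, avail, use%, mount point
        if len(fields) < 6 or not fields[4].endswith("%"):
            continue
        percent = fields[4][:-1]
        if not percent.isdigit():
            continue  # header line ("Use%") or unexpected content
        mounts.append((fields[5], int(percent)))
    return mounts


def mount_for(path, mounts):
    """Return the entry whose mount point is the longest prefix of path."""
    best = None
    for mount_point, used in mounts:
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            if best is None or len(mount_point) > len(best[0]):
                best = (mount_point, used)
    return best


if __name__ == "__main__":
    sample = (
        "Filesystem                          Size  Used Avail Use% Mounted on\n"
        "/dev/mapper/SAPVG-oraclelv          197G   22G  165G  12% /oracle\n"
        "/dev/mapper/SAPVG-oraarchlv         180G  2.2G  169G   2% /oracle/PRD/oraarch\n"
        "/dev/mapper/SAPVG-sapdata1lv        591G  561G     0 100% /oracle/PRD/sapdata1\n"
    )
    mounts = parse_df(sample)
    print(mount_for("/oracle/PRD/sapdata1/cntrl/cntrlPRD.dbf", mounts))
```

For the control file path reported by ORA-19502 this resolves to `/oracle/PRD/sapdata1` at 100%, while the archive destination resolves to `/oracle/PRD/oraarch` at 2%.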

Handling an ORA-600 3417 failure

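Author: 惜分飞 © All rights reserved [Reproduction in any form without the author's consent is prohibited; the author reserves the right to pursue legal liability.]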

The database suddenly reported ORA-600 3417.

Mon Sep 26 06:42:51 2022
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl2\trace\orcl2_lgwr_7984.trc  (incident=176185):
ORA-00600: 内部错误代码, 参数: [3417], [3], [0], [1], [0], [3], [3], [], [], [], [], []
Incident details in: D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl2\incident\incdir_176185\orcl2_lgwr_7984_i176185.trc
Mon Sep 26 06:42:54 2022
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl2\trace\orcl2_lgwr_7984.trc:
ORA-00600: 内部错误代码, 参数: [3417], [3], [0], [1], [0], [3], [3], [], [], [], [], []
LGWR (ospid: 7984): terminating the instance due to error 470

After node 2 failed, node 1 also failed while performing cross-node instance recovery.

Mon Sep 26 06:44:26 2022
Instance recovery: looking for dead threads
Beginning instance recovery of 1 threads
 Submitted all GCS remote-cache requests
 Post SMON to start 1st pass IR
 Fix write in gcs resources
Reconfiguration complete
 parallel recovery started with 31 processes
Started redo scan
Completed redo scan
 read 887 KB redo, 348 data blocks need recovery
Started redo application at
 Thread 2: logseq 9907, block 1980
Recovery of Online Redo Log: Thread 2 Group 3 Seq 9907 Reading mem 0
  Mem# 0: +DATA/orcl/onlinelog/group_3.265.1078882689
  Mem# 1: +OCR/orcl/onlinelog/group_3.259.1078882689
Completed redo application of 0.32MB
Completed instance recovery at
 Thread 2: logseq 9907, block 3755, scn 231951271
 338 data blocks read, 348 data blocks written, 887 redo k-bytes read
Mon Sep 26 06:44:36 2022
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl1\trace\orcl1_smon_7972.trc  (incident=208205):
ORA-00600: 内部错误代码, 参数: [3417], [3], [0], [1], [0], [3], [3], [], [], [], [], []
Mon Sep 26 06:44:38 2022
Reconfiguration started (old inc 14, new inc 16)
List of instances:
 1 2 (myinst: 1) 
 Global Resource Directory frozen
 Communication channels reestablished
 Master broadcasted resource hash value bitmaps
 Non-local Process blocks cleaned out
Mon Sep 26 06:44:38 2022
 LMS 0: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
Mon Sep 26 06:44:38 2022
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Mon Sep 26 06:44:38 2022
 LMS 1: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
Mon Sep 26 06:44:38 2022
 LMS 2: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
 Set master node info 
 Submitted all remote-enqueue requests
 Dwn-cvts replayed, VALBLKs dubious
 All grantable enqueues granted
ORA-600 occurred during recovery, instance will be terminated
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl1\trace\orcl1_smon_7972.trc:
ORA-00600: 内部错误代码, 参数: [3417], [3], [0], [1], [0], [3], [3], [], [], [], [], []
System state dump requested by (instance=1,osid=7972 (SMON)),summary=[abnormal instance termination].
SMON (ospid: 7972): terminating the instance due to error 600
Mon Sep 26 06:44:43 2022
ORA-1092 : opitsk aborting process
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl1\trace\orcl1_diag_6956_20220926064442.trc:
ORA-00601: ??????
Mon Sep 26 06:44:46 2022
opiodr aborting process unknown ospid (6688) as a result of ORA-1092

Restarting the instance failed again with ORA-600 3417.

Completed: ALTER DATABASE   MOUNT
Mon Sep 26 08:08:34 2022
alter database open
This instance was first to open
Picked broadcast on commit scheme to generate SCNs
Mon Sep 26 08:08:36 2022
LGWR: STARTING ARCH PROCESSES
Mon Sep 26 08:08:36 2022
ARC0 started with pid=39, OS id=5004 
ARC0: Archival started
LGWR: STARTING ARCH PROCESSES COMPLETE
ARC0: STARTING ARCH PROCESSES
Mon Sep 26 08:08:37 2022
ARC1 started with pid=38, OS id=3568 
Mon Sep 26 08:08:37 2022
ARC2 started with pid=41, OS id=3308 
Mon Sep 26 08:08:37 2022
ARC3 started with pid=42, OS id=8180 
Mon Sep 26 08:08:37 2022
ARC4 started with pid=43, OS id=7768 
Mon Sep 26 08:08:37 2022
ARC5 started with pid=44, OS id=4628 
Mon Sep 26 08:08:37 2022
ARC6 started with pid=45, OS id=6920 
Mon Sep 26 08:08:37 2022
ARC7 started with pid=46, OS id=7960 
ARC1: Archival started
ARC2: Archival started
ARC3: Archival started
ARC4: Archival started
ARC5: Archival started
ARC6: Archival started
ARC1: Becoming the 'no FAL' ARCH
ARC1: Becoming the 'no SRL' ARCH
ARC2: Becoming the heartbeat ARCH
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl2\trace\orcl2_lgwr_7924.trc(incident=400186):
ORA-00600: ??????, ??: [3417], [3], [0], [1], [0], [3], [3], [], [], [], [], []
ARC7: Archival started
ARC0: STARTING ARCH PROCESSES COMPLETE
Mon Sep 26 08:08:39 2022
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl2\trace\orcl2_lgwr_7924.trc:
ORA-00600: ??????, ??: [3417], [3], [0], [1], [0], [3], [3], [], [], [], [], []
LGWR (ospid: 7924): terminating the instance due to error 470
Mon Sep 26 08:08:44 2022
ORA-1092 : opitsk aborting process

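The failure is fairly clear: after both nodes failed, one node was started and completed instance recovery, but it could not open. Tracing with event 10046 showed that the open executed no statements at all before the Windows service failed and the instance terminated.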

Further analysis confirmed a problem with the redo log groups:

SQL> select group#,sequence#,status,thread# from v$log;

    GROUP#  SEQUENCE# STATUS              THREAD#
---------- ---------- ---------------- ----------
         1      10837 CURRENT                   1
         2          0 UNUSED                    1
         3       9907 CURRENT                   2
         4          0 UNUSED                    2

Next, check whether the datafiles are healthy:

SQL> set pages 10000
SQL> set numw 16
SQL> SELECT status,
  2  checkpoint_change#,
  3  checkpoint_time,last_change#,
  4  count(*) ROW_NUM
  5  FROM v$datafile
  6  GROUP BY status, checkpoint_change#, checkpoint_time,last_change#
  7  ORDER BY status, checkpoint_change#, checkpoint_time;

STATUS  CHECKPOINT_CHANGE# CHECKPOINT_TIME         LAST_CHANGE#          ROW_NUM
------- ------------------ ------------------- ---------------- ----------------
ONLINE           231971314 2022-09-26 06:44:37        231971314               44
SYSTEM           231971314 2022-09-26 06:44:37        231971314                1

SQL> set numw 16
SQL> col CHECKPOINT_TIME for a40
SQL> set lines 150
SQL> set pages 1000
SQL> SELECT status,
  2  to_char(checkpoint_time,'yyyy-mm-dd hh24:mi:ss') checkpoint_time,FUZZY,checkpoint_change#,
  3  count(*) ROW_NUM
  4  FROM v$datafile_header
  5  GROUP BY status, checkpoint_change#, to_char(checkpoint_time,'yyyy-mm-dd hh24:mi:ss'),fuzzy
  6  ORDER BY status, checkpoint_change#, checkpoint_time;

STATUS  CHECKPOINT_TIME                          FUZ CHECKPOINT_CHANGE#          ROW_NUM
------- ---------------------------------------- --- ------------------ ----------------
ONLINE  2022-09-26 06:44:37                      NO           231971314               45

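Given the above (all 45 datafile headers share the same checkpoint SCN as the control file and none is fuzzy), this database can be opened directly with zero data loss.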
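
The reading of the two query results can be written down as a simple rule. The Python sketch below is only an illustration of that reasoning, not a complete recovery assessment: it assumes the rows have already been fetched from v$datafile and v$datafile_header, and the function name and tuple layouts are hypothetical.

```python
def datafiles_look_consistent(controlfile_rows, header_rows):
    """Check that control file and datafile headers agree on one checkpoint SCN.

    controlfile_rows: (status, checkpoint_change) tuples from v$datafile
    header_rows:      (status, fuzzy, checkpoint_change) tuples from v$datafile_header
    """
    controlfile_scns = {scn for _status, scn in controlfile_rows}
    header_scns = {scn for _status, _fuzzy, scn in header_rows}
    no_fuzzy = all(fuzzy.upper() == "NO" for _status, fuzzy, _scn in header_rows)
    # One single SCN everywhere and no fuzzy header means the files are in step.
    return len(controlfile_scns) == 1 and controlfile_scns == header_scns and no_fuzzy


if __name__ == "__main__":
    controlfile_rows = [("ONLINE", 231971314), ("SYSTEM", 231971314)]
    header_rows = [("ONLINE", "NO", 231971314)]
    print(datafiles_look_consistent(controlfile_rows, header_rows))
```

A mismatch in SCNs or any fuzzy header would make the function return False, which in practice means further analysis is needed before attempting an open.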

ORA-12518 ORA-27302: failure occurred at: ssthrddcr

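Author: 惜分飞 © All rights reserved [Reproduction in any form without the author's consent is prohibited; the author reserves the right to pursue legal liability.]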

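An 11.2.0.1 database runs on a Windows 2008 machine with the processes parameter set to 1000, but once the number of database connections passed about 170 it began reporting ORA-12518 TNS:listener could not hand off client connection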

The alert log shows ORA-27300, ORA-27301 and ORA-27302 errors from the background processes:

Fri May 06 11:57:14 2022
Process startup failed, error stack:
Errors in file c:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_psp0_3052.trc:
ORA-27300: 操作系统系统相关操作: CreateThread 失败, 状态为: 8
ORA-27301: 操作系统故障消息: 存储空间不足,无法处理此命令。
ORA-27302: 错误发生在: ssthrddcr
Process J002 died, see its trace file
kkjcre1p: unable to spawn jobq slave process 
Errors in file c:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_cjq0_3808.trc:

Analysis based on MOS confirmed that the problem is the value of the IRPStackSize parameter.

Reference: Ora-27300: OS System Dependent Operation:Createthread Failed With Status: 8 (Doc ID 1310453.1)

Dropping an object with a corrupted segment header

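Author: 惜分飞 © All rights reserved [Reproduction in any form without the author's consent is prohibited; the author reserves the right to pursue legal liability.]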

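In some extreme cases an object's segment header becomes corrupted. This experiment shows how to handle such an object. First, create a table and query its segment information.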

SQL> create tablespace t_xff datafile '/u01/app/oracle/oradata/orcl/t_xifenfei01.dbf' size 128M autoextend on;

Tablespace created.

SQL> create table t_xifenfei tablespace t_xff as select * from dba_objects;

Table created.

SQL> select header_file,header_block from dba_SEGMENTS where segment_name='T_XIFENFEI'  and owner='SYS';

HEADER_FILE HEADER_BLOCK
----------- ------------
          5          130
----segment header is file 5, block 130

SQL> select   EXTENT_ID , FILE_ID ,BLOCK_ID  from dba_extents where segment_name='T_XIFENFEI'  and owner='SYS';

 EXTENT_ID    FILE_ID   BLOCK_ID
---------- ---------- ----------
         0          5        128
         1          5        136
         2          5        144
         3          5        152
         4          5        160
         5          5        168
         6          5        176
         7          5        184
         8          5        192
         9          5        200
        10          5        208
        11          5        216
        12          5        224
        13          5        232
        14          5        240
        15          5        248
        16          5        256
        17          5        384
        18          5        512
        19          5        640
        20          5        768
        21          5        896
        22          5       1024
        23          5       1152
        24          5       1280

25 rows selected.

---25 extents in total

SQL> select * from dba_free_space where file_id=5;

TABLESPACE_NAME                   FILE_ID   BLOCK_ID      BYTES     BLOCKS
------------------------------ ---------- ---------- ---------- ----------
RELATIVE_FNO
------------
T_XFF                                   5       1408  122683392      14976
           5
--free blocks start at block 1408

Simulate segment header corruption (damage the block with dd)

SQL> shutdown immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> 
SQL> 
SQL> 
SQL> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
[oracle@xifenfei ~]$ 
[oracle@xifenfei ~]$ 
[oracle@xifenfei ~]$ echo www.xifenfei.com|dd of=/u01/app/oracle/oradata/orcl/t_xifenfei01.dbf bs=8192 conv=notrunc seek=130
0+1 records in
0+1 records out
17 bytes (17 B) copied, 5.4389e-05 s, 313 kB/s
[oracle@xifenfei ~]$ dbv file=/u01/app/oracle/oradata/orcl/t_xifenfei01.dbf

DBVERIFY: Release 11.2.0.4.0 - Production on Wed Apr 13 20:29:41 2022

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

DBVERIFY - Verification starting : FILE = /u01/app/oracle/oradata/orcl/t_xifenfei01.dbf
Page 130 is marked corrupt
Corrupt block relative dba: 0x01400082 (file 5, block 130)
Bad header found during dbv: 
Data in bad block:
 type: 119 format: 7 rdba: 0x65666978
 last change scn: 0x632e.6965666e seq: 0x6f flg: 0x6d
 spare1: 0x77 spare2: 0x2e spare3: 0x0
 consistency value in tail: 0xada72301
 check value in block header: 0xf30a
 computed block checksum: 0x5eb9



DBVERIFY - Verification complete

Total Pages Examined         : 16384
Total Pages Processed (Data) : 1234
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 0
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 154
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 14995
Total Pages Marked Corrupt   : 1
Total Pages Influx           : 0
Total Pages Encrypted        : 0
Highest block SCN            : 6466979 (0.6466979)
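
The addresses in the dbv report can be decoded by hand. As a minimal sketch, assuming the usual smallfile layout in which the top 10 bits of a relative dba hold the relative file number and the low 22 bits hold the block number, the Python helpers below (illustrative names, not Oracle APIs) reproduce the "file 5, block 130" reading and the byte offset that `dd` with `bs=8192 seek=130` writes to. The last helper, which assumes a little-endian platform, shows that the odd values dbv prints for the bad block are simply the overwritten text read as integers.

```python
import struct

FILE_SHIFT = 22                      # low 22 bits: block number
BLOCK_MASK = (1 << FILE_SHIFT) - 1   # 0x3FFFFF


def decode_rdba(rdba):
    """Split a 32-bit relative dba into (relative file#, block#)."""
    return rdba >> FILE_SHIFT, rdba & BLOCK_MASK


def byte_offset(block_no, block_size=8192):
    """Byte offset in the datafile where the given block starts."""
    return block_no * block_size


def as_ascii(value):
    """Show a 32-bit little-endian integer as the 4 ASCII bytes it was read from."""
    return struct.pack("<I", value).decode("ascii", errors="replace")


if __name__ == "__main__":
    print(decode_rdba(0x01400082))   # address from "Corrupt block relative dba"
    print(byte_offset(130))          # where dd seek=130 lands
    print(as_ascii(0x65666978), as_ascii(0x6965666E))  # "rdba" and SCN base of the bad block
```

Decoding `0x01400082` gives file 5, block 130, and block 130 starts 1064960 bytes into the file. The bad block's `rdba: 0x65666978` decodes to the characters `xife`, part of the string that `echo` wrote over the header.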

Querying the table now fails:

[oracle@xifenfei ~]$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.4.0 Production on Wed Apr 13 20:29:48 2022

Copyright (c) 1982, 2013, Oracle.  All rights reserved.

Connected to an idle instance.

SQL> startup
ORACLE instance started.

Total System Global Area  734892032 bytes
Fixed Size                  2256872 bytes
Variable Size             452984856 bytes
Database Buffers          276824064 bytes
Redo Buffers                2826240 bytes
Database mounted.
Database opened.
SQL> select count(1) from t_xifenfei;
select count(1) from t_xifenfei
                     *
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 5, block # 130)
ORA-01110: data file 5: '/u01/app/oracle/oradata/orcl/t_xifenfei01.dbf'

The extent query returns no rows:

SQL> select   EXTENT_ID   , FILE_ID   ,BLOCK_ID  from dba_extents where segment_name='T_XIFENFEI'  and owner='SYS';

no rows selected

SQL> select header_file,header_block from dba_SEGMENTS where segment_name='T_XIFENFEI'  and owner='SYS';

HEADER_FILE HEADER_BLOCK
----------- ------------
          5          130

SQL> select * from dba_free_space where file_id=5;

TABLESPACE_NAME                   FILE_ID   BLOCK_ID      BYTES     BLOCKS
------------------------------ ---------- ---------- ---------- ----------
RELATIVE_FNO
------------
T_XFF                                   5       1408  122683392      14976
           5

Trying to drop the table fails:

SQL> drop table t_xifenfei;
drop table t_xifenfei
           *
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 5, block # 130)
ORA-01110: data file 5: '/u01/app/oracle/oradata/orcl/t_xifenfei01.dbf'

Dropping it with purge succeeds:

SQL> drop table t_xifenfei purge;

Table dropped.

The table is dropped, but the disk space is not released. A query confirms that the segment has become a temporary segment.

SQL> select * from dba_free_space where file_id=5;

TABLESPACE_NAME                   FILE_ID   BLOCK_ID      BYTES     BLOCKS
------------------------------ ---------- ---------- ---------- ----------
RELATIVE_FNO
------------
T_XFF                                   5       1408  122683392      14976
           5


SQL> select segment_name,segment_type,owner from dba_segments where header_file=5;

SEGMENT_NAME
--------------------------------------------------------------------------------
SEGMENT_TYPE       OWNER
------------------ ------------------------------
5.130
TEMPORARY          SYS

Clean up the temporary segment to fully remove what the dropped object with the corrupted segment header left behind:

SQL> exec dbms_space_admin.segment_corrupt('T_XFF',5,130); 

PL/SQL procedure successfully completed.

SQL> exec dbms_space_admin.segment_drop_corrupt('T_XFF',5,130);

PL/SQL procedure successfully completed.

SQL> exec DBMS_SPACE_ADMIN.TABLESPACE_REBUILD_BITMAPS ('T_XFF');

PL/SQL procedure successfully completed.

SQL> select segment_name,segment_type,owner from dba_segments where header_file=5;

no rows selected

SQL> select * from dba_free_space where file_id=5;

TABLESPACE_NAME                   FILE_ID   BLOCK_ID      BYTES     BLOCKS
------------------------------ ---------- ---------- ---------- ----------
RELATIVE_FNO
------------
T_XFF                                   5        128  133169152      16256
           5
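
The before and after figures from dba_free_space can be cross-checked with simple arithmetic. The following Python sketch assumes the 8192-byte block size that was used with dd earlier in this test; the function names are made up for illustration.

```python
BLOCK_SIZE = 8192  # assumed block size, matching bs=8192 in the dd command


def blocks_to_bytes(blocks, block_size=BLOCK_SIZE):
    """Convert a block count from dba_free_space into bytes."""
    return blocks * block_size


def reclaimed_blocks(free_before, free_after):
    """Number of blocks returned to the tablespace by the cleanup."""
    return free_after - free_before


if __name__ == "__main__":
    before, after = 14976, 16256      # BLOCKS column before and after cleanup
    print(blocks_to_bytes(before), blocks_to_bytes(after))
    print(reclaimed_blocks(before, after), 1408 - 128)
```

The byte values match the BYTES column (122683392 and 133169152), and the 1280 reclaimed blocks equal the range from block 128 to block 1408 that the 25 extents of the dropped table used to occupy.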