ORA-65088: database open should be retried

Contact: mobile/WeChat (+86 17813235971), QQ (107644445)

Author: 惜分飞 © All rights reserved [No reproduction in any form without my consent; otherwise I reserve the right to pursue legal liability.]

In 12.2 and later CDB releases, if you re-create the control file and then open the database with resetlogs, you are likely to hit the error ORA-65088: database open should be retried.

SQL> startup nomount force pfile='/<path>/<filename>.ora';
ORACLE instance started.

Total System Global Area 1593835520 bytes
Fixed Size                  8793256 bytes
Variable Size             402654040 bytes
Database Buffers         1174405120 bytes
Redo Buffers                7983104 bytes
SQL> !vi ctl.sql

SQL> @ctl.sql

Control file created.

SQL> select count(*) ,fhsta from x$kcvfh group by fhsta;

  COUNT(*)      FHSTA
---------- ----------
        11      32768
         4      40960

SQL> select count(*) ,FHSCN from x$kcvfh group by FHSCN;

  COUNT(*) FHSCN
---------- --------------------
         3 1820866
         4 2281969
         4 2281978
         4 2281982

SQL> select file#,error from v$datafile_header where length(error)>=1;

no rows selected

SQL> select count(*) ,fhrba_seq from x$kcvfh group by fhrba_seq;

  COUNT(*)  FHRBA_SEQ
---------- ----------
         3         20
        12         32

SQL> recover database using backup controlfile until cancel;
ORA-00279: change 2281978 generated at 09/19/2018 00:52:00 needed for thread 1
ORA-00289: suggestion : /<path>/1_32_981800889.dbf
ORA-00280: change 2281978 for thread 1 is in sequence #32


Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
/<path>/1_32_981800889.dbf
ORA-00279: change 2282008 generated at 09/19/2018 00:52:13 needed for thread 1
ORA-00289: suggestion : /<path>/1_33_981800889.dbf
ORA-00280: change 2282008 for thread 1 is in sequence #33
ORA-00278: log file '/<path>/1_32_981800889.dbf' no longer needed for this recovery


Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
cancel
Media recovery cancelled.    << Expected message "Media recovery complete." !!
SQL> show pdbs

    CON_ID CON_NAME                       OPEN MODE  RESTRICTED
---------- ------------------------------ ---------- ----------
         2 PDB$SEED                       MOUNTED
         3 _###_UNKNOWN_PDB_#_3           MOUNTED
         4 _###_UNKNOWN_PDB_#_4           MOUNTED
SQL> alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-00603: ORACLE server session terminated by fatal error
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-65088: database open should be retried
Process ID: 32688
Session ID: 10 Serial number: 38416

The alert log shows errors similar to the following:

 Dictionary check beginning
 Pluggable Database <pdb_name_1> (#3) found in data dictionary,
 but not in the control file. Adding it to control file.
 Pluggable Database <pdb_name_2> (#4) found in data dictionary,
 but not in the control file. Adding it to control file.
 Tablespace '<tablespace_name>' #3 found in data dictionary,
 but not in the controlfile. Adding to controlfile.
 --
 File 8 not verified due to error ORA-01122
 File 9 not verified due to error ORA-01122
 File 11 not verified due to error ORA-01122
 File 16 not verified due to error ORA-01122
 File 17 not verified due to error ORA-01122
 File 18 not verified due to error ORA-01122
 File 19 not verified due to error ORA-01122
 File 20 not verified due to error ORA-01122
  --
 ORA-65088: database open should be retried
 2018-09-19T01:00:54.083814+05:30
 Errors in file /<path>/trace/<oracle_sid>_ora_12412.trc:
 ORA-65088: database open should be retried
 Error 65088 happened during db open, shutting down database
 Errors in file /<path>/trace/<oracle_sid>_ora_12412.trc  (incident=12289) (PDBNAME=CDB$ROOT):
 ORA-00603: ORACLE server session terminated by fatal error
 ORA-01092: ORACLE instance terminated. Disconnection forced
 ORA-65088: database open should be retried

The cause of this failure is as follows:
The newly created controlfile is not aware of the PDBs, and the open resetlogs process tries to add that information to it. As a result, the recovery run against the newly created controlfile did not apply the archived logs to the datafiles that belong to the PDBs; those files will ask for recovery later, once the controlfile is aware of them. During open resetlogs, Oracle pushes the required information into the controlfile and then shuts the database down with the suggestion to retry opening it.
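The alert log lines that precede ORA-65088 can be scanned to confirm this diagnosis. The following is a minimal Python sketch, assuming the message wording matches the alert log excerpt in this post (other releases may word it differently). The function name `summarize_open_resetlogs`, the returned keys, and the sample text are hypothetical and not part of any Oracle tool.

```python
import re

# Patterns copied from the alert log excerpt in this post (assumption: the
# wording is the same in the log being analysed).
PDB_ADDED = re.compile(r"Pluggable Database (\S+) \(#(\d+)\) found in data dictionary")
FILE_SKIPPED = re.compile(r"File (\d+) not verified due to error (ORA-\d+)")


def summarize_open_resetlogs(alert_text):
    """Collect the evidence that the control file was missing PDB metadata."""
    pdbs = [(name, int(con_id)) for name, con_id in PDB_ADDED.findall(alert_text)]
    skipped = [int(file_no) for file_no, _err in FILE_SKIPPED.findall(alert_text)]
    return {
        "pdbs_added": pdbs,                 # PDBs pushed into the control file
        "files_not_verified": skipped,      # datafiles that still need recovery
        "retry_requested": "ORA-65088" in alert_text,
    }


if __name__ == "__main__":
    # Illustrative sample only, shaped like the excerpt in this post.
    sample = (
        " Pluggable Database PDB1 (#3) found in data dictionary,\n"
        " but not in the control file. Adding it to control file.\n"
        " File 8 not verified due to error ORA-01122\n"
        " ORA-65088: database open should be retried\n"
    )
    print(summarize_open_resetlogs(sample))
```

If `retry_requested` is true and `pdbs_added` is not empty, the situation matches the one described here: mount again, recover, and open.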

$ sqlplus "/as sysdba"

SQL*Plus: Release 12.2.0.1.0 Production on Wed Sep 19 01:34:01 2018

Copyright (c) 1982, 2016, Oracle.  All rights reserved.

Connected to an idle instance.

SQL> startup nomount force pfile='/<path>/<filename>.ora';
ORACLE instance started.

Total System Global Area 1593835520 bytes
Fixed Size                  8793256 bytes
Variable Size             402654040 bytes
Database Buffers         1174405120 bytes
Redo Buffers                7983104 bytes
SQL> alter database mount;

Database altered.

SQL> show pdbs

    CON_ID CON_NAME                       OPEN MODE  RESTRICTED
---------- ------------------------------ ---------- ----------
         2 PDB$SEED                       MOUNTED
         3 PDB1                           MOUNTED
         4 APDB                           MOUNTED
        
SQL> select count(*) ,FHSCN from x$kcvfh group by FHSCN;

  COUNT(*) FHSCN
---------- --------------------
         3 1820866
         4 2281969
         4 2281982
         4 2282012        
        
//* Here we see that the controlfile is now aware of the PDBs



$ sqlplus "/as sysdba"

SQL*Plus: Release 12.2.0.1.0 Production on Wed Sep 19 01:02:13 2018

Copyright (c) 1982, 2016, Oracle.  All rights reserved.

Connected to an idle instance.

SQL>  startup nomount force pfile='/<path>/<filename>.ora';
ORACLE instance started.

Total System Global Area 1593835520 bytes
Fixed Size                  8793256 bytes
Variable Size             402654040 bytes
Database Buffers         1174405120 bytes
Redo Buffers                7983104 bytes
SQL> alter database mount;

Database altered.

SQL> recover database;
ORA-00279: change 2281969 generated at 09/19/2018 00:51:35 needed for thread 1
ORA-00289: suggestion : /<path>/1_32_981800889.dbf
ORA-00280: change 2281969 for thread 1 is in sequence #32


Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
/<path>/1_32_981800889.dbf
Log applied.
Media recovery complete.
SQL> alter database open;

Database altered.

SQL>

Further explanation from Oracle:
We clearly see that the recovery steps apply the same archivelog file twice. When a controlfile is re-created, the recovery that is initiated applies archivelog files only to the CDB datafiles, not to the PDB datafiles. Once the database open returns the ORA-65088 error, the next database restart applies the archivelog files to the PDB for the sake of database consistency. This explains why Oracle asks to apply the same archivelog sequence a second time. The following bugs report similar issues. Both have been closed as 'not a bug' because this is expected behavior:
BUG 24951417 – ERROR OPENING DATABASE WITH RESETLOGS AFTER CREATE CONTROLFILE
BUG 25172530 – MULTITENANT RESTORE FAILED WITH ORA-65088: DATABASE OPEN
Reference: ORA-65088 while opening DB with resetlogs for multi-tenant DB in 12.2 (Doc ID 2449591.1)

Handling ORA-600 12807 (CON$.CON# reaches its maximum value)

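This time I came down with a fairly bad case of COVID and could not travel to the customer site to apply a patch, so I stood by remotely from home. With spare time at the computer, I researched and reproduced the ORA-600 12807 failure that a friend recently told me one of their customers had hit. A search on MOS basically confirms that the error is raised because CON$.CON# has reached its theoretical maximum and cannot be incremented any further. Reference documents: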
Mechanism to Recycle Database Constraint Identifiers (Doc ID 2925056.1)
Bug 13781691 – ORA-600 [12807] if CON$.CON# very high due to bug 13784384 (Doc ID 13781691.8)
Bug 25343563 – Mechanism to Implement Constraint Identifier (con#) Recycling (Doc ID 25343563.8)
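In 12c and later, Oracle released patch 25343563, which is enabled by setting an event, to resolve the problem. For releases before 12c, however, Oracle provides no direct solution. The most basic remedy is a logical data migration, combined with avoiding the frequent constraint creation that consumes con$.con# too quickly.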

Reproducing the error in a test

SQL> create table t_xff  (id number not null,name varchar2(100) not null);
create table t_xff  (id number not null,name varchar2(100) not null)
*
ERROR at line 1:
ORA-00600: internal error code, arguments: [12807], [], [], [], [], [], [], [],
[], [], [], []

SQL>  create table t_xff  (id number,name varchar2(100));

Table created.

SQL> alter table t_xff add primary key(id);
alter table t_xff add primary key(id)
*
ERROR at line 1:
ORA-00600: internal error code, arguments: [12807], [], [], [], [], [], [], [],
[], [], [], []


SQL> select con# from sys.con$ where name='_NEXT_CONSTRAINT';

      CON#
----------
4294967294

After some low-level analysis, and by patching some of the underlying base tables, the database can create constraints again

SQL>  alter table t_xff add primary key(id);

Table altered.

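The temporary patch approach restores service quickly; a migration can then be scheduled at a suitable time.
A reminder: on systems that frequently create intermediate or temporary objects (especially with many primary keys, NOT NULL constraints, and so on), check how far this value is from the ceiling. If it is getting close, it is best to schedule a logical migration and find the cause (an Oracle bug or something triggered by the application).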
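The ceiling check recommended above can be sketched as a small calculation on the value returned by `select con# from sys.con$ where name='_NEXT_CONSTRAINT'`. This is a minimal Python sketch under stated assumptions: the ceiling of 2**32 - 1 is inferred from the value 4294967294 shown in this post rather than taken from Oracle documentation, and the 80% warning threshold and the function names are hypothetical.

```python
# Assumption: CON# cannot grow past 2**32 - 1. The post only shows
# _NEXT_CONSTRAINT stuck at 4294967294, so treat this ceiling as a guess.
ASSUMED_CON_MAX = 2**32 - 1


def con_headroom(next_con, ceiling=ASSUMED_CON_MAX):
    """Number of constraint identifiers left before the assumed ceiling."""
    return max(ceiling - next_con, 0)


def con_usage_pct(next_con, ceiling=ASSUMED_CON_MAX):
    """Share of the identifier range already consumed, in percent."""
    return 100.0 * next_con / ceiling


def needs_attention(next_con, warn_pct=80.0, ceiling=ASSUMED_CON_MAX):
    """True when usage has reached the (hypothetical) warning threshold."""
    return con_usage_pct(next_con, ceiling) >= warn_pct


if __name__ == "__main__":
    observed = 4294967294  # value of _NEXT_CONSTRAINT shown in this post
    print(con_headroom(observed), needs_attention(observed))
```

Running a check like this on a schedule gives early warning, so that the logical migration can be planned before constraint creation starts to fail.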

ORA-600 ksuloget2 recovery

The customer raised the SGA above 2G on a 32-bit Windows operating system, and the database reported ORA-600 ksuloget2 while running

Thread 1 cannot allocate new log, sequence 43586
Checkpoint not complete
  Current log# 1 seq# 43585 mem# 0: D:\ORACLE\ORADATA\ORCL\REDO01.LOG
Fri Aug 04 14:57:02 2023
Errors in file d:\oracle\diag\rdbms\orcl\orcl\trace\orcl_ora_42996.trc  (incident=67481):
ORA-00600: internal error code, arguments: [ksuloget2], [0xFEBA6208], [0xFEBA3B08], [500], [0xFEBA622C], [], [], [], [], []
Thread 1 advanced to log sequence 43586 (LGWR switch)
  Current log# 2 seq# 43586 mem# 0: D:\ORACLE\ORADATA\ORCL\REDO02.LOG

After restarting the database, the recovery attempt again reported ORA-600 ksuloget2

Thu Aug 17 17:38:27 2023
ALTER DATABASE RECOVER  database using backup controlfile  
Media Recovery Start
 started logmerger process
Parallel Media Recovery started with 24 slaves
ORA-279 signalled during: ALTER DATABASE RECOVER  database using backup controlfile  ...
Thu Aug 17 17:39:01 2023
ALTER DATABASE RECOVER LOGFILE 'D:\oracle\flash_recovery_area\orcl\ARCHIVELOG\2023_08_04\REDO03.LOG'  
Media Recovery Log D:\oracle\flash_recovery_area\orcl\ARCHIVELOG\2023_08_04\REDO03.LOG
Thu Aug 17 17:39:01 2023
Errors in file d:\oracle\diag\rdbms\orcl\orcl\trace\orcl_pr00_5528.trc  (incident=110724):
ORA-00600: internal error code, arguments: [ksuloget2], [0xFEBA6E38], [0xFEBA3B08], [500], [0xFEBA6E5C], [], [], [], []
ORA-00342: archived log does not have expected resetlogs SCN 685171428
ORA-00334: archived log: 'D:\ORACLE\FLASH_RECOVERY_AREA\ORCL\ARCHIVELOG\2023_08_04\REDO03.LOG'
Errors in file d:\oracle\diag\rdbms\orcl\orcl\trace\orcl_pr00_5528.trc:
ORA-00600: internal error code, arguments: [ksuloget2], [0xFEBA6E38], [0xFEBA3B08], [500], [0xFEBA6E5C], [], [], [], []
ORA-00342: archived log does not have expected resetlogs SCN 685171428
ORA-00334: archived log: 'D:\ORACLE\FLASH_RECOVERY_AREA\ORCL\ARCHIVELOG\2023_08_04\REDO03.LOG'
Errors in file d:\oracle\diag\rdbms\orcl\orcl\trace\orcl_ora_5604.trc  (incident=110709):
ORA-00600: internal error code, arguments: [ksuloget2], [0xFEBA6E38], [0xFEBA3B08], [500], [0xFEBA6E5C], [], [], [], []
ORA-00342: archived log does not have expected resetlogs SCN 685171428
ORA-00334: archived log: 'D:\ORACLE\FLASH_RECOVERY_AREA\ORCL\ARCHIVELOG\2023_08_04\REDO03.LOG'
Incident details in: d:\oracle\diag\rdbms\orcl\orcl\incident\incdir_110709\orcl_ora_5604_i110709.trc
ORA-600 signalled during:ALTER DATABASE RECOVER LOGFILE 'D:\oracle\flash_recovery_area\orcl\2023_08_04\REDO03.LOG'
ALTER DATABASE RECOVER CANCEL 
Errors in file d:\oracle\diag\rdbms\orcl\orcl\trace\orcl_pr00_5528.trc  (incident=110725):
ORA-00600: internal error code, arguments: [ksuloget2], [0xFEBA6E38], [0xFEBA3B08], [500], [0xFEBA6E5C], [], [], [], []
Errors in file d:\oracle\diag\rdbms\orcl\orcl\trace\orcl_pr00_5528.trc  (incident=110726):
ORA-00600: internal error code, arguments: [ksuloget2], [0xFEBA6E38], [0xFEBA3B08], [500], [0xFEBA6E5C], [], [], [], []
ORA-00600: internal error code, arguments: [ksuloget2], [0xFEBA6E38], [0xFEBA3B08], [500], [0xFEBA6E5C], [], [], [], []
Errors in file d:\oracle\diag\rdbms\orcl\orcl\trace\orcl_pr00_5528.trc:
ORA-00600: internal error code, arguments: [ksuloget2], [0xFEBA6E38], [0xFEBA3B08], [500], [0xFEBA6E5C], [], [], [], []
ORA-00600: internal error code, arguments: [ksuloget2], [0xFEBA6E38], [0xFEBA3B08], [500], [0xFEBA6E5C], [], [], [], []
Errors in file d:\oracle\diag\rdbms\orcl\orcl\trace\orcl_pr00_5528.trc  (incident=110727):
ORA-00600: internal error code, arguments: [ksuloget2], [0xFEBA6E38], [0xFEBA3B08], [500], [0xFEBA6E5C], [], [], [], []
ORA-00600: internal error code, arguments: [ksuloget2], [0xFEBA6E38], [0xFEBA3B08], [500], [0xFEBA6E5C], [], [], [], []
ORA-00600: internal error code, arguments: [ksuloget2], [0xFEBA6E38], [0xFEBA3B08], [500], [0xFEBA6E5C], [], [], [], []
Errors in file d:\oracle\diag\rdbms\orcl\orcl\trace\orcl_pr00_5528.trc  (incident=110728):
ORA-00600: internal error code, arguments: [ksuloget2], [0xFEBA6E38], [0xFEBA3B08], [500], [0xFEBA6E5C], [], [], [], []
ORA-00600: internal error code, arguments: [ksuloget2], [0xFEBA6E38], [0xFEBA3B08], [500], [0xFEBA6E5C], [], [], [], []
ORA-00600: internal error code, arguments: [ksuloget2], [0xFEBA6E38], [0xFEBA3B08], [500], [0xFEBA6E5C], [], [], [], []
ORA-00600: internal error code, arguments: [ksuloget2], [0xFEBA6E38], [0xFEBA3B08], [500], [0xFEBA6E5C], [], [], [], []
Errors in file d:\oracle\diag\rdbms\orcl\orcl\trace\orcl_ora_5604.trc  (incident=110710):
ORA-00600: internal error code, arguments: [ksuloget2], [0xFEBA6E38], [0xFEBA3B08], [500], [0xFEBA6E5C], [], [], [], []
ORA-00600: internal error code, arguments: [ksuloget2], [0xFEBA6E38], [0xFEBA3B08], [500], [0xFEBA6E5C], [], [], [], []
ORA-00600: internal error code, arguments: [ksuloget2], [0xFEBA6E38], [0xFEBA3B08], [500], [0xFEBA6E5C], [], [], [], []
ORA-00600: internal error code, arguments: [ksuloget2], [0xFEBA6E38], [0xFEBA3B08], [500], [0xFEBA6E5C], [], [], [
Incident details in: d:\oracle\diag\rdbms\orcl\orcl\incident\incdir_110710\orcl_ora_5604_i110710.trc

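Because the failure occurred while applying redo, I bypassed the redo consistency checks and forced the database open. The data checked out OK and the business could use the database directly. For this class of problem, Oracle's recommendation is in: ORA-600: [Ksuloget2] Hit on Windows When SGA Greater Than 1G (Doc ID 836109.1)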
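The alert log above repeats the same ORA-600 many times, so it helps to reduce it to the distinct first arguments before searching MOS. The following is a minimal Python sketch, assuming each error sits on one line in the bracketed-argument format shown in this post; the function names `parse_ora600_args` and `count_ora600` are hypothetical.

```python
import re
from collections import Counter


def parse_ora600_args(line):
    """Return the non-empty leading arguments of an ORA-00600 line, or None."""
    marker = line.find("ORA-00600")
    if marker < 0:
        return None
    # Only complete [..] pairs are captured, so a truncated line is tolerated.
    args = re.findall(r"\[([^\]]*)\]", line[marker:])
    while args and args[-1] == "":
        args.pop()  # drop the trailing empty placeholders
    return args


def count_ora600(log_text):
    """Count occurrences of each first argument (the internal error name)."""
    counts = Counter()
    for line in log_text.splitlines():
        args = parse_ora600_args(line)
        if args:
            counts[args[0]] += 1
    return counts


if __name__ == "__main__":
    line = ("ORA-00600: internal error code, arguments: [ksuloget2], "
            "[0xFEBA6E38], [0xFEBA3B08], [500], [0xFEBA6E5C], [], [], [], []")
    print(parse_ora600_args(line))
```

The parser looks only at the bracketed arguments, so it does not depend on the language of the message text.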

ORA-10485: Real-Time Query cannot be enabled while applying migration redo.

After a PSU and the JVM patch are applied to a database, synchronization on the Data Guard standby can fail with an error such as ORA-10485: Real-Time Query cannot be enabled while applying migration redo.

Tue Aug 15 18:48:18 2023
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT  LOGFILE DISCONNECT FROM SESSION
Attempt to start background Managed Standby Recovery process (orcl)
Tue Aug 15 18:48:18 2023
MRP0 started with pid=33, OS id=15486
MRP0: Background Managed Standby Recovery process started (orcl)
 started logmerger process
Tue Aug 15 18:48:23 2023
Managed Standby Recovery starting Real Time Apply
Parallel Media Recovery started with 80 slaves
Waiting for all non-current ORLs to be archived...
All non-current ORLs have been archived.
Recovery of Online Redo Log: Thread 1 Group 10 Seq 106115 Reading mem 0
  Mem# 0: /u01/app/oracle/oradata/orcl/std_redo10.log
Completed: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT  LOGFILE DISCONNECT FROM SESSION
Tue Aug 15 18:49:50 2023
RFS[1]: Assigned to RFS process 16049
RFS[1]: Opened log for thread 1 sequence 106117 dbid 1490144467 branch 962363734
Tue Aug 15 18:49:50 2023
RFS[2]: Assigned to RFS process 16051
RFS[2]: Selected log 10 for thread 1 sequence 106115 dbid 1490144467 branch 962363734
Tue Aug 15 18:49:50 2023
RFS[3]: Assigned to RFS process 16053
RFS[3]: Opened log for thread 1 sequence 106116 dbid 1490144467 branch 962363734
Archived Log entry 106102 added for thread 1 sequence 106116 rlc 962363734 ID 0x58d223d3 dest 2:
RFS[3]: Opened log for thread 1 sequence 106118 dbid 1490144467 branch 962363734
RFS[2]: Opened log for thread 1 sequence 106119 dbid 1490144467 branch 962363734
Tue Aug 15 18:49:50 2023
Archived Log entry 106103 added for thread 1 sequence 106115 ID 0x58d223d3 dest 1:
Tue Aug 15 18:49:50 2023
Media Recovery Log /u01/app/oracle/fast_recovery_area/ORCLDG/archivelog/2023_08_15/o1_mf_1_106116_lfpp2ghc_.arc
Errors with log /u01/app/oracle/fast_recovery_area/ORCLDG/archivelog/2023_08_15/o1_mf_1_106116_lfpp2ghc_.arc
MRP0: Background Media Recovery terminated with error 10485
Errors in file /u01/app/oracle/diag/rdbms/orcldg/orcl/trace/orcl_pr00_15488.trc:
ORA-10485: Real-Time Query cannot be enabled while applying migration redo.
Managed Standby Recovery not using Real Time Apply
Recovery interrupted!
Tue Aug 15 18:49:50 2023
MRP0: Background Media Recovery process shutdown (orcl)

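This happens because the redo generated while the database was in upgrade mode cannot be applied in real time by ADG. I usually work around it temporarily by applying the logs on the standby with RMAN. The official solution is in: MRP process getting terminated with error ORA-10485 (Doc ID 1618485.1)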

11g CASE

---------

1> Stop the DG broker (if used) on both the primary and the standby.

SQL> alter system set dg_broker_start = false scope = both sid = '*' ;

2> Stop managed recovery on the standby, shut down the standby and startup mount.
   Start managed recovery without real-time apply.

SQL> alter database recover managed standby database disconnect ;

3> Wait until all the redo is applied to the standby and the standby is in sync.
   Do a couple of log switches on the primary (on all instances if RAC) and let them apply to the standby.

4> Shut down the standby and startup mount.
   Start managed recovery with real-time apply.

SQL> alter database recover managed standby database using current logfile disconnect ;

5> Restart the broker (if used) on both the primary and the standby.

SQL> alter system set dg_broker_start = true scope = both sid = '*' ;
 

12c CASE

----------

In 12c and later, start archived log apply with the ARCHIVED LOGFILE option:

SQL> alter database recover managed standby database using archived logfile disconnect;

Wait until all the redo is applied to the standby and the standby is in sync.
Do a couple of log switches on the primary (on all instances if RAC) and let them apply to the standby.

To stop Redo Apply, issue the following SQL statement:

SQL> alter database recover managed standby database cancel;

To start managed recovery with real-time apply, issue the following SQL statement:

SQL> alter database recover managed standby database disconnect;
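The two procedures differ only in which apply statements are issued and in what order. The following is a minimal Python sketch that returns the statements quoted in this post for a given major release, so they can be reviewed before being run by hand. The function name `standby_apply_statements` is hypothetical. The broker changes, the waits, the log switches and the standby restarts are deliberately left out and remain manual steps.

```python
def standby_apply_statements(major_version):
    """Ordered apply statements from this post for the given major release.

    Only the managed-recovery statements are listed; stopping the broker,
    waiting for the standby to catch up and restarting it stay manual.
    """
    base = "alter database recover managed standby database"
    if major_version >= 12:
        return [
            f"{base} using archived logfile disconnect;",  # apply archived logs only
            f"{base} cancel;",                             # stop Redo Apply once in sync
            f"{base} disconnect;",                         # resume, as given in the post
        ]
    return [
        f"{base} disconnect;",                             # apply without real-time apply
        f"{base} using current logfile disconnect;",       # back to real-time apply
    ]


if __name__ == "__main__":
    for version in (11, 12):
        print(version)
        for statement in standby_apply_statements(version):
            print("   SQL>", statement)
```

Keeping the statements in one place makes it harder to start real-time apply too early, which is what raises ORA-10485 while migration redo is still pending.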

ORA-600 kghstack_underflow_internal_2

On an AIX platform running 11.2.0.4 RAC, one node suddenly crashed, and the lms2 process reported the error ORA-600 kghstack_underflow_internal_2

Thu Aug 03 18:43:16 2023
Errors in file /u01/oracle/app/oracle/diag/rdbms/xff/xff2/trace/xff2_lms2_2884404.trc  (incident=761244):
ORA-00600: internal error code, arguments: [kghstack_underflow_internal_2], [0x11074D658], [], [], [], [], [], [], [], [], [], []
Incident details in: /u01/oracle/app/oracle/diag/rdbms/xff/xff2/incident/incdir_761244/xff2_lms2_2884404_i761244.trc
Errors in file /u01/oracle/app/oracle/diag/rdbms/xff/xff2/trace/xff2_lms2_2884404.trc  (incident=761245):
ORA-00600: internal error code, arguments: [kghstack_underflow_internal_2], [0x11AB5BBF0], [], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [kghstack_underflow_internal_2], [0x11074D658], [], [], [], [], [], [], [], [], [], []
Incident details in: /u01/oracle/app/oracle/diag/rdbms/xff/xff2/incident/incdir_761245/xff2_lms2_2884404_i761245.trc
Thu Aug 03 18:43:19 2023
Dumping diagnostic data in directory=[cdmp_20230803184319], requested by (instance=2, osid=2884404 (LMS2)), summary=[incident=761245].
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Thu Aug 03 18:43:23 2023
Sweep [inc][761245]: completed
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file /u01/oracle/app/oracle/diag/rdbms/xff/xff2/trace/xff2_lms2_2884404.trc:
ORA-00600: internal error code, arguments: [kghstack_underflow_internal_2], [0x11074D658], [], [], [], [], [], [], [], [], [], []
Sweep [inc][761244]: completed
Sweep [inc2][761245]: completed
Sweep [inc2][761244]: completed
Thu Aug 03 18:43:29 2023
Errors in file /u01/oracle/app/oracle/diag/rdbms/xff/xff2/trace/xff2_lms2_2884404.trc:
ORA-00600: internal error code, arguments: [kghstack_underflow_internal_2], [0x11074D658], [], [], [], [], [], [], [], [], [], []
LMS2 (ospid: 2884404): terminating the instance due to error 484

Analysis of the Call Stack Trace information in the trace file

----- Call Stack Trace -----
calling              call     entry                argument values in hex      
location             type     point                (? means dubious value)     
-------------------- -------- -------------------- ----------------------------
skdstdst()+40        bl       0000000109B3EE38     000000000 ? 000000001 ?
                                                   000000003 ? 000000000 ?
                                                   000000000 ? 000000001 ?
                                                   000000003 ? 000000000 ?
ksedst1()+112        call     skdstdst()           1777D9901C4FD34D ?
                                                   4840284100000000 ?
                                                   FFFFFFFFFFECE20 ?
                                                   2A501377F67A7 ? 10A742204 ?
                                                   000000000 ? 1107486C0 ?
                                                   2050033FFFECE28 ?
ksedst()+40          call     ksedst1()            FFFFFFFFFFFE0002 ?
                                                   0000060F1 ? 000000001 ?
                                                   10A46AD18 ? 000000000 ?
                                                   000000000 ? 000002004 ?
                                                   000000001 ?
dbkedDefDump()+1516  call     ksedst()             000000000 ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   000000000 ? 300000003 ?
ksedmp()+72          call     dbkedDefDump()       3107486C0 ? 110000A28 ?
                                                   FFFFFFFFFFED630 ? 1106ABC70 ?
                                                   100125778 ? FFFFFFFFFFED5B0 ?
                                                   FFFFFFFFFFEDA30 ? 1106ABC70 ?
ksfdmp()+100         call     ksedmp()             000000002 ? 000000000 ?
                                                   000000002 ? 10AF71A68 ?
                                                   10A0720F8 ? 000000000 ?
                                                   1108EC608 ? 1107486C0 ?
dbgexPhaseII()+1904  call     ksfdmp()             FFFFFFFFFFFE0002 ?
                                                   0000060F1 ? 000000002 ?
                                                   000000000 ? 000000002 ?
                                                   10A0720F0 ? 000000000 ?
                                                   001050005 ?
dbgexProcessError()  call     dbgexPhaseII()       1107486C0 ? 1108EFB28 ?
+1556                                              0000B9D9D ? 200000000 ?
                                                   FFFFFFFFFFEE548 ? 000000104 ?
                                                   FFFFFFFFFFEDBB0 ?
                                                   FB400000000 ?
dbgeExecuteForError  call     dbgexProcessError()  1107486C0 ? 1108EC608 ?
()+72                                              100000000 ? 000000000 ?
                                                   FFFFFFFFFFF29E0 ?
                                                   2840288000000012 ?
                                                   10013DA4C ? 1108EE350 ?
dbgePostErrorKGE()+  call     dbgeExecuteForError  000000002 ? 000000128 ?
2044                          ()                   FFFFFFFFFFFE0002 ?
                                                   215265335E5162 ?
                                                   3726000000000001 ?
                                                   10A46AD18 ? 10A46CB00 ?
                                                   FFFFFFFFFFF1D30 ?
dbkePostKGE_kgsf()+  call     dbgePostErrorKGE()   000000001 ? 10A46AD18 ?
68                                                 25800000000 ? 109E7A740 ?
                                                   000000000 ? 000000038 ?
                                                   FFFFFFFFFFF2800 ? 11AB1AC50 ?
kgeadse()+380        call     dbkePostKGE_kgsf()   900000000512C74 ?
                                                   9001000A008DAD0 ? 000000000 ?
                                                   9001000A008DAD0 ?
                                                   8000000FFFF2C40 ?
                                                   7000147E8F28C98 ? 400000008 ?
                                                   1100054A0 ?
kgerinv_internal()+  call     kgeadse()            7FFFFFFFFFFFFFFF ?
48                                                 FFFFFFFFFFFEF8FF ?
                                                   000000019 ? 110476528 ?
                                                   000000001 ? 000000017 ?
                                                   00000000B ? 000000000 ?
kgerinv()+48         call     kgerinv_internal()   FFFFFFFFFFFEF8FF ?
                                                   FFFFFFFFFFFFFFFF ?
                                                   FFFFFFFFFFFFFFFF ?
                                                   7FFFFFFFFFFFFFFF ?
                                                   1001648E0 ? FFFFFFFFFFF25E0 ?
                                                   1106ABC70 ? 11073B3C0 ?
kgeasnmierr()+72     call     kgerinv()            000000000 ? 215265335E5162 ?
                                                   372600383A0F5000 ?
                                                   000000004 ? 10A328F7C ?
                                                   FFFFFFFFFFF2898 ? 000000002 ?
                                                   0FFFFFFFF ?
kghstack_underflow_  call     kgeasnmierr()        11AB967A0 ? 000000000 ?
internal()+280                                     FFFFFFFFFFF2860 ? 100000001 ?
                                                   000000002 ? 11AB5BBF0 ?
                                                   000000000 ? 11AB96778 ?
kghstack_free()+716  call     kghstack_underflow_  10A328F7C ? 110A2FEC0 ?
                              internal()           000000004 ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   000000080 ? 80000000000000 ?
ktudda()+912         call     kghstack_free()      11AB5BBF0 ? 7215265335E5162 ?
                                                   3726000000000008 ?
                                                   000000102 ? 109E747E0 ?
                                                   FFFFFFFFFFF2A90 ? 000000048 ?
                                                   28408880FFFFFFFF ?
kcbtdu()+1636        call     ktudda()             70001383A0F4014 ? 000000000 ?
                                                   1FE800000000 ? 07F7F7F7F ?
                                                   FFFFFFFF80808080 ?
                                                   000000000 ? 000000030 ?
                                                   FFFFFFFFFFF2B30 ?
kcbzdh()+3200        call     kcbtdu()             35900000359 ? 100000001 ?
                                                   000000001 ? 200000001 ?
                                                   000000001 ? 00000005D ?
                                                   200066665D20 ? 000000000 ?
kcbzpnd()+504        call     kcbzdh()             70001383F6D64B8 ? 000002004 ?
                                                   2107486C0 ? 10A74269E ?
                                                   1107486C0 ? FFFFFFFFFFF3B30 ?
                                                   FFFFFFFFFFF38E0 ? 000000000 ?
kcbdnb()+724         call     kcbzpnd()            10A74267C ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   000000000 ? 0001CE860 ?
                                                   000000000 ? 000000000 ?
dbkedDefDump()+5528  call     kcbdnb()             200000000 ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   1100224D0 ? 000000018 ?
                                                   110001366 ? 000000000 ?
ksedmp()+72          call     dbkedDefDump()       3107486C0 ? 110000A28 ?
                                                   FFFFFFFFFFF3FC0 ? 1106ABC70 ?
                                                   100125778 ? 000000000 ?
                                                   FFFFFFFFFFF3FB0 ? 1106ABC70 ?
ksfdmp()+100         call     ksedmp()             000000002 ? 000000000 ?
                                                   000000002 ? 10AF71A68 ?
                                                   10A0720F8 ? 000000000 ?
                                                   1109DE650 ? 1107486C0 ?
dbgexPhaseII()+1904  call     ksfdmp()             11074B65C ? 000000001 ?
                                                   000000002 ? 000000000 ?
                                                   000000002 ? 10A0720F0 ?
                                                   000000000 ? 001050005 ?
dbgexProcessError()  call     dbgexPhaseII()       1107486C0 ? 1109DC860 ?
+1556                                              0000B9D9C ? 200000000 ?
                                                   FFFFFFFFFFF4ED8 ? 000000082 ?
                                                   FFFFFFFFFFF4560 ?
                                                   88A4422A00000000 ?
dbgeExecuteForError  call     dbgexProcessError()  1107486C0 ? 1109DE650 ?
()+72                                              100000000 ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   0DFFFFFFF ? 1109E0398 ?
dbgePostErrorKGE()+  call     dbgeExecuteForError  00000000A ? 000000000 ?
2044                          ()                   000000001 ? 000000001 ?
                                                   000000000 ? 000000000 ?
                                                   FFFFFFFFFFFB4E0 ? 000000000 ?
dbkePostKGE_kgsf()+  call     dbgePostErrorKGE()   000000000 ? FFFFFFFFFFF96B0 ?
68                                                 2580000000A ? 109E7A740 ?
                                                   000000000 ? 000000000 ?
                                                   FFFFFFFFFFF9190 ? 11AB1AC50 ?
kgeadse()+380        call     dbkePostKGE_kgsf()   000000001 ? 000000008 ?
                                                   000000000 ? 10A30EA38 ?
                                                   110000C20 ? 700014771160D68 ?
                                                   700014772ADB3A8 ? 000000001 ?
kgerinv_internal()+  call     kgeadse()            000000003 ? 000000000 ?
48                                                 11074B65C ? 000000001 ?
                                                   000000000 ? FFFFFFFFFFF96B0 ?
                                                   00000000A ? 000000001 ?
kgerinv()+48         call     kgerinv_internal()   000000000 ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   000000000 ? 000000000 ?
kgeasnmierr()+72     call     kgerinv()            000000000 ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   FFFFFFFFFFF92B0 ?
                                                   48102840FFFFA5B0 ?
                                                   11AB5BBB8 ? 11074D658 ?
kghstack_underflow_  call     kgeasnmierr()        022028200 ? 022202820 ?
internal()+280                                     11AB5BBB8 ? 100000001 ?
                                                   000000002 ? 11074D658 ?
                                                   0442C2394 ? 000002000 ?
kghstack_free()+716  call     kghstack_underflow_  FFFFFFFFFFF92B0 ?
                              internal()           FFFFFFFFFFF95B8 ?
                                                   FFFFFFFFFFF92B0 ? 000000001 ?
                                                   FFFFFFFFFFF92B0 ?
                                                   FFFFFFFFFFF95E8 ?
                                                   FFFFFFFFFFF95B8 ? 11074B650 ?
ktundo()+924         call     kghstack_free()      0DEADBEEF ? 11074D668 ?
                                                   11074B654 ? 300000000 ?
                                                   1FFFFB4E0 ? FFFFFFFFFFFB4E0 ?
                                                   FFFFFFFFFFF94C0 ?
                                                   FFFFFFFFFFF9470 ?
kturCRBackoutOneChg  call     ktundo()             19FFFFB5E0 ?
()+848                                             494CEDB3FFFF9E50 ?
                                                   FFFFFFFFFFF9E48 ? 000000000 ?
                                                   000000000 ? FFFFFFFFFFFA5B0 ?
                                                   100000000 ? FFFFFFFFFFFB4E0 ?
ktrgcm()+5816        call     kturCRBackoutOneChg  FFFFFFFFFFFA5B0 ?
                              ()                   19FFFFA440 ?
                                                   FFFFFFFFFFFA5B8 ? 000000000 ?
                                                   1FFFFA478 ? FFFFFFFFFFFB4E0 ?
                                                   000000000 ? 000000000 ?
ktrget3()+832        call     ktrgcm()             FFFFFFFFFFFAC80 ? 000000000 ?
                                                   000000000 ? 000000003 ?
                                                   058F7501F ? 000000001 ?
                                                   000000004 ? 000000003 ?
ktrget2()+104        call     ktrget3()            000000002 ? 700000000014488 ?
                                                   7000147E9C41A50 ? 000000022 ?
                                                   110A123A0 ? 000000000 ?
                                                   FFFFFFFFFFFB080 ? 110A123B8 ?
kclgeneratecr()+654  call     ktrget2()            FFFFFFFFFFFB4D0 ? 110AA1610 ?
0                                                  14F11E4E00 ? 0F11E4E00 ?
                                                   357FED028 ? 000030000 ?
                                                   7000147E9C41A50 ?
                                                   700000000014488 ?
kclgcr()+812         call     kclgeneratecr()      11A209508 ? FFFFFFFFFFFBFC0 ?
                                                   FFFFFFFFFFFBC18 ? 000000000 ?
                                                   0FFFFBB10 ? 01A275AC8 ?
                                                   1761D7F302ED25AC ?
                                                   20000011A275AC8 ?
kclcrrf()+536        call     kclgcr()             FFFFFFFFFFFBC20 ?
                                                   FFFFFFFFFFFBD00 ? 101F5080C ?
                                                   000000000 ? 0000003E8 ?
                                                   000000028 ? 0000000C8 ?
                                                   FFFFFFFFFFFBF88 ?
kjblcrcbk()+896      call     kclcrrf()            000000001 ? 000000000 ?
                                                   7000147EB0F07B8 ?
                                                   7000147576C4471 ?
                                                   401472C30C7F0 ?
                                                   7000147576C4408 ?
                                                   7000147576C3190 ?
                                                   7000147576C7170 ?
kjblpcr()+304        call     kjblcrcbk()          FFFFFFFFFFFBDA8 ? 000000038 ?
                                                   7000147FABBDB48 ? 600000006 ?
                                                   000000016 ? 11A209468 ?
                                                   000000013 ? 0001C2153 ?
kjbmpbast()+1792     call     kjblpcr()            000000012 ? 000000168 ?
                                                   000000002 ? 70001109FDB8148 ?
                                                   357000000000357 ?
                                                   7000144F31F7750 ?
                                                   895000000000895 ? 000000000 ?
kjmxmpm()+760        call     kjbmpbast()          1000000000000 ? 80000001E ?
                                                   000000000 ? 11A2951C8 ?
                                                   C000000000 ? 000000000 ?
                                                   1000000000000 ? 000000000 ?
kjmpbmsg()+3508      call     kjmxmpm()            000000000 ? 11A3769E0 ?
                                                   FFFFFFFFFFFC380 ? 06DBFBAEF ?
                                                   101E13820 ? 11A3769E0 ?
                                                   7000147E339AE08 ?
                                                   FFFFFFFFFFFC210 ?
kjmsm()+13416        call     kjmpbmsg()           11A209448 ? 7000147E339AE08 ?
                                                   100000019 ? 100000000 ?
                                                   000000000 ? 000000000 ?
                                                   000000000 ? 7000000000168FD ?
ksbrdp()+2216        call     kjmsm()              7000000000168E0 ?
                                                   7000000000168FC ? 048244028 ?
                                                   000000E00 ? 1108B69F0 ?
                                                   100637768 ? 000000001 ?
                                                   700000007 ?
opirip()+1620        call     ksbrdp()             FFFFFFFFFFFFE22 ? 10AFA5FC8 ?
                                                   FFFFFFFFFFFDC10 ? 000000000 ?
                                                   000000001 ? 000000000 ?
                                                   01380038F ? 000000001 ?
opidrv()+608         call     opirip()             10AFA23B0 ? 410134118 ?
                                                   FFFFFFFFFFFED80 ?
                                                   2F7530312F ? 108A7E8C4 ?
                                                   1106ABC70 ?
                                                   652F70726F647563 ?
                                                   1106ABC70 ?
sou2o()+136          call     opidrv()             3208A885B0 ? 400000000 ?
                                                   FFFFFFFFFFFED80 ?
                                                   23001801CD0000 ? 000000010 ?
                                                   1106ABC70 ? 000000000 ?
                                                   000000000 ?
opimai_real()+188    call     sou2o()              FFFFFFFFFFFEDF0 ?
                                                   4424444B00000001 ?
                                                   9000000000D73CC ?
                                                   BADC0FFEE0DDF00D ?
                                                   000000003 ? 9001000A008DAD0 ?
                                                   A0000000A000000 ? 10B6A8F30 ?
ssthrdmain()+276     call     opimai_real()        9001000A0011A60 ?
                                                   FFFFFFFFFFFF148 ?
                                                   FFFFFFFFFFFEEF0 ? 10B6E9280 ?
                                                   90000000008582C ?
                                                   9001000A008DAD0 ?
                                                   FFFFFFFFFFFEED0 ?
                                                   9001000A008DAD0 ?
main()+204           call     ssthrdmain()         3F0003660 ? FFFFFFFFFFFF238 ?
                                                   FFFFFFFFFFFF2A0 ?
                                                   9FFFFFFF000D658 ?
                                                   9FFFFFFF00009A0 ? 000000000 ?
                                                   000000000 ? 9FFFFFFF000D658 ?
__start()+112        call     main()               000000000 ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   000000000 ? 000000000 ?
 

--------------------- Binary Stack Dump ---------------------

Comparing the trace against MOS turns up the following note: LMON or LMS Process Crashes Instance With ORA-600 [kghstack_underflow_internal_2] (Doc ID 2003278.1)

The LMON or LMS process crash the instance with an error like:

ORA-00600: internal error code, arguments: [kghstack_underflow_internal_2], [0x110A10838], [], [], [], [], [], [], [], [], [], []

ORA-1092 : opitsk aborting process
Instance terminated by LMS1, pid = 14024818
 

Review of the generated tracefiles reveals a call stack similar to:

... kghstack_underflow_internal kghstack_free kccgrd kjxgrf_rr_read kjxgrDD_rr_read kjxgrimember kjxggpoll kjfmact kjfdact kjfcln ksbrdp ...
  
- OR -
  
... kghstack_underflow_internal kghstack_free ktundo kturcrbackoutonechg ktrgcm ktrget3 ktrget2 kclgcr ...

This is confirmed as Bug 18687067 - ORA-600 [KGHSTACK_UNDERFLOW_INTERNAL_2], closed as a duplicate of Bug 20675347 - ORA-07445 [KGHSTACK_OVERFLOW_INTERNAL()+644] (The bug is caused by an AIX compiler issue causing volatile variables in the Oracle kernel not to be handled properly.). The fix is to upgrade the database to 12.1 or later, or to apply patch 20675347.
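
The comparison against the MOS call stacks can be sketched in a few lines of Python. This is only an illustration: the frame list is copied by hand from the call stack above (innermost frame first), the helper name `matches_signature` is hypothetical, and the match is deliberately loose (case-insensitive, extra frames allowed in between) because the real trace contains `kclgeneratecr` between `ktrget2` and `kclgcr`, which the note's abbreviated stack omits.

```python
# Frames copied by hand from the call stack above, innermost first.
TRACE_FRAMES = [
    "kgeasnmierr", "kghstack_underflow_internal", "kghstack_free", "ktundo",
    "kturCRBackoutOneChg", "ktrgcm", "ktrget3", "ktrget2",
    "kclgeneratecr", "kclgcr", "kclcrrf",
]

# The two call stacks quoted from Doc ID 2003278.1.
SIGNATURE_1 = (
    "kghstack_underflow_internal kghstack_free kccgrd kjxgrf_rr_read "
    "kjxgrDD_rr_read kjxgrimember kjxggpoll kjfmact kjfdact kjfcln ksbrdp"
).split()
SIGNATURE_2 = (
    "kghstack_underflow_internal kghstack_free ktundo kturcrbackoutonechg "
    "ktrgcm ktrget3 ktrget2 kclgcr"
).split()


def matches_signature(stack_frames, signature):
    """Return True if every name in `signature` occurs in `stack_frames`
    in the same order. Other frames may sit in between, and the
    comparison ignores case."""
    frames = iter(name.lower() for name in stack_frames)
    # `any(...)` consumes the shared iterator up to the first match, so the
    # next signature entry is searched only in the remaining frames.
    return all(
        any(frame == wanted.lower() for frame in frames)
        for wanted in signature
    )


if __name__ == "__main__":
    print("signature 1:", matches_signature(TRACE_FRAMES, SIGNATURE_1))
    print("signature 2:", matches_signature(TRACE_FRAMES, SIGNATURE_2))
```

With the frames above, only the second stack from the note matches, which is consistent with the LMS trace shown here.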

ORA-07445 opiaba: more than 65535 bind variables crash the instance

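Title: ORA-07445 opiaba: more than 65535 bind variables crash the instance

Author: 惜分飞 © All rights reserved [May not be reproduced in any form without my consent; I reserve the right to pursue legal liability otherwise.]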

The database unexpectedly reported ORA-07445 [opiaba] and ORA-00600 [17147] errors, and the instance crashed.

Wed Mar 15 09:48:06 2023
Exception [type: SIGSEGV, SI_KERNEL(general_protection)] [ADDR:0x0] [PC:0x187B953, opiaba()+639] [flags: 0x0, count: 1]
Errors in file /u01/app/oracle/diag/rdbms/xifenfei/xifenfei3/trace/xifenfei3_ora_169909.trc  (incident=446531):
ORA-07445: 出现异常错误: 核心转储 [opiaba()+639] [SIGSEGV] [ADDR:0x0] [PC:0x187B953] [SI_KERNEL(general_protection)] []
Incident details in: /u01/app/oracle/diag/rdbms/xifenfei/xifenfei3/incident/incdir_446531/xifenfei3_ora_169909_i446531.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Wed Mar 15 09:48:07 2023
Dumping diagnostic data in directory=[cdmp_20230315094807], requested by (instance=3, osid=169909), summary=[incident=446531].
Wed Mar 15 09:48:08 2023
Sweep [inc][446531]: completed
Sweep [inc2][446531]: completed
Wed Mar 15 09:48:33 2023
Errors in file /u01/app/oracle/diag/rdbms/xifenfei/xifenfei3/trace/xifenfei3_pmon_161557.trc  (incident=440035):
ORA-00600: internal error code, arguments: [17147], [0x4AFC25D0C8], [], [], [], [], [], [], [], [], [], []
Incident details in: /u01/app/oracle/diag/rdbms/xifenfei/xifenfei3/incident/incdir_440035/xifenfei3_pmon_161557_i440035.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Wed Mar 15 09:48:33 2023
Dumping diagnostic data in directory=[cdmp_20230315094833], requested by (instance=3, osid=161557 (PMON)), summary=[incident=440035].
Errors in file /u01/app/oracle/diag/rdbms/xifenfei/xifenfei3/trace/xifenfei3_pmon_161557.trc:
ORA-00600: internal error code, arguments: [17147], [0x4AFC25D0C8], [], [], [], [], [], [], [], [], [], []
PMON (ospid: 161557): terminating the instance due to error 472
Wed Mar 15 09:48:34 2023
opiodr aborting process unknown ospid (170089) as a result of ORA-1092

The corresponding trace file contains:

*** 2023-03-15 09:09:34.862
*** SESSION ID:(1858.63187) 2023-03-15 09:09:34.862
*** CLIENT ID:() 2023-03-15 09:09:34.862
*** SERVICE NAME:(xifenfei) 2023-03-15 09:09:34.862
*** MODULE NAME:(JDBC Thin Client) 2023-03-15 09:09:34.862
*** ACTION NAME:() 2023-03-15 09:09:34.862
 
Dump continued from file: /u01/app/oracle/diag/rdbms/xifenfei/xifenfei3/trace/xifenfei3_ora_116886.trc
ORA-07445: 出现异常错误: 核心转储 [opiaba()+639] [SIGSEGV] [ADDR:0x0] [PC:0x187B953] [SI_KERNEL(general_protection)] []

========= Dump for incident 326049 (ORA 7445 [opiaba()+639]) ========
----- Beginning of Customized Incident Dump(s) -----
Exception [type: SIGSEGV, SI_KERNEL(general_protection)] [ADDR:0x0] [PC:0x187B953, opiaba()+639] [flags: 0x0, count: 1]
Registers:
%rax: 0x00000046fa212588 %rbx: 0x00000046762ee798 %rcx: 0x0000000000003200
%rdx: 0x0000000000000000 %rdi: 0x0000000000007340 %rsi: 0x0000000000007340
%rsp: 0x00007fffffff41d0 %rbp: 0x00007fffffff4240  %r8: 0x00000044a8599f00
 %r9: 0x0000000000000099 %r10: 0x0000000000000b34 %r11: 0x00000046fa211720
%r12: 0x00000046fa5c7488 %r13: 0x0000000000000000 %r14: 0x0200000046fa2125
%r15: 0x0000000000000005 %rip: 0x000000000187b953 %efl: 0x0000000000010202
  opiaba()+625 (0x187b945) jmp 0x187b982
  opiaba()+627 (0x187b947) movzbl 0x2c(%rbx),%r15d
  opiaba()+632 (0x187b94c) mov %r14,-0x60(%rbp)
  opiaba()+636 (0x187b950) mov %rax,%r14
> opiaba()+639 (0x187b953) movswq 0xa(%r14),%rcx
  opiaba()+644 (0x187b958) cmp %ecx,%r15d
  opiaba()+647 (0x187b95b) jne 0x187b976
  opiaba()+649 (0x187b95d) mov 0x18(%rbx),%rdi
  opiaba()+653 (0x187b961) lea 0xc(%r14),%rsi

*** 2023-03-15 09:09:34.863
dbkedDefDump(): Starting a non-incident diagnostic dump (flags=0x3, level=3, mask=0x0)
----- Current SQL Statement for this session (sql_id=5cvg8vqsbr8j6) -----
BEGIN  
            UPDATE
             t_xifenfei 
             SET valid_param=:1 , comm_addr1 = :2 ,
                password=:3 , customer_type = :4 ,
                comm_no=:5 , date_grade_no = :6 ,
                date_grade_flag=:7 , curve_config_no = :8 ,
                curve_config_flag=:9 , baudrate = :10 ,
                meas_order=:11 , meter_no = :12 ,
                kind_no=:13 , tariff_count = :14 ,
                meter_digits = :15 , import_user=:16 ,
                readmeter_flag = :17 ,status_code=:18  
            WHERE meter_id = :19 
         ; 
…………
         ; 
            UPDATE
             t_xifenfei 
             SET valid_param=:94982 , comm_addr1 = :94983 ,
                password=:94984 , customer_type = :94985 ,
                comm_no=:94986 , date_grade_no = :94987 ,
                date_grade_flag=:94988 , curve_config_no = :94989 ,
                curve_config_flag=:94990 , baudrate = :94991 ,
                meas_order=:94992 , meter_no = :94993 ,
                kind_no=:94994 , tariff_count = :94995 ,
                meter_digits = :94996 , import_user=:94997 ,
                readmeter_flag = :94998 ,status_code=:94999  
            WHERE meter_id = :95000 
         ;END;

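The application built a single BEGIN ... END block that repeats the same UPDATE statement many times, each copy with its own set of bind variables, to get batch-commit behaviour. The block ends up with 95000 bind variables, far beyond Oracle's official limit of 65535, and triggers something like Bug 12578873 ORA-7445 [opiaba] when using more than 65535 bind variables.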

The fix is to reduce the number of bind variables so that a single statement never uses more than 65535.
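
One way to stay under the limit is to split the rows into several smaller blocks before building the statement. The sketch below is a minimal illustration, not the application's actual code: the function names are hypothetical, and the only facts taken from the case are the 65535 limit and the 19 bind variables per UPDATE (`:1` to `:19` in the statement above).

```python
MAX_BINDS = 65535  # bind-variable limit cited in the article


def rows_per_block(binds_per_row, max_binds=MAX_BINDS):
    """Largest number of rows whose bind variables fit in one block."""
    if binds_per_row < 1:
        raise ValueError("binds_per_row must be at least 1")
    if binds_per_row > max_binds:
        raise ValueError("a single row already exceeds the bind limit")
    return max_binds // binds_per_row


def split_rows(rows, binds_per_row, max_binds=MAX_BINDS):
    """Split `rows` into consecutive chunks so that each chunk needs
    at most `max_binds` bind variables."""
    size = rows_per_block(binds_per_row, max_binds)
    return [rows[i:i + size] for i in range(0, len(rows), size)]


if __name__ == "__main__":
    # 95000 bind variables / 19 per UPDATE = 5000 rows in the failing block.
    rows = list(range(5000))
    for chunk in split_rows(rows, binds_per_row=19):
        print(len(chunk), "rows ->", len(chunk) * 19, "bind variables")
```

With 19 binds per row, at most 3449 rows fit in one block, so the 5000 rows from this case would be sent as two blocks of 3449 and 1551 rows. Where the driver offers a batch facility for a single parameterized UPDATE, that is usually a simpler alternative to generating one large block.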

ORA-00800: soft external error, arguments: [Set Priority Failed]

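Title: ORA-00800: soft external error, arguments: [Set Priority Failed]

Author: 惜分飞 © All rights reserved [May not be reproduced in any form without my consent; I reserve the right to pursue legal liability otherwise.]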

On a 2-node Linux RAC database running 19.14, starting the database with sqlplus succeeds, but the alert log reports ORA-00800: soft external error, arguments: [Set Priority Failed].

2022-09-21T22:20:35.924251+08:00
Starting background process VKTM
2022-09-21T22:20:35.977936+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl1/trace/orcl1_vktm_22653.trc  (incident=880052):
ORA-00800: soft external error, arguments: [Set Priority Failed], [VKTM],
 [Check traces and OS configuration], [Check Oracle document and MOS notes], []
Incident details in: /u01/app/oracle/diag/rdbms/orcl/orcl1/incident/incdir_880052/orcl1_vktm_22653_i880052.trc
2022-09-21T22:20:35.980555+08:00
Error attempting to elevate VKTM's priority: no further priority changes will be attempted for this process
VKTM started with pid=6, OS id=22653

Starting background process LMHB
2022-09-21T22:20:36.467831+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl1/trace/orcl1_lms0_22703_22708.trc  (incident=920005):
ORA-00800: soft external error, arguments: [Set Priority Failed], [LMS0], 
[Check traces and OS configuration], [Check Oracle document and MOS notes], []
Incident details in: /u01/app/oracle/diag/rdbms/orcl/orcl1/incident/incdir_920005/orcl1_lms0_22703_22708_i920005.trc
2022-09-21T22:20:36.470535+08:00
Error attempting to elevate LMS0's priority: no further priority changes will be attempted for this process

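The message is fairly self-explanatory: raising the process priority failed. Checking with operating system commands confirms that the process priority was indeed not raised: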

[root@oradb01 ~]# ps -eo pid,class,pri,nice,time,args|grep vktm|grep -v grep 
 5656 TS   19   0 00:00:00 ora_vktm_orcl1
30838 RR   41   - 13:08:36 ora_vktm_+ASM1

After restarting the database with srvctl, the priority is raised as expected and the alert log no longer shows these warnings:

[root@oradb01 ~]# ps -eo pid,class,pri,nice,time,args|grep vktm|grep -v grep 
 5716 RR   41   0 00:00:00 ora_vktm_orcl1
30838 RR   41   - 13:18:46 ora_vktm_+ASM1

This problem puzzled me for a long time. Today I came across the relevant MOS note by chance: (DB50) Clusterware Fails to Start Because CSSD Cannot Get Real-Time Priority (Doc ID 2903663.1). The cause is bug 34286265 and bug 34318125 (Bug 34649727 Linux: ORA-800 / Set Priority / DB Performance Merge Patch for 19.17 - 34286265 34318125).

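Where possible, start the database with srvctl rather than sqlplus, so that key background processes in a RAC environment do not fail to have their priority raised.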
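
The `ps` check used above is easy to automate after each startup. The following Python sketch parses text in the format produced by `ps -eo pid,class,pri,nice,time,args` and reports matching processes that are not in the `RR` scheduling class. The function name and the default name list are assumptions made for illustration; in particular, expecting `RR` for LMS processes as well as VKTM is inferred from the alert log above, not stated by the MOS note.

```python
def non_realtime(ps_output, names=("vktm", "lms")):
    """Return (pid, class, command) for processes whose command contains
    one of `names` (as '_<name>') and whose scheduling class is not RR.

    `ps_output` is expected in the column order
    pid, class, pri, nice, time, args.
    """
    flagged = []
    for line in ps_output.splitlines():
        fields = line.strip().split(None, 5)
        # Skip header lines and anything that does not have six columns.
        if len(fields) < 6 or not fields[0].isdigit():
            continue
        pid, sched_class, _pri, _nice, _time, args = fields
        if any("_" + name in args for name in names) and sched_class != "RR":
            flagged.append((int(pid), sched_class, args))
    return flagged


if __name__ == "__main__":
    # Output captured in the article after starting with sqlplus.
    sample = (
        " 5656 TS   19   0 00:00:00 ora_vktm_orcl1\n"
        "30838 RR   41   - 13:08:36 ora_vktm_+ASM1\n"
    )
    for pid, sched_class, command in non_realtime(sample):
        print(f"{command} (pid {pid}) runs in class {sched_class}, not RR")
```

Fed the output captured after the sqlplus start, the sketch flags `ora_vktm_orcl1`; fed the output captured after the srvctl start, it flags nothing.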

Another cause of ORA-00257: archiver error

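Title: Another cause of ORA-00257: archiver error

Author: 惜分飞 © All rights reserved [May not be reproduced in any form without my consent; I reserve the right to pursue legal liability otherwise.]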

I ran into a rather unusual case today and want to share it. The database reported ORA-00257, meaning it could not archive:

SQL> conn system/xxxxxx
ERROR:
ORA-00257: archiver error. Connect internal only, until freed.

SQL> archive log list;
Database log mode              Archive Mode
Automatic archival             Enabled
Archive destination            /oracle/PRD/oraarch
Oldest online log sequence     11479
Current log sequence           11481

The alert log reports:

sapprddb1:oraprd 73> tail -f alert_prd.log
Master archival failure: 19502
Master archival failure: 19502
Master archival failure: 19502
Master archival failure: 19502
Wed Oct 12 09:27:41 2022
Master archival failure: 19502
Master archival failure: 19502
Master archival failure: 19502
Wed Oct 12 09:27:41 2022
Master archival failure: 19502

Meaning of this error:

sapprddb1:oraprd 74>oerr ora 19502
19502, 00000, "write error on file \"%s\", block number %s (block size=%s)"
// *Cause:  write error on output file
// *Action: check the file

At first glance the error suggests that the archive destination is full, but a check shows that it has plenty of free space:

sapprddb1:/oracle/PRD/sapdata1 # df -h
Filesystem                          Size  Used Avail Use% Mounted on
/dev/sda1                            99G   25G   70G  26% /
udev                                253G  240K  253G   1% /dev
tmpfs                               426G   72K  426G   1% /dev/shm
/dev/sda3                           388G   12G  357G   4% /backup
/dev/mapper/SAPVG-oraclelv          197G   22G  165G  12% /oracle
/dev/mapper/SAPVG-mirrlogA           20G  773M   18G   5% /oracle/PRD/mirrlogA
/dev/mapper/SAPVG-mirrlogBlv         20G  773M   18G   5% /oracle/PRD/mirrlogB
/dev/mapper/SAPVG-oraarchlv         180G  2.2G  169G   2% /oracle/PRD/oraarch    <---- plenty of space left
/dev/mapper/SAPVG-origlogAlv         20G  894M   18G   5% /oracle/PRD/origlogA
/dev/mapper/SAPVG-origlogBlv         20G  894M   18G   5% /oracle/PRD/origlogB
/dev/mapper/SAPVG-sapdata1lv        591G  561G     0 100% /oracle/PRD/sapdata1
/dev/mapper/SAPVG-sapdata2lv        1.4T  1.4T     0 100% /oracle/PRD/sapdata2
/dev/mapper/SAPVG-sapdata3lv        788G  748G     0 100% /oracle/PRD/sapdata3
/dev/mapper/SAPVG-sapdata4lv        788G  748G     0 100% /oracle/PRD/sapdata4
sapprd:/sapmnt/PRD                   30G   12G   17G  42% /sapmnt/PRD
sapprd:/usr/sap/trans               105G 1005M   99G   1% /usr/sap/trans
/dev/mapper/VGSAP2-lvsapdata5       1.5T  340G  1.1T  25% /oracle/PRD/sapdata5
/dev/mapper/VGSAP2-lvsapdata6       1.5T  706G  696G  51% /oracle/PRD/sapdata6

Try archiving manually:

SQL> alter system archive log current
  2  /
alter system archive log current
*
ERROR at line 1:
ORA-19502: write error on file "/oracle/PRD/sapdata1/cntrl/cntrlPRD.dbf", block
number 4837 (block size=16384)
ORA-27061: waiting for async I/Os failed
Linux-x86_64 Error: 28: No space left on device
Additional information: -1
Additional information: 442368

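At this point the cause is essentially clear: the filesystem holding the control file is 100% full, so archiving cannot write its records into the control file, and the database reports ORA-00257. After slightly shrinking one of the files under /oracle/PRD/sapdata1, the database returned to normal.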
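
The lesson is that the archive destination is not the only filesystem that matters; every path the instance writes to during archiving, including the control files, needs free space. The Python sketch below shows one way to check this from `df -h` text: it maps each file path to the longest matching mount point and reports those at or above a usage threshold. The function names and the threshold parameter are hypothetical, and the list of paths to check is something the DBA would supply.

```python
def parse_df(df_output):
    """Map mount point -> Use% (as int) from `df -h` style text."""
    usage = {}
    for line in df_output.splitlines():
        fields = line.split()
        # Expected columns: filesystem size used avail use% mountpoint
        if (len(fields) >= 6 and fields[4].endswith("%")
                and fields[4][:-1].isdigit()):
            usage[fields[5]] = int(fields[4][:-1])
    return usage


def mount_for(path, mounts):
    """Return the longest mount point that contains `path`, or None."""
    best = None
    for mount in mounts:
        prefix = mount.rstrip("/") + "/"
        if path == mount or path.startswith(prefix):
            if best is None or len(mount) > len(best):
                best = mount
    return best


def full_mounts(paths, df_output, threshold=100):
    """Return {path: (mount, use%)} for paths on mounts at or above
    `threshold` percent used."""
    usage = parse_df(df_output)
    result = {}
    for path in paths:
        mount = mount_for(path, usage)
        if mount is not None and usage[mount] >= threshold:
            result[path] = (mount, usage[mount])
    return result
```

Called with the `df -h` output above and the two paths `/oracle/PRD/oraarch` and `/oracle/PRD/sapdata1/cntrl/cntrlPRD.dbf`, it reports only the control file, which sits on the 100% full `/oracle/PRD/sapdata1` filesystem.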

ORA-600 3417 failure handling

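Title: ORA-600 3417 failure handling

Author: 惜分飞 © All rights reserved [May not be reproduced in any form without my consent; I reserve the right to pursue legal liability otherwise.]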

The database suddenly reported an ORA-600 3417 error:

Mon Sep 26 06:42:51 2022
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl2\trace\orcl2_lgwr_7984.trc  (incident=176185):
ORA-00600: 内部错误代码, 参数: [3417], [3], [0], [1], [0], [3], [3], [], [], [], [], []
Incident details in: D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl2\incident\incdir_176185\orcl2_lgwr_7984_i176185.trc
Mon Sep 26 06:42:54 2022
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl2\trace\orcl2_lgwr_7984.trc:
ORA-00600: 内部错误代码, 参数: [3417], [3], [0], [1], [0], [3], [3], [], [], [], [], []
LGWR (ospid: 7984): terminating the instance due to error 470

After node 2 failed, node 1 also failed while performing cross-node instance recovery:

Mon Sep 26 06:44:26 2022
Instance recovery: looking for dead threads
Beginning instance recovery of 1 threads
 Submitted all GCS remote-cache requests
 Post SMON to start 1st pass IR
 Fix write in gcs resources
Reconfiguration complete
 parallel recovery started with 31 processes
Started redo scan
Completed redo scan
 read 887 KB redo, 348 data blocks need recovery
Started redo application at
 Thread 2: logseq 9907, block 1980
Recovery of Online Redo Log: Thread 2 Group 3 Seq 9907 Reading mem 0
  Mem# 0: +DATA/orcl/onlinelog/group_3.265.1078882689
  Mem# 1: +OCR/orcl/onlinelog/group_3.259.1078882689
Completed redo application of 0.32MB
Completed instance recovery at
 Thread 2: logseq 9907, block 3755, scn 231951271
 338 data blocks read, 348 data blocks written, 887 redo k-bytes read
Mon Sep 26 06:44:36 2022
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl1\trace\orcl1_smon_7972.trc  (incident=208205):
ORA-00600: 内部错误代码, 参数: [3417], [3], [0], [1], [0], [3], [3], [], [], [], [], []
Mon Sep 26 06:44:38 2022
Reconfiguration started (old inc 14, new inc 16)
List of instances:
 1 2 (myinst: 1) 
 Global Resource Directory frozen
 Communication channels reestablished
 Master broadcasted resource hash value bitmaps
 Non-local Process blocks cleaned out
Mon Sep 26 06:44:38 2022
 LMS 0: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
Mon Sep 26 06:44:38 2022
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Mon Sep 26 06:44:38 2022
 LMS 1: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
Mon Sep 26 06:44:38 2022
 LMS 2: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
 Set master node info 
 Submitted all remote-enqueue requests
 Dwn-cvts replayed, VALBLKs dubious
 All grantable enqueues granted
ORA-600 occurred during recovery, instance will be terminated
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl1\trace\orcl1_smon_7972.trc:
ORA-00600: 内部错误代码, 参数: [3417], [3], [0], [1], [0], [3], [3], [], [], [], [], []
System state dump requested by (instance=1,osid=7972 (SMON)),summary=[abnormal instance termination].
SMON (ospid: 7972): terminating the instance due to error 600
Mon Sep 26 06:44:43 2022
ORA-1092 : opitsk aborting process
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl1\trace\orcl1_diag_6956_20220926064442.trc:
ORA-00601: ??????
Mon Sep 26 06:44:46 2022
opiodr aborting process unknown ospid (6688) as a result of ORA-1092

Restarting the instance fails again with ORA-600 3417:

Completed: ALTER DATABASE   MOUNT
Mon Sep 26 08:08:34 2022
alter database open
This instance was first to open
Picked broadcast on commit scheme to generate SCNs
Mon Sep 26 08:08:36 2022
LGWR: STARTING ARCH PROCESSES
Mon Sep 26 08:08:36 2022
ARC0 started with pid=39, OS id=5004 
ARC0: Archival started
LGWR: STARTING ARCH PROCESSES COMPLETE
ARC0: STARTING ARCH PROCESSES
Mon Sep 26 08:08:37 2022
ARC1 started with pid=38, OS id=3568 
Mon Sep 26 08:08:37 2022
ARC2 started with pid=41, OS id=3308 
Mon Sep 26 08:08:37 2022
ARC3 started with pid=42, OS id=8180 
Mon Sep 26 08:08:37 2022
ARC4 started with pid=43, OS id=7768 
Mon Sep 26 08:08:37 2022
ARC5 started with pid=44, OS id=4628 
Mon Sep 26 08:08:37 2022
ARC6 started with pid=45, OS id=6920 
Mon Sep 26 08:08:37 2022
ARC7 started with pid=46, OS id=7960 
ARC1: Archival started
ARC2: Archival started
ARC3: Archival started
ARC4: Archival started
ARC5: Archival started
ARC6: Archival started
ARC1: Becoming the 'no FAL' ARCH
ARC1: Becoming the 'no SRL' ARCH
ARC2: Becoming the heartbeat ARCH
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl2\trace\orcl2_lgwr_7924.trc(incident=400186):
ORA-00600: ??????, ??: [3417], [3], [0], [1], [0], [3], [3], [], [], [], [], []
ARC7: Archival started
ARC0: STARTING ARCH PROCESSES COMPLETE
Mon Sep 26 08:08:39 2022
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl2\trace\orcl2_lgwr_7924.trc:
ORA-00600: ??????, ??: [3417], [3], [0], [1], [0], [3], [3], [], [], [], [], []
LGWR (ospid: 7924): terminating the instance due to error 470
Mon Sep 26 08:08:44 2022
ORA-1092 : opitsk aborting process

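The failure is fairly clear: after both nodes failed, starting one of them completes instance recovery, but the database cannot be opened. A 10046 trace shows that the open did not execute a single statement before the Windows service failed and the instance terminated.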

Further analysis confirms that the redo log groups are in an abnormal state:

SQL> select group#,sequence#,status,thread# from v$log;

    GROUP#  SEQUENCE# STATUS              THREAD#
---------- ---------- ---------------- ----------
         1      10837 CURRENT                   1
         2          0 UNUSED                    1
         3       9907 CURRENT                   2
         4          0 UNUSED                    2

Next, check whether the datafiles are in a normal state:

SQL> set pages 10000
SQL> set numw 16
SQL> SELECT status,
  2  checkpoint_change#,
  3  checkpoint_time,last_change#,
  4  count(*) ROW_NUM
  5  FROM v$datafile
  6  GROUP BY status, checkpoint_change#, checkpoint_time,last_change#
  7  ORDER BY status, checkpoint_change#, checkpoint_time;

STATUS  CHECKPOINT_CHANGE# CHECKPOINT_TIME         LAST_CHANGE#          ROW_NUM
------- ------------------ ------------------- ---------------- ----------------
ONLINE           231971314 2022-09-26 06:44:37        231971314               44
SYSTEM           231971314 2022-09-26 06:44:37        231971314                1

SQL> set numw 16
SQL> col CHECKPOINT_TIME for a40
SQL> set lines 150
SQL> set pages 1000
SQL> SELECT status,
  2  to_char(checkpoint_time,'yyyy-mm-dd hh24:mi:ss') checkpoint_time,FUZZY,checkpoint_change#,
  3  count(*) ROW_NUM
  4  FROM v$datafile_header
  5  GROUP BY status, checkpoint_change#, to_char(checkpoint_time,'yyyy-mm-dd hh24:mi:ss'),fuzzy
  6  ORDER BY status, checkpoint_change#, checkpoint_time;

STATUS  CHECKPOINT_TIME                          FUZ CHECKPOINT_CHANGE#          ROW_NUM
------- ---------------------------------------- --- ------------------ ----------------
ONLINE  2022-09-26 06:44:37                      NO           231971314               45

Given the above, this database can be opened directly, with zero data loss.
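
The reasoning behind that conclusion is that the datafile headers all share one checkpoint SCN, none of them is fuzzy, and that SCN agrees with what the control file records. A minimal Python sketch of the same comparison follows. It is an illustration of the check only, with a hypothetical function name and the query results passed in as plain tuples; it is not a complete test of whether a database is safe to open.

```python
def headers_consistent(header_rows, controlfile_scns=None):
    """Check grouped v$datafile_header rows for consistency.

    header_rows: iterable of (status, fuzzy, checkpoint_change) tuples.
    controlfile_scns: optional iterable of checkpoint_change# values
        taken from v$datafile, compared against the header SCN.

    Returns True only if there is at least one row, every row has
    FUZZY = 'NO', all rows share a single checkpoint SCN, and (when
    given) every control file SCN equals that value.
    """
    rows = list(header_rows)
    if not rows:
        return False
    if any(str(fuzzy).upper() != "NO" for _status, fuzzy, _scn in rows):
        return False
    header_scns = {scn for _status, _fuzzy, scn in rows}
    if len(header_scns) != 1:
        return False
    if controlfile_scns is not None:
        return set(controlfile_scns) == header_scns
    return True


if __name__ == "__main__":
    # Values taken from the two query results above.
    headers = [("ONLINE", "NO", 231971314)]
    controlfile = [231971314, 231971314]
    print(headers_consistent(headers, controlfile))
```

For the values shown in the two queries above, the check passes; a fuzzy header or a second distinct SCN would make it fail and call for a closer look before opening.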

ORA-12518 ORA-27302: failure occurred at: ssthrddcr

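Title: ORA-12518 ORA-27302: failure occurred at: ssthrddcr

Author: 惜分飞 © All rights reserved [May not be reproduced in any form without my consent; I reserve the right to pursue legal liability otherwise.]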

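An 11.2.0.1 database runs on a Windows 2008 machine with processes set to 1000, but once the number of connections passes roughly 170, clients start getting ORA-12518 TNS:listener could not hand off client connection.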

The alert log shows ORA-27300, ORA-27301 and ORA-27302 errors from the background processes:

Fri May 06 11:57:14 2022
Process startup failed, error stack:
Errors in file c:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_psp0_3052.trc:
ORA-27300: 操作系统系统相关操作: CreateThread 失败, 状态为: 8
ORA-27301: 操作系统故障消息: 存储空间不足,无法处理此命令。
ORA-27302: 错误发生在: ssthrddcr
Process J002 died, see its trace file
kkjcre1p: unable to spawn jobq slave process 
Errors in file c:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_cjq0_3808.trc:

Analysis against MOS confirms that the problem lies in the value of the IRPStackSize parameter.

Reference: Ora-27300: OS System Dependent Operation:Createthread Failed With Status: 8 (Doc ID 1310453.1)