ORA-600 ktubko_1 恢复

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:ORA-600 ktubko_1 恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

oracle 12.2的rac库,pdb在open成功之后,没过多久会自动crash掉,主要报错ORA-600 ktubko_1

2022-10-08T16:00:17.874444+08:00
XFF(5):Endian type of dictionary set to little
2022-10-08T16:00:18.602483+08:00
XFF(5):[218515] Successfully onlined Undo Tablespace 26.
XFF(5):Undo initialization finished serial:0 start:73483625 end:73484200 diff:575 ms (0.6 seconds)
XFF(5):Database Characterset for XFF is ZHS16GBK
2022-10-08T16:00:19.340271+08:00
Buffer Cache Full DB Caching mode changing from FULL CACHING ENABLED to FULL CACHING DISABLED 
Full DB Caching disabled: DEFAULT_CACHE_SIZE should be at least 1394670 MBs bigger than current size. 
2022-10-08T16:00:21.308122+08:00
XFF(5):Opening pdb with no Resource Manager plan active
2022-10-08T16:00:22.655433+08:00
Pluggable database XFF opened read write
Completed:  ALTER PLUGGABLE DATABASE ALL OPEN
2022-10-08T16:00:36.419719+08:00
XFF(5):Setting Resource Manager plan SCHEDULER[0x4AC8]:DEFAULT_MAINTENANCE_PLAN via scheduler window
XFF(5):Setting Resource Manager plan DEFAULT_MAINTENANCE_PLAN via parameter
2022-10-08T16:00:57.054295+08:00
XFF(5):minact-scn: got error during useg scan e:1555 usn:57
XFF(5):minact-scn: useg scan erroring out with error e:1555
2022-10-08T16:01:41.527943+08:00
Errors in file /u01/app/db/diag/rdbms/orcl/orcl1/trace/orcl1_smon_218039.trc  (incident=737693) (PDBNAME=XFF):
ORA-00600: internal error code, arguments: [ktubko_1], [], [], [], [], [], [], [], [], [], [], []
XFF(5):Incident details in: /u01/app/db/diag/rdbms/orcl/orcl1/incident/incdir_737693/orcl1_smon_218039_i737693.trc
XFF(5):Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
2022-10-08T16:01:41.530481+08:00
XFF(5):*****************************************************************
XFF(5):An internal routine has requested a dump of selected redo.
XFF(5):This usually happens following a specific internal error, when
XFF(5):analysis of the redo logs will help Oracle Support with the
XFF(5):diagnosis.
XFF(5):It is recommended that you retain all the redo logs generated (by
XFF(5):all the instances) during the past 12 hours, in case additional
XFF(5):redo dumps are required to help with the diagnosis.
XFF(5):*****************************************************************
2022-10-08T16:01:42.611317+08:00
XFF(5):*****************************************************************
XFF(5):An internal routine has requested a dump of selected redo.
XFF(5):This usually happens following a specific internal error, when
XFF(5):analysis of the redo logs will help Oracle Support with the
XFF(5):diagnosis.
XFF(5):It is recommended that you retain all the redo logs generated (by
XFF(5):all the instances) during the past 12 hours, in case additional
XFF(5):redo dumps are required to help with the diagnosis.
XFF(5):*****************************************************************
XFF(5):ORACLE Instance orcl1 (pid = 44) - Error 600 encountered while recovering transaction (12, 1) on object 50.
2022-10-08T16:01:42.611961+08:00
XFF(5):Errors in file /u01/app/db/diag/rdbms/orcl/orcl1/trace/orcl1_smon_218039.trc:
ORA-00600: internal error code, arguments: [ktubko_1], [], [], [], [], [], [], [], [], [], [], []
2022-10-08T16:01:42.849438+08:00
Errors in file /u01/app/db/diag/rdbms/orcl/orcl1/trace/orcl1_smon_218039.trc  (incident=737694) (PDBNAME=XFF):
ORA-00600: internal error code, arguments: [ktubko_1], [], [], [], [], [], [], [], [], [], [], []
XFF(5):Incident details in: /u01/app/db/diag/rdbms/orcl/orcl1/incident/incdir_737694/orcl1_smon_218039_i737694.trc
…………
2022-10-08T16:01:55.212368+08:00
Instance Critical Process (pid: 44, ospid: 218039, SMON) died unexpectedly
PMON (ospid: 217933): terminating the instance due to error 474
2022-10-08T16:01:55.379857+08:00
System state dump requested by (instance=1, osid=217933 (PMON)), summary=[abnormal instance termination].
System State dumped to trace file /u01/app/db/diag/rdbms/orcl/orcl1/trace/orcl1_diag_217966_20221008160155.trc
2022-10-08T16:01:56.417514+08:00
ORA-1092 : opitsk aborting process

因为有smon报的ORACLE Instance orcl1 (pid = 44) – Error 600 encountered while recovering transaction (12, 1) on object xxx这种比较明显错误,基本上可以定位是undo问题.对undo异常事务进行处理,数据库顺利open,并且稳定不再crash,然后对异常对象进行处理(当然也可以逻辑迁移)
20221008220645


在oracle 12.2到18.14的rac环境的cdb库中,如果节点sga大小不一致,而且有一个节点sga大于128G,就可能出现该问题,敬请注意
20221008220914

Bug 32347014: ORA-600[4506], ORA-600[KTUBKO_1] OCCUR AND INSTANCE CRASHES

一键修改Oracle SCN工具升级(patch scn)

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:一键修改Oracle SCN工具升级(patch scn)

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

前段时间开发了基于内存修改的patch scn小程序(修改oracle scn小工具(patch scn)),用来修改oracle数据库scn,实现快速调整oracle scn的功能(特别是在一些恢复的时候很有用),最近对其功能进一步完善,扩展了通过修改oracle controlfile实现调整scn功能,简单测试如下
数据库当前scn情况

C:\Users\XFF>sqlplus / as sysdba

SQL*Plus: Release 11.2.0.4.0 Production on 星期四 10月 6 21:50:32 2022

Copyright (c) 1982, 2013, Oracle.  All rights reserved.


连接到:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> set num 16
SQL> select checkpoint_change# from v$database;

CHECKPOINT_CHANGE#
------------------
      141743308712

SQL> shutdown immediate;
数据库已经关闭。
已经卸载数据库。
ORACLE 例程已经关闭。

通过工具一键修改oracle scn
20221006220814


启动库查看修改之后scn值

SQL> startup mount;
ORACLE 例程已经启动。

Total System Global Area 6413680640 bytes
Fixed Size                  2293216 bytes
Variable Size            1275068960 bytes
Database Buffers         5117050880 bytes
Redo Buffers               19267584 bytes
数据库装载完毕。
SQL> set num 16
SQL> select checkpoint_change# from v$database;

CHECKPOINT_CHANGE#
------------------
      141743339676

SQL> alter database open;

数据库已更改。

SQL> select checkpoint_change# from v$database;

CHECKPOINT_CHANGE#
------------------
      151743339602

数据库启动过程中scn会被改变,在mount的时候未发生改变
Patch_SCN下载:Patch_SCN下载
Patch_SCN使用说明:Patch_SCN使用说明
OraRecovery下载:OraRecovery下载
OraRecovery使用说明:OraRecovery使用说明

.mkp和.Elbie勒索加密数据库可恢复

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:.mkp和.Elbie勒索加密数据库可恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

最近有朋友咨询了两种win机器文件加密的oracle数据库,通过判断均可修复然后正常open库
.DBF.[5D00A5FE].[Xats@privatemail.com].mkp,可以实现数据文件数据0丢失,和强制拉库效果一样
20221006090624
.DBF.id[10D73871-3400].[hero77@cock.li].Elbie,每个文件丢失数据小于1M
20221006090802
以上两种勒索加密都可以通过Oracle数据文件勒索加密恢复工具,快速open
20221006091325
对于类似这种被加密的勒索的数据文件(.[Xats@privatemail.com].mkp和.[hero77@cock.li].Elbie),我们可以实现比较好的恢复效果,如果此类的数据库(oracle,mysql,sql server)等被加密,需要专业恢复技术支持,请联系我们:
电话/微信:17813235971    Q Q:107644445QQ咨询惜分飞    E-Mail:dba@xifenfei.com
系统安全防护措施建议:
1.多台机器,不要使用相同的账号和口令
2.登录口令要有足够的长度和复杂性,并定期更换登录口令
3.重要资料的共享文件夹应设置访问权限控制,并进行定期备份
4.定期检测系统和软件中的安全漏洞,及时打上补丁。
5.定期到服务器检查是否存在异常。
6.安装安全防护软件,并确保其正常运行。
7.从正规渠道下载安装软件。
8.对不熟悉的软件,如果已经被杀毒软件拦截查杀,不要添加信任继续运行。
9.保存良好的备份习惯,尽量做到每日备份,异地备份。

存储强制拉lun导致数据库异常恢复

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:存储强制拉lun导致数据库异常恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

客户那边反馈有几个lun异常,无法正常online
20221004165647


通过存储工程师强制拉起来lun(清除掉了cache),但是数据库无法正常mount

Fri Sep 30 17:22:57 BEIST 2022
ALTER DATABASE   MOUNT
Fri Sep 30 17:22:57 BEIST 2022
This instance was first to mount
Fri Sep 30 17:22:58 BEIST 2022
Starting background process ASMB
ASMB started with pid=25, OS id=12976304
Starting background process RBAL
RBAL started with pid=26, OS id=12779520
Fri Sep 30 17:23:02 BEIST 2022
SUCCESS: diskgroup DATA was mounted
Fri Sep 30 17:23:06 BEIST 2022
Errors in file /home/oracle/admin/xifenfei/udump/xifenfei2_ora_14549110.trc:
ORA-00600: internal error code, arguments: [kccpb_sanity_check_2], [423012], [422765], [0x000000000], [], [], [], []
Fri Sep 30 17:23:07 BEIST 2022
ORA-600 signalled during: ALTER DATABASE   MOUNT...
Fri Sep 30 17:23:07 BEIST 2022
Trace dumping is performing id=[cdmp_20220930172307]
Fri Sep 30 17:23:09 BEIST 2022
Shutting down instance (abort)
License high water mark = 1
Instance terminated by USER, pid = 9175148

可以要求保护第一现场,把asm中的数据恢复到文件系统中,然后进行恢复,由于客户是10g的环境,无法直接使用asmcmd中的cp实现此项操作,数据库也没有mount成功(无法使用rman的copy),考虑使用oracle的amdu实现此项操作需求.在拷贝过程中报AMDU-00204报错

root@xifenfei2:/recover/amduo#./amdu -diskstring '/dev/rhdiskpower*' -extract data.298 -noreport
amdu_2022_10_01_14_25_31/
AMDU-00204: file not found; arguments: [3] [DATA]
LEM-00031: Error encountered in lempgmh after calling lmserr.

通过dbv校验拷贝出来的数据文件

racle@xifenfei2:/recover#dbv file=/recover/amduo/amdu_2022_10_01_14_25_31/DATA_298.f

DBVERIFY: Release 10.2.0.5.0 - Production on Sat Oct 1 14:36:50 2022

Copyright (c) 1982, 2007, Oracle.  All rights reserved.

DBVERIFY - Verification starting : FILE = /recover/amduo/amdu_2022_10_01_14_33_26/DATA_298.f


DBVERIFY - Verification complete

Total Pages Examined         : 262144
Total Pages Processed (Data) : 0
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 0
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 262143
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 1
Total Pages Marked Corrupt   : 0
Total Pages Influx           : 0
Highest block SCN            : 121293100 (0.121293100)

确认此报错(AMDU-00204 LEM-00031)对于拷贝出来的数据文件无直接影响,可以忽略,拷贝出来所有文件进行重建ctl,报ORA-01159错误

SQL> CREATE CONTROLFILE REUSE DATABASE "xifenfei" NORESETLOGS  NOARCHIVELOG  
  2      MAXLOGFILES 50  
    MAXLOGMEMBERS 5  
    MAXDATAFILES 100  
    MAXINSTANCES 8  
    MAXLOGHISTORY 226  
LOGFILE  
group 5 '/recover/df/DATA_262.f'  size 200M ,
group 2 '/recover/df/DATA_266.f'  size 200M ,
group 1 '/recover/df/DATA_267.f'  size 200M ,
group 3 '/recover/df/DATA_281.f'  size 200M ,
group 4 '/recover/df/DATA_282.f'  size 200M ,
group 6 '/recover/df/DATA_283.f'  size 200M
DATAFILE  
'/recover/df/DATA_295.f',
'/recover/df/DATA_298.f',
'/recover/df/DATA_272.f',
'/recover/df/DATA_273.f',
'/recover/df/DATA_296.f',
'/recover/df/DATA_274.f',
'/recover/df/DATA_276.f',
'/recover/df/DATA_277.f',
'/recover/df/DATA_275.f',
'/recover/df/DATA_279.f',
'/recover/df/DATA_278.f',
'/recover/df/DATA_288.f',
'/recover/df/DATA_269.f',
'/recover/df/DATA_300.f',
'/recover/df/DATA_264.f',
'/recover/df/DATA_287.f',
'/recover/df/DATA_280.f',
'/recover/df/DATA_286.f',
'/recover/df/DATA_268.f',
'/recover/df/DATA_285.f',
'/recover/df/DATA_297.f'
CHARACTER SET UTF8  
;  
 37  CREATE CONTROLFILE REUSE DATABASE "xifenfei" NORESETLOGS  NOARCHIVELOG
*
ERROR at line 1:
ORA-01503: CREATE CONTROLFILE failed
ORA-01159: file is not from same database as previous files - wrong database id
ORA-01110: data file 3: '/recover/df/DATA_288.f'

由于部分文件不是该库的,通过进一步分析,除掉不是该库的文件,重建ctl文件成功.尝试recover数据库,报大量ORA-07445错误,由于cache丢失redo损坏导致,此类操作可能导致数据文件损坏【恢复需要谨慎,最好对数据文件做一次备份】

Sat Oct 01 16:23:16 BEIST 2022
ALTER DATABASE RECOVER  database  
Media Recovery Start
 parallel recovery started with 15 processes
Sat Oct 01 16:23:16 BEIST 2022
Recovery of Online Redo Log: Thread 1 Group 5 Seq 2202 Reading mem 0
  Mem# 0: /recover/df/DATA_262.f
Sat Oct 01 16:23:16 BEIST 2022
Recovery of Online Redo Log: Thread 2 Group 6 Seq 2394 Reading mem 0
  Mem# 0: /recover/df/DATA_283.f
Sat Oct 01 16:23:28 BEIST 2022
Recovery of Online Redo Log: Thread 2 Group 3 Seq 2395 Reading mem 0
  Mem# 0: /recover/df/DATA_281.f
Sat Oct 01 16:23:34 BEIST 2022
Recovery of Online Redo Log: Thread 1 Group 1 Seq 2203 Reading mem 0
  Mem# 0: /recover/df/DATA_267.f
Sat Oct 01 16:23:35 BEIST 2022
Errors in file /home/oracle/admin/xifenfei/bdump/xifenfei2_p009_13107402.trc:
ORA-07445: exception encountered: core dump [kcbzfc+00dc] [SIGSEGV] 
Sat Oct 01 16:23:35 BEIST 2022
Errors in file /home/oracle/admin/xifenfei/bdump/xifenfei2_p014_7929944.trc:
ORA-07445: exception encountered: core dump [kcbzfc+00dc] [SIGSEGV]
Sat Oct 01 16:23:35 BEIST 2022
Errors in file /home/oracle/admin/xifenfei/bdump/xifenfei2_p011_10092678.trc:
ORA-07445: exception encountered: core dump [kcbzfc+00dc] [SIGSEGV] 
Sat Oct 01 16:23:35 BEIST 2022
Errors in file /home/oracle/admin/xifenfei/bdump/xifenfei2_dbw1_12189898.trc:
ORA-07445: exception encountered: core dump [kcbzdh+0324] [SIGSEGV] 
Sat Oct 01 16:23:35 BEIST 2022
Errors in file /home/oracle/admin/xifenfei/bdump/xifenfei2_p005_14549014.trc:
ORA-07445: exception encountered: core dump [kcbbufaddr2hdr+00d8] [SIGSEGV] 
Sat Oct 01 16:23:35 BEIST 2022
Hex dump of (file 6, block 10) in trace file /home/oracle/admin/xifenfei/bdump/xifenfei2_dbw0_13565988.trc
Corrupt block relative dba: 0x0180000a (file 6, block 10)
Bad header found during buffer corrupt after write
Data in bad block:
 type: 2 format: 1 rdba: 0x00000180
 last change scn: 0xa0c3.000a6eeb seq: 0x0 flg: 0x00
 spare1: 0x2 spare2: 0xa2 spare3: 0x3486
 consistency value in tail: 0x0000a0c3
 check value in block header: 0x204
 block checksum disabled
Reread of rdba: 0x0180000a (file 6, block 10) found different data
Sat Oct 01 16:23:35 BEIST 2022
Errors in file /home/oracle/admin/xifenfei/bdump/xifenfei2_dbw0_13565988.trc:
ORA-07445: exception encountered: core dump [kcbbiop+01b8] [SIGSEGV] [Invalid permissions for mapped object] 
Sat Oct 01 16:23:36 BEIST 2022
Errors in file /home/oracle/admin/xifenfei/bdump/xifenfei2_p014_7929944.trc:
ORA-07445: exception encountered: core dump [kcbs_dump_adv_state+027c] [SIGSEGV] 
ORA-07445: exception encountered: core dump [kcbzfc+00dc] [SIGSEGV] [Address not mapped to object]
Sat Oct 01 16:23:36 BEIST 2022
Trace dumping is performing id=[cdmp_20221001162336]
Sat Oct 01 16:23:37 BEIST 2022
Errors in file /home/oracle/admin/xifenfei/bdump/xifenfei2_p011_10092678.trc:
ORA-07445: exception encountered: core dump [kcbs_dump_adv_state+027c] [SIGSEGV] 
ORA-07445: exception encountered: core dump [kcbzfc+00dc] [SIGSEGV] [Address not mapped to object] 
Sat Oct 01 16:23:37 BEIST 2022
Errors in file /home/oracle/admin/xifenfei/bdump/xifenfei2_p009_13107402.trc:
ORA-07445: exception encountered: core dump [kcbs_dump_adv_state+027c] [SIGSEGV] 
ORA-07445: exception encountered: core dump [kcbzfc+00dc] [SIGSEGV] [Address not mapped to object]
Sat Oct 01 16:23:37 BEIST 2022
Errors in file /home/oracle/admin/xifenfei/bdump/xifenfei2_dbw1_12189898.trc:
ORA-07445: exception encountered: core dump [kcbs_dump_adv_state+027c] [SIGSEGV] 
ORA-07445: exception encountered: core dump [kcbzdh+0324] [SIGSEGV] [Address not mapped to object]
Sat Oct 01 16:23:37 BEIST 2022
Errors in file /home/oracle/admin/xifenfei/bdump/xifenfei2_p005_14549014.trc:
ORA-07445: exception encountered: core dump [kcbs_dump_adv_state+027c] [SIGSEGV] 
ORA-07445: exception encountered: core dump [kcbbufaddr2hdr+00d8] [SIGSEGV] [Address not mapped to object] 
Sat Oct 01 16:23:37 BEIST 2022
Errors in file /home/oracle/admin/xifenfei/bdump/xifenfei2_dbw0_13565988.trc:
ORA-07445: exception encountered: core dump [kcbs_dump_adv_state+027c] [SIGSEGV] 
ORA-07445: exception encountered: core dump [kcbbiop+01b8] [SIGSEGV] [Invalid permissions for mapped object]
Sat Oct 01 16:23:37 BEIST 2022
Errors in file /home/oracle/admin/xifenfei/bdump/xifenfei2_p005_14549014.trc:
ORA-00607: Internal error occurred while making a change to a data block
ORA-00602: internal programming exception
ORA-07445: exception encountered: core dump [kcbs_dump_adv_state+027c] [SIGSEGV] 
ORA-07445: exception encountered: core dump [kcbbufaddr2hdr+00d8] [SIGSEGV] [Address not mapped to object] 
Sat Oct 01 16:23:38 BEIST 2022
Errors in file /home/oracle/admin/xifenfei/bdump/xifenfei2_p010_9502918.trc:
ORA-00600: internal error code, arguments: [3020], [3], [31913], [2], [2395], [210418], [16], []
ORA-10567: Redo is inconsistent with data block (file# 3, block# 31913)
ORA-10564: tablespace SYSAUX
ORA-01110: data file 3: '/recover/df/DATA_278.f'
ORA-10560: block type 'FIRST LEVEL BITMAP BLOCK'
Sat Oct 01 16:23:39 BEIST 2022
Errors in file /home/oracle/admin/xifenfei/bdump/xifenfei2_p010_9502918.trc:
ORA-07445: exception encountered: core dump [kcbs_dump_adv_state+027c] [SIGSEGV] 
ORA-00600: internal error code, arguments: [3020], [3], [31913], [2], [2395], [210418], [16], []
ORA-10567: Redo is inconsistent with data block (file# 3, block# 31913)
ORA-10564: tablespace SYSAUX
ORA-01110: data file 3: '/recover/df/DATA_278.f'
ORA-10560: block type 'FIRST LEVEL BITMAP BLOCK'
Sat Oct 01 16:23:39 BEIST 2022
Errors in file /home/oracle/admin/xifenfei/bdump/xifenfei2_pmon_14418166.trc:
ORA-00471: DBWR process terminated with error
Sat Oct 01 16:23:39 BEIST 2022
PMON: terminating instance due to error 471
Sat Oct 01 16:23:40 BEIST 2022
Errors in file /home/oracle/admin/xifenfei/bdump/xifenfei2_p010_9502918.trc:
ORA-07445: exception encountered: core dump [kcbs_dump_adv_state+027c] [SIGSEGV] 
ORA-07445: exception encountered: core dump [kcbs_dump_adv_state+027c] [SIGSEGV] 
ORA-00600: internal error code, arguments: [3020], [3], [31913], [2], [2395], [210418], [16], []
ORA-10567: Redo is inconsistent with data block (file# 3, block# 31913)
ORA-10564: tablespace SYSAUX
ORA-01110: data file 3: '/recover/df/DATA_278.f'
ORA-10560: block type 'FIRST LEVEL BITMAP BLOCK'
Sat Oct 01 16:23:46 BEIST 2022
Dump system state for local instance only
System State dumped to trace file /home/oracle/admin/xifenfei/bdump/xifenfei2_diag_15401212.trc
Sat Oct 01 16:23:46 BEIST 2022
Trace dumping is performing id=[cdmp_20221001162346]
Sat Oct 01 16:23:49 BEIST 2022
Instance terminated by PMON, pid = 14418166

绕过redo,直接强制启动库,报ORA-01092错误

QL> startup mount pfile='/tmp/pfile'
ORACLE instance started.

Total System Global Area      10737418240 bytes
Fixed Size                        2114208 bytes
Variable Size                  1560284512 bytes
Database Buffers               9160359936 bytes
Redo Buffers                     14659584 bytes
Database mounted.
SQL> alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-01092: ORACLE instance terminated. Disconnection forced

分析alert日志,确认是由于undo异常导致

Additional information: 3
Sat Oct 01 17:25:21 BEIST 2022
Setting recovery target incarnation to 2
Sat Oct 01 17:25:21 BEIST 2022
Assigning activation ID 1094862311 (0x414245e7)
Thread 1 opened at log sequence 1
  Current log# 1 seq# 1 mem# 0: /recover/df/DATA_267.f1
Successful open of redo thread 1
Sat Oct 01 17:25:21 BEIST 2022
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Sat Oct 01 17:25:21 BEIST 2022
SMON: enabling cache recovery
Sat Oct 01 17:25:23 BEIST 2022
ORA-01555 caused by SQL statement below (SQL ID: 4krwuz0ctqxdt, SCN: 0x0000.6ee0bde5):
Sat Oct 01 17:25:23 BEIST 2022
select ctime, mtime, stime from obj$ where obj# = :1
Sat Oct 01 17:25:23 BEIST 2022
Errors in file /home/oracle/admin/xifenfei/udump/xifenfei2_ora_19726450.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 1
ORA-01555: snapshot too old: rollback segment number 8 with name "_SYSSMU8$" too small
Error 704 happened during db open, shutting down database
USER: terminating instance due to error 704
Instance terminated by USER, pid = 19726450
ORA-1092 signalled during: alter database open resetlogs...

类似此类错误的解决方案,以前写过参考:
在数据库open过程中常遇到ORA-01555汇总
数据库open过程遭遇ORA-1555对应sql语句补充
Oracle Recovery Tools恢复—ORA-00704 ORA-01555故障
使用_allow_resetlogs_corruption导致ORA-00704/ORA-01555故障
解决该问题,数据库启动正常,逻辑导出数据,导入数据完成此次恢复任务,实现绝大部分数据恢复

ORA-00316 ORA-00312故障处理

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:ORA-00316 ORA-00312故障处理

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

数据库启动报ORA-00316,ORA-00312,无法正常启动
ORA-00316


通过Oracle数据库异常恢复检查脚本(Oracle Database Recovery Check)分析,确认是当前redo损坏
20221003154221

对于这种情况,只能是屏蔽一致性,强制拉库,结果在拉库过程中报ORA-600 2662错误
ORA-600 2662

这个错误相对比较简单,修改下相关scn即可,数据库open成功

SQL> startup nomount pfile='/tmp/pfile'
ORACLE instance started.

Total System Global Area 3.8482E+10 bytes
Fixed Size                  2261368 bytes
Variable Size            8187285128 bytes
Database Buffers         3.0199E+10 bytes
Redo Buffers               93593600 bytes
SQL> CREATE CONTROLFILE REUSE DATABASE "xifenfei" NORESETLOGS  NOARCHIVELOG
  2          MAXLOGFILES 50
  3         MAXLOGMEMBERS 5
  4         MAXDATAFILES 1000
  5         MAXINSTANCES 8
  6         MAXLOGHISTORY 2920
  7  LOGFILE
  8         group 1   '/u01/oracle/app/oradata/xifenfei/redo01.log' size 500M,
  9         group 2   '/u01/oracle/app/oradata/xifenfei/redo02.log' size 500M,
 10         group 6   '/u01/oracle/app/oradata/xifenfei/redo06.log' size 500M,
 11         group 4   '/u01/oracle/app/oradata/xifenfei/redo04.log' size 500M,
 12         group 5   '/u01/oracle/app/oradata/xifenfei/redo05.log' size 500M,
 13         group 3   '/u01/oracle/app/oradata/xifenfei/redo03.log' size 500M
 14  DATAFILE
 15          '/u01/oracle/app/oradata/xifenfei/system01.dbf',
 16          '/u01/oracle/app/oradata/xifenfei/sysaux01.dbf',
 17          '/u01/oracle/app/oradata/xifenfei/undotbs01.dbf',
 18          '/u01/oracle/app/oradata/xifenfei/users01.dbf',
………………
 49          '/u01/oracle/app/oradata/xifenfei/XIFENFEI.dbf'
 50  CHARACTER SET  ZHS16GBK ;

Control file created.

SQL> recover database;
ORA-10877: error signaled in parallel recovery slave


SQL> recover database until cancel;
ORA-00279: change 2290050101 generated at 09/30/2022 23:18:22 needed for thread
1
ORA-00289: suggestion : /u02/oracle/arch/1_2_1116803861.dbf
ORA-00280: change 2290050101 for thread 1 is in sequence #2


Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
cancel
ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1: '/u01/oracle/app/oradata/xifenfei/system01.dbf'


ORA-01112: media recovery not started


SQL> alter database open resetlogs;

Database altered.

检查数据库字典一致性

SQL> @1
HCheck Version 07MAY18 on 01-OCT-2022 01:07:48
----------------------------------------------
Catalog Version 11.2.0.4.0 (1102000400)
db_name: XIFENFEI

                                   Catalog       Fixed
Procedure Name                     Version    Vs Release    Timestamp
Result
------------------------------ ... ---------- -- ---------- --------------
------
.- LobNotInObj                 ... 1102000400 <=  *All Rel* 10/01 01:07:48 PASS
.- MissingOIDOnObjCol          ... 1102000400 <=  *All Rel* 10/01 01:07:48 PASS
.- SourceNotInObj              ... 1102000400 <=  *All Rel* 10/01 01:07:48 PASS
.- OversizedFiles              ... 1102000400 <=  *All Rel* 10/01 01:07:48 PASS
.- PoorDefaultStorage          ... 1102000400 <=  *All Rel* 10/01 01:07:48 PASS
.- PoorStorage                 ... 1102000400 <=  *All Rel* 10/01 01:07:48 PASS
.- TabPartCountMismatch        ... 1102000400 <=  *All Rel* 10/01 01:07:48 PASS
.- OrphanedTabComPart          ... 1102000400 <=  *All Rel* 10/01 01:07:48 PASS
.- MissingSum$                 ... 1102000400 <=  *All Rel* 10/01 01:07:48 PASS
.- MissingDir$                 ... 1102000400 <=  *All Rel* 10/01 01:07:48 PASS
.- DuplicateDataobj            ... 1102000400 <=  *All Rel* 10/01 01:07:48 PASS
.- ObjSynMissing               ... 1102000400 <=  *All Rel* 10/01 01:07:49 PASS
.- ObjSeqMissing               ... 1102000400 <=  *All Rel* 10/01 01:07:49 PASS
.- OrphanedUndo                ... 1102000400 <=  *All Rel* 10/01 01:07:49 PASS
.- OrphanedIndex               ... 1102000400 <=  *All Rel* 10/01 01:07:49 PASS
.- OrphanedIndexPartition      ... 1102000400 <=  *All Rel* 10/01 01:07:49 PASS
.- OrphanedIndexSubPartition   ... 1102000400 <=  *All Rel* 10/01 01:07:49 PASS
.- OrphanedTable               ... 1102000400 <=  *All Rel* 10/01 01:07:49 PASS
.- OrphanedTablePartition      ... 1102000400 <=  *All Rel* 10/01 01:07:49 PASS
.- OrphanedTableSubPartition   ... 1102000400 <=  *All Rel* 10/01 01:07:49 PASS
.- MissingPartCol              ... 1102000400 <=  *All Rel* 10/01 01:07:49 PASS
.- OrphanedSeg$                ... 1102000400 <=  *All Rel* 10/01 01:07:49 PASS
.- OrphanedIndPartObj#         ... 1102000400 <=  *All Rel* 10/01 01:07:49 PASS
.- DuplicateBlockUse           ... 1102000400 <=  *All Rel* 10/01 01:07:49 PASS
.- FetUet                      ... 1102000400 <=  *All Rel* 10/01 01:07:49 PASS
.- Uet0Check                   ... 1102000400 <=  *All Rel* 10/01 01:07:49 PASS
.- SeglessUET                  ... 1102000400 <=  *All Rel* 10/01 01:07:49 PASS
.- BadInd$                     ... 1102000400 <=  *All Rel* 10/01 01:07:49 PASS
.- BadTab$                     ... 1102000400 <=  *All Rel* 10/01 01:07:49 PASS
.- BadIcolDepCnt               ... 1102000400 <=  *All Rel* 10/01 01:07:49 PASS
.- ObjIndDobj                  ... 1102000400 <=  *All Rel* 10/01 01:07:49 PASS
.- TrgAfterUpgrade             ... 1102000400 <=  *All Rel* 10/01 01:07:49 PASS
.- ObjType0                    ... 1102000400 <=  *All Rel* 10/01 01:07:49 PASS
.- BadOwner                    ... 1102000400 <=  *All Rel* 10/01 01:07:49 PASS
.- StmtAuditOnCommit           ... 1102000400 <=  *All Rel* 10/01 01:07:49 PASS
.- BadPublicObjects            ... 1102000400 <=  *All Rel* 10/01 01:07:49 PASS
.- BadSegFreelist              ... 1102000400 <=  *All Rel* 10/01 01:07:49 PASS
.- BadDepends                  ... 1102000400 <=  *All Rel* 10/01 01:07:49 PASS
.- CheckDual                   ... 1102000400 <=  *All Rel* 10/01 01:07:50 PASS
.- ObjectNames                 ... 1102000400 <=  *All Rel* 10/01 01:07:50 PASS
.- BadCboHiLo                  ... 1102000400 <=  *All Rel* 10/01 01:07:50 PASS
.- ChkIotTs                    ... 1102000400 <=  *All Rel* 10/01 01:07:50 PASS
.- NoSegmentIndex              ... 1102000400 <=  *All Rel* 10/01 01:07:50 PASS
.- BadNextObject               ... 1102000400 <=  *All Rel* 10/01 01:07:50 PASS
.- DroppedROTS                 ... 1102000400 <=  *All Rel* 10/01 01:07:50 PASS
.- FilBlkZero                  ... 1102000400 <=  *All Rel* 10/01 01:07:50 PASS
.- DbmsSchemaCopy              ... 1102000400 <=  *All Rel* 10/01 01:07:50 PASS
.- OrphanedObjError            ... 1102000400 >  1102000000 10/01 01:07:50 PASS
.- ObjNotLob                   ... 1102000400 <=  *All Rel* 10/01 01:07:50 PASS
.- MaxControlfSeq              ... 1102000400 <=  *All Rel* 10/01 01:07:50 PASS
.- SegNotInDeferredStg         ... 1102000400 >  1102000000 10/01 01:07:50 PASS
.- SystemNotRfile1             ... 1102000400 >   902000000 10/01 01:07:50 PASS
.- DictOwnNonDefaultSYSTEM     ... 1102000400 <=  *All Rel* 10/01 01:07:50 PASS
.- OrphanTrigger               ... 1102000400 <=  *All Rel* 10/01 01:07:50 PASS
.- ObjNotTrigger               ... 1102000400 <=  *All Rel* 10/01 01:07:50 PASS
---------------------------------------
01-OCT-2022 01:07:50  Elapsed: 2 secs
---------------------------------------
Found 0 potential problem(s) and 0 warning(s)

PL/SQL procedure successfully completed.

Statement processed.

数据库字典本身没有大问题,但是为了排除潜在风险,建议逻辑迁移到新库

asm disk被分区,格式化为ext4恢复

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:asm disk被分区,格式化为ext4恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

有客户因为没有认识到linux中的磁盘被asm使用,对其进行分区并且做成了ext4的文件系统,从history中获取客户操作命令

  600  fdisk -l
  601  fdisk /dev/sdb
  602  mkfs ext4 /dev/sdb1 
  603  fdisk -l
  604  mkfs -t ext4 /dev/sdb1 
  605  cd /
  606  mkdir u01
  607  mount /dev/sdb1 /u01
  608  df -h

确认磁盘情况,确认sdb直接被asm磁盘使用(asmdisk1)

[grid@racdb3 trace]$ ls -l /dev/asm*
brw-rw---- 1 grid asmadmin 8, 16 Sep 30 14:34 /dev/asmdisk1
[grid@racdb3 trace]$ ls -l /dev/sd*
brw-rw---- 1 root disk 8,  0 Jul 27  2021 /dev/sda
brw-rw---- 1 root disk 8,  1 Jul 27  2021 /dev/sda1
brw-rw---- 1 root disk 8,  2 Jul 27  2021 /dev/sda2
brw-rw---- 1 root disk 8, 16 Sep 30 11:23 /dev/sdb
brw-rw---- 1 root disk 8, 17 Sep 30 11:23 /dev/sdb1
brw-rw---- 1 root disk 8, 32 Jul 27  2021 /dev/sdc

asm日志报错

Fri Sep 30 11:31:41 2022
NOTE: SMON starting instance recovery for group DATA domain 1 (mounted)
NOTE: SMON skipping disk 0 - no header
NOTE: cache initiating offline of disk 0 group DATA
NOTE: process _smon_+asm3 (2989) initiating offline of disk 0.3915953109 (DATA_0000) with mask 0x7e in group 1
NOTE: initiating PST update: grp = 1, dsk = 0/0xe968b3d5, mask = 0x6a, op = clear
Fri Sep 30 11:31:41 2022
GMON updating disk modes for group 1 at 4 for pid 17, osid 2989
ERROR: Disk 0 cannot be offlined, since diskgroup has external redundancy.
ERROR: too many offline disks in PST (grp 1)
Fri Sep 30 11:31:41 2022
NOTE: cache dismounting (not clean) group 1/0x34F84324 (DATA) 
WARNING: Offline for disk DATA_0000 in mode 0x7f failed.
Fri Sep 30 11:31:41 2022
NOTE: halting all I/Os to diskgroup 1 (DATA)
ERROR: No disks with F1X0 found on disk group DATA
NOTE: aborting instance recovery of domain 1 due to diskgroup dismount
NOTE: SMON skipping lock domain (1) validation because diskgroup being dismounted

数据库日志报错

Fri Sep 30 11:31:44 2022
Errors in file /oracle/app/oracle/diag/rdbms/xifenfei/xifenfei3/trace/xifenfei3_lmon_26356.trc:
ORA-00202: control file: '+DATA/xifenfei/controlfile/current.256.968794097'
ORA-15078: ASM diskgroup was forcibly dismounted
Fri Sep 30 11:31:45 2022
Errors in file /oracle/app/oracle/diag/rdbms/xifenfei/xifenfei3/trace/xifenfei3_ckpt_26388.trc:
ORA-00206: error in writing (block 5, # blocks 1) of control file
ORA-00202: control file: '+DATA/xifenfei/controlfile/current.257.968794097'
ORA-15078: ASM diskgroup was forcibly dismounted
ORA-15078: ASM diskgroup was forcibly dismounted
ORA-00206: error in writing (block 5, # blocks 1) of control file
ORA-00202: control file: '+DATA/xifenfei/controlfile/current.256.968794097'
ORA-15078: ASM diskgroup was forcibly dismounted
ORA-15078: ASM diskgroup was forcibly dismounted
Errors in file /oracle/app/oracle/diag/rdbms/xifenfei/xifenfei3/trace/xifenfei3_ckpt_26388.trc:
ORA-00221: error on write to control file
ORA-00206: error in writing (block 5, # blocks 1) of control file
ORA-00202: control file: '+DATA/xifenfei/controlfile/current.257.968794097'
ORA-15078: ASM diskgroup was forcibly dismounted
ORA-15078: ASM diskgroup was forcibly dismounted
ORA-00206: error in writing (block 5, # blocks 1) of control file
ORA-00202: control file: '+DATA/xifenfei/controlfile/current.256.968794097'
ORA-15078: ASM diskgroup was forcibly dismounted
ORA-15078: ASM diskgroup was forcibly dismounted
CKPT (ospid: 26388): terminating the instance due to error 221

通过kfed 查看asm disk被破坏情况

[root@racdb3 scsi_host]#  kfed read /dev/asmdisk1 
kfbh.endian:                          0 ; 0x000: 0x00
kfbh.hard:                            0 ; 0x001: 0x00
kfbh.type:                            0 ; 0x002: KFBTYP_INVALID
kfbh.datfmt:                          0 ; 0x003: 0x00
kfbh.block.blk:                       0 ; 0x004: blk=0
kfbh.block.obj:                       0 ; 0x008: file=0
kfbh.check:                           0 ; 0x00c: 0x00000000
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
7F4FAAD45400 00000000 00000000 00000000 00000000  [................]
        Repeat 26 times
7F4FAAD455B0 00000000 00000000 45C222C8 01000000  [.........".E....]
7F4FAAD455C0 FE830001 003FFFFF E9D60000 0000FFFF  [......?.........]
7F4FAAD455D0 00000000 00000000 00000000 00000000  [................]
        Repeat 1 times
7F4FAAD455F0 00000000 00000000 00000000 AA550000  [..............U.]
7F4FAAD45600 00000000 00000000 00000000 00000000  [................]
  Repeat 223 times
KFED-00322: Invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][0]

[root@racdb3 scsi_host]#  kfed read /dev/asmdisk1  aun=2
kfbh.endian:                          0 ; 0x000: 0x00
kfbh.hard:                            0 ; 0x001: 0x00
kfbh.type:                            0 ; 0x002: KFBTYP_INVALID
kfbh.datfmt:                          0 ; 0x003: 0x00
kfbh.block.blk:                       0 ; 0x004: blk=0
kfbh.block.obj:                       0 ; 0x008: file=0
kfbh.check:                           0 ; 0x00c: 0x00000000
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
7F64E77A0400 00000000 00000000 00000000 00000000  [................]
        Repeat 223 times
7F64E77A1200 000081F9 000181F9 000281F9 000381F9  [................]
7F64E77A1210 000481F9 000C81F9 000D81F9 001881F9  [................]
7F64E77A1220 002881F9 003E81F9 007981F9 00AB81F9  [..(...>...y.....]
7F64E77A1230 013881F9 016C81F9 044581F9 04B081F9  [..8...l...E.....]
7F64E77A1240 061A81F9 0CD081F9 1E8481F9 00000000  [................]
7F64E77A1250 00000000 00000000 00000000 00000000  [................]
  Repeat 26 times
KFED-00322: Invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][0]

[root@racdb3 scsi_host]#  kfed read /dev/asmdisk1  aun=3
kfbh.endian:                          0 ; 0x000: 0x00
kfbh.hard:                            0 ; 0x001: 0x00
kfbh.type:                            0 ; 0x002: KFBTYP_INVALID
kfbh.datfmt:                          0 ; 0x003: 0x00
kfbh.block.blk:                       0 ; 0x004: blk=0
kfbh.block.obj:                       0 ; 0x008: file=0
kfbh.check:                           0 ; 0x00c: 0x00000000
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
7F8D101FF400 00000000 00000000 00000000 00000000  [................]
        Repeat 223 times
7F8D10200200 000082F9 000182F9 000282F9 000382F9  [................]
7F8D10200210 000482F9 000C82F9 000D82F9 001882F9  [................]
7F8D10200220 002882F9 003E82F9 007982F9 00AB82F9  [..(...>...y.....]
7F8D10200230 013882F9 016C82F9 044582F9 04B082F9  [..8...l...E.....]
7F8D10200240 061A82F9 0CD082F9 1E8482F9 00000000  [................]
7F8D10200250 00000000 00000000 00000000 00000000  [................]
  Repeat 26 times
KFED-00322: Invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][0]

[root@racdb3 scsi_host]#  kfed read /dev/asmdisk1  aun=4
kfbh.endian:                          0 ; 0x000: 0x00
kfbh.hard:                            0 ; 0x001: 0x00
kfbh.type:                            0 ; 0x002: KFBTYP_INVALID
kfbh.datfmt:                          0 ; 0x003: 0x00
kfbh.block.blk:                       0 ; 0x004: blk=0
kfbh.block.obj:                       0 ; 0x008: file=0
kfbh.check:                           0 ; 0x00c: 0x00000000
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
7F142949C400 00000000 00000000 00000000 00000000  [................]
        Repeat 223 times
7F142949D200 000083F9 000183F9 000283F9 000383F9  [................]
7F142949D210 000483F9 000C83F9 000D83F9 001883F9  [................]
7F142949D220 002883F9 003E83F9 007983F9 00AB83F9  [..(...>...y.....]
7F142949D230 013883F9 016C83F9 044583F9 04B083F9  [..8...l...E.....]
7F142949D240 061A83F9 0CD083F9 1E8483F9 00000000  [................]
7F142949D250 00000000 00000000 00000000 00000000  [................]
  Repeat 26 times
KFED-00322: Invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][0]

[root@racdb3 scsi_host]#  kfed read /dev/asmdisk1  aun=5
kfbh.endian:                          0 ; 0x000: 0x00
kfbh.hard:                            0 ; 0x001: 0x00
kfbh.type:                            0 ; 0x002: KFBTYP_INVALID
kfbh.datfmt:                          0 ; 0x003: 0x00
kfbh.block.blk:                       0 ; 0x004: blk=0
kfbh.block.obj:                       0 ; 0x008: file=0
kfbh.check:                           0 ; 0x00c: 0x00000000
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
7F0615CF6400 00000000 00000000 00000000 00000000  [................]
  Repeat 255 times
KFED-00322: Invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][0]

磁盘前几个au被破坏严重.而且相关的备份block都已经损坏,基于这种情况,直接参考:
asm磁盘dd破坏恢复
asm disk header 彻底损坏恢复
asm disk 磁盘部分被清空恢复
通过底层恢复出来相关数据文件,并检测正常
20221002105544


进一步通过au分配列表获恢复redo,ctl等文件

H:\TEMP\asm-ext4\other>dir 
 驱动器 H 中的卷是 SSD-SX
 卷的序列号是 84EB-F434

 H:\TEMP\asm-ext4\other 的目录

2022-09-30  21:52        25,165,824 256.dd
2022-09-30  21:52        25,165,824 257.dd
2022-09-30  23:52        52,429,312 258.dd.1
2022-09-30  23:54        52,429,312 259.dd.1
2022-09-30  23:55        52,429,312 260.dd.1
2022-09-30  23:55        52,429,312 261.dd.1
2022-09-30  23:56        52,429,312 270.dd.1
2022-09-30  23:57        52,429,312 271.dd.1
2022-09-30  23:57        52,429,312 272.dd.1
2022-09-30  23:57        52,429,312 273.dd.1
2022-09-30  23:58        52,429,312 274.dd.1
2022-10-01  00:01        52,429,312 275.dd.1
2022-10-01  00:00        52,429,312 276.dd.1
2022-10-01  00:00        52,429,312 277.dd.1
2022-10-01  00:00        52,429,312 278.dd.1
2022-09-30  23:59        52,429,312 279.dd.1
2022-09-30  23:59        52,429,312 280.dd.1
2022-09-30  23:59        52,429,312 281.dd.1

在另外的新机器上尝试恢复库

[oracle@xifenfei ~]$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.4.0 Production on Sat Oct 1 10:18:58 2022

Copyright (c) 1982, 2013, Oracle.  All rights reserved.

Connected to an idle instance.

SQL> startup mount pfile='/tmp/pfile'
ORACLE instance started.

Total System Global Area 1519898624 bytes
Fixed Size                  2253464 bytes
Variable Size             939527528 bytes
Database Buffers          570425344 bytes
Redo Buffers                7692288 bytes
ORA-00227: corrupt block detected in control file: (block 8, # blocks 1)
ORA-00202: control file: '/oradata/256.dd'

控制文件损坏,重建ctl

SQL> CREATE CONTROLFILE REUSE DATABASE "xifenfei" NORESETLOGS  NOARCHIVELOG  
  2      MAXLOGFILES 50  
  3      MAXLOGMEMBERS 5  
  4      MAXDATAFILES 100  
  5      MAXINSTANCES 8  
  6      MAXLOGHISTORY 226  
  7  LOGFILE  
  8  group 7  '/oradata/270.dd.1' size 50M,
  9  group 8  '/oradata/272.dd.1' size 50M,
 10  group 5  '/oradata/274.dd.1' size 50M,
 11  group 6  '/oradata/276.dd.1' size 50M,
 12  group 3  '/oradata/278.dd.1' size 50M,
 13  group 4  '/oradata/280.dd.1' size 50M,
 14  group 1  '/oradata/258.dd.1' size 50M,
 15  group 2  '/oradata/260.dd.1' size 50M
 16  DATAFILE  
 17  '/oradata/1',
 18  '/oradata/2',
 19  '/oradata/3',
 20  '/oradata/4',
 21  '/oradata/5',
 22  '/oradata/6',
 23  '/oradata/7',
 24  '/oradata/8',
 25  '/oradata/9',
 26  '/oradata/10',
 27  '/oradata/11'
 28  CHARACTER SET ZHS16GBK  
 29  ;  

Control file created.

尝试open库,报ORA-600 kqfidps_update_stats:2,ORA-600 4194等错误

SQL> recover database;
Media recovery complete.
SQL> alter database open ;
alter database open 
*
ERROR at line 1:
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [kqfidps_update_stats:2],
[0x7FFCCBEB3EC0], [], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [4193], [19319], [l.ok

解决该异常,open数据库成功

SQL> startup mount pfile='/tmp/pfile';
ORACLE instance started.

Total System Global Area 1519898624 bytes
Fixed Size                  2253464 bytes
Variable Size             939527528 bytes
Database Buffers          570425344 bytes
Redo Buffers                7692288 bytes
Database mounted.
SQL> alter database open;

Database altered.

导出数据库,遭遇个别表如下ORA-08103和ORA-01555两种错误,这种是由于个别block在做成文件系统的时候被损坏,底层恢复的时候block被置空导致,对其异常表进行单独处理即可

. . 正在导出表                           ALBUM
EXP-00056: 遇到 ORACLE 错误 8103
ORA-08103: 对象不再存在

. . 正在导出表                  M_PUSH_CONTENT
EXP-00056: 遇到 ORACLE 错误 1555
ORA-01555: 快照过旧: 回退段号  (名称为 "") 过小
ORA-22924: 快照太旧

通过上述操作,实现客户数据的恢复,最大限度挽回客户损坏,再次提醒对于asm disk进行了误操作,建议第一时间保护现场(不要有任何的写入操作,可以最大限度恢复数据)

ORA-15335 ORA-15130 ORA-15066 ORA-15196

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:ORA-15335 ORA-15130 ORA-15066 ORA-15196

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

客户反馈,数据库无法正常启动,通过分析asm的alert日志发现,data磁盘组mount成功之后,没有一会儿自动dismount掉

Mon Sep 26 16:40:14 2022
SQL> /* ASMCMD */ALTER DISKGROUP data MOUNT  
NOTE: cache registered group DATA number=2 incarn=0x9dfa705f
NOTE: cache began mount (first) of group DATA number=2 incarn=0x9dfa705f
NOTE: Assigning number (2,1) to disk (/dev/oracleasm/disks/DATA02)
NOTE: Assigning number (2,0) to disk (/dev/oracleasm/disks/DATA01)
Mon Sep 26 16:40:20 2022
NOTE: GMON heartbeating for grp 2
GMON querying group 2 at 68 for pid 25, osid 14650
NOTE: cache opening disk 0 of grp 2: DATA_0000 path:/dev/oracleasm/disks/DATA01
NOTE: F1X0 found on disk 0 au 2 fcn 0.0
NOTE: cache opening disk 1 of grp 2: DATA_0001 path:/dev/oracleasm/disks/DATA02
NOTE: cache mounting (first) external redundancy group 2/0x9DFA705F (DATA)
Mon Sep 26 16:40:20 2022
* allocate domain 2, invalid = TRUE 
kjbdomatt send to inst 2
Mon Sep 26 16:40:20 2022
NOTE: attached to recovery domain 2
NOTE: cache recovered group 2 to fcn 0.321845
NOTE: redo buffer size is 256 blocks (1053184 bytes)
Mon Sep 26 16:40:20 2022
NOTE: LGWR attempting to mount thread 1 for diskgroup 2 (DATA)
NOTE: LGWR found thread 1 closed at ABA 20.3546
NOTE: LGWR mounted thread 1 for diskgroup 2 (DATA)
NOTE: LGWR opening thread 1 at fcn 0.321845 ABA 21.3547
NOTE: cache mounting group 2/0x9DFA705F (DATA) succeeded
NOTE: cache ending mount (success) of group DATA number=2 incarn=0x9dfa705f
Mon Sep 26 16:40:20 2022
NOTE: Instance updated compatible.asm to 11.2.0.0.0 for grp 2
SUCCESS: diskgroup DATA was mounted
SUCCESS: /* ASMCMD */ALTER DISKGROUP data MOUNT 
Mon Sep 26 16:40:22 2022
WARNING: failed to online diskgroup resource ora.DATA.dg (unable to communicate with CRSD/OHASD)
Mon Sep 26 16:40:47 2022
NOTE: client xff1:xff registered, osid 14742, mbr 0x0
Mon Sep 26 16:40:57 2022
WARNING: cache read  a corrupt block: group=2(DATA) dsk=1 blk=257 disk=1 (DATA_0001) 
incarn=3916071178 au=113792 blk=1 count=1
Errors in file /opt/grid/diag/asm/+asm/+ASM1/trace/+ASM1_ora_14778.trc:
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483649] [257] [0 != 1]
NOTE: a corrupted block from group DATA was dumped to /opt/grid/diag/asm/+asm/+ASM1/trace/+ASM1_ora_14778.trc
WARNING: cache read (retry) a corrupt block: group=2(DATA) dsk=1 blk=257 
disk=1 (DATA_0001) incarn=3916071178 au=113792 blk=1 count=1
Errors in file /opt/grid/diag/asm/+asm/+ASM1/trace/+ASM1_ora_14778.trc:
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483649] [257] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483649] [257] [0 != 1]
ERROR: cache failed to read group=2(DATA) dsk=1 blk=257 from disk(s): 1(DATA_0001)
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483649] [257] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483649] [257] [0 != 1]
NOTE: cache initiating offline of disk 1 group DATA
NOTE: process _user14778_+asm1 (14778) initiating offline of 
disk 1.3916071178 (DATA_0001) with mask 0x7e in group 2
NOTE: initiating PST update: grp = 2, dsk = 1/0xe96a810a, mask = 0x6a, op = clear
Mon Sep 26 16:40:58 2022
GMON updating disk modes for group 2 at 70 for pid 28, osid 14778
ERROR: Disk 1 cannot be offlined, since diskgroup has external redundancy.
ERROR: too many offline disks in PST (grp 2)
Mon Sep 26 16:40:58 2022
NOTE: cache dismounting (not clean) group 2/0x9DFA705F (DATA) 
WARNING: Offline for disk DATA_0001 in mode 0x7f failed.
NOTE: messaging CKPT to quiesce pins Unix process pid: 14782, image: oracle@oracle11grac1 (B000)
Mon Sep 26 16:40:58 2022
NOTE: halting all I/Os to diskgroup 2 (DATA)
Errors in file /opt/grid/diag/asm/+asm/+ASM1/trace/+ASM1_ora_14778.trc  (incident=144548):
ORA-15335: ASM metadata corruption detected in disk group 'DATA'
ORA-15130: diskgroup "DATA" is being dismounted
ORA-15066: offlining disk "DATA_0001" in group "DATA" may result in a data loss
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483649] [257] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483649] [257] [0 != 1]
Incident details in: /opt/grid/diag/asm/+asm/+ASM1/incident/incdir_144548/+ASM1_ora_14778_i144548.trc
Mon Sep 26 16:40:58 2022
Sweep [inc][144548]: completed
System State dumped to trace file /opt/grid/diag/asm/+asm/+ASM1/incident/incdir_144548/+ASM1_ora_14778_i144548.trc
Mon Sep 26 16:40:58 2022
NOTE: AMDU dump of disk group DATA created at /opt/grid/diag/asm/+asm/+ASM1/incident/incdir_144548
Mon Sep 26 16:41:00 2022
NOTE: LGWR doing non-clean dismount of group 2 (DATA)
NOTE: LGWR sync ABA=21.3550 last written ABA 21.3550
Mon Sep 26 16:41:00 2022
Sweep [inc2][144548]: completed
Mon Sep 26 16:41:00 2022
ERROR: ORA-15130 in COD recovery for diskgroup 2/0x9dfa705f (DATA)
ERROR: ORA-15130 thrown in RBAL for group number 2
Errors in file /opt/grid/diag/asm/+asm/+ASM1/trace/+ASM1_rbal_5162.trc:
ORA-15130: diskgroup "DATA" is being dismounted

这里看主要是由于asm 磁盘组需要做COD recovery导致无法正常稳定的mount,主要原因是遭遇到asm disk的逻辑坏块(存储物理上看是ok的,但是实际数据在asm中看是异常的)

数据库alert日志报错

Mon Sep 26 16:40:52 2022
Successful mount of redo thread 1, with mount id 1097279951
Database mounted in Shared Mode (CLUSTER_DATABASE=TRUE)
Lost write protection disabled
Completed: alter database mount
alter database open
This instance was first to open
Picked broadcast on commit scheme to generate SCNs
LGWR: STARTING ARCH PROCESSES
Mon Sep 26 16:40:56 2022
ARC0 started with pid=40, OS id=14761 
ARC0: Archival started
LGWR: STARTING ARCH PROCESSES COMPLETE
ARC0: STARTING ARCH PROCESSES
Mon Sep 26 16:40:57 2022
ARC1 started with pid=41, OS id=14764 
Errors in file /opt/oracle/diag/rdbms/xff/xff1/trace/xff1_lgwr_14479.trc:
ORA-00313: ??????? 1 (???? 1) ???
Mon Sep 26 16:40:57 2022
ARC2 started with pid=42, OS id=14766 
Errors in file /opt/oracle/diag/rdbms/xff/xff1/trace/xff1_lgwr_14479.trc:
ORA-00313: ??????? 2 (???? 1) ???
Mon Sep 26 16:40:57 2022
Errors in file /opt/oracle/diag/rdbms/xff/xff1/trace/xff1_ora_14732.trc:
ORA-00313: open failed for members of log group 1 of thread 1
Mon Sep 26 16:40:57 2022
ARC3 started with pid=44, OS id=14770 
ARC1: Archival started
ARC2: Archival started
ARC1: Becoming the 'no FAL' ARCH
ARC1: Becoming the 'no SRL' ARCH
ARC2: Becoming the heartbeat ARCH
Errors in file /opt/oracle/diag/rdbms/xff/xff1/trace/xff1_ora_14732.trc:
ORA-00313: open failed for members of log group 1 of thread 1
Errors in file /opt/oracle/diag/rdbms/xff/xff1/trace/xff1_arc2_14766.trc:
ORA-00313: 无法打开日志组 1 (用于线程 1) 的成员
Errors in file /opt/oracle/diag/rdbms/xff/xff1/trace/xff1_arc1_14764.trc:
ORA-00313: 无法打开日志组 1 (用于线程 1) 的成员
Errors in file /opt/oracle/diag/rdbms/xff/xff1/trace/xff1_ora_14732.trc  (incident=180281):
ORA-15335: ASM metadata corruption detected in disk group 'DATA'
ORA-15130: diskgroup "DATA" is being dismounted
ORA-15066: offlining disk "DATA_0001" in group "DATA" may result in a data loss
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483649] [257] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483649] [257] [0 != 1]
ARC3: Archival started
ARC0: STARTING ARCH PROCESSES COMPLETE
Errors in file /opt/oracle/diag/rdbms/xff/xff1/trace/xff1_arc0_14761.trc:
ORA-00313: 无法打开日志组 1 (用于线程 1) 的成员
ORA-00312: 联机日志 1 线程 1: '+DATA/xff/onlinelog/group_1.271.1025610215'
ORA-17503: ksfdopn: 2 未能打开文件 +DATA/xff/onlinelog/group_1.271.1025610215
ORA-15130: diskgroup "DATA" is being dismounted
Errors in file /opt/oracle/diag/rdbms/xff/xff1/trace/xff1_arc3_14770.trc:
ORA-00313: 无法打开日志组 1 (用于线程 1) 的成员
ORA-00312: 联机日志 1 线程 1: '+DATA/xff/onlinelog/group_1.271.1025610215'
ORA-17503: ksfdopn: 2 未能打开文件 +DATA/xff/onlinelog/group_1.271.1025610215
ORA-15130: diskgroup "DATA" is being dismounted
Errors in file /opt/oracle/diag/rdbms/xff/xff1/trace/xff1_arc0_14761.trc:
ORA-00313: 无法打开日志组 1 (用于线程 1) 的成员
ORA-00312: 联机日志 1 线程 1: '+DATA/xff/onlinelog/group_1.271.1025610215'
ORA-17503: ksfdopn: 2 未能打开文件 +DATA/xff/onlinelog/group_1.271.1025610215
ORA-15130: diskgroup "DATA" is being dismounted
Errors in file /opt/oracle/diag/rdbms/xff/xff1/trace/xff1_arc3_14770.trc:
ORA-00313: 无法打开日志组 1 (用于线程 1) 的成员
ORA-00312: 联机日志 1 线程 1: '+DATA/xff/onlinelog/group_1.271.1025610215'
ORA-17503: ksfdopn: 2 未能打开文件 +DATA/xff/onlinelog/group_1.271.1025610215
ORA-15130: diskgroup "DATA" is being dismounted
Unable to create archive log file '+DATA'
Errors in file /opt/oracle/diag/rdbms/xff/xff1/trace/xff1_ora_14732.trc:
ORA-19816: WARNING: Files may exist in db_recovery_file_dest that are not known to database.
ORA-17502: ksfdcre:4 Failed to create file +DATA
ORA-15335: ASM metadata corruption detected in disk group 'DATA'
ORA-15130: diskgroup "DATA" is being dismounted
ORA-15066: offlining disk "DATA_0001" in group "DATA" may result in a data loss
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483649] [257] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483649] [257] [0 != 1]
*************************************************************
WARNING: A file of type ARCHIVED LOG may exist in
db_recovery_file_dest that is not known to the database.
Use the RMAN command CATALOG RECOVERY AREA to re-catalog
any such files. If files cannot be cataloged, then manually
delete them using OS command. This is most likely the
result of a crash during file creation.
*************************************************************
ARCH: Error 19504 Creating archive log file to '+DATA'
NOTE: Deferred communication with ASM instance
Errors in file /opt/oracle/diag/rdbms/xff/xff1/trace/xff1_ora_14732.trc:
ORA-15130: diskgroup "DATA" is being dismounted
NOTE: deferred map free for map id 23
Errors in file /opt/oracle/diag/rdbms/xff/xff1/trace/xff1_ora_14732.trc:
ORA-16038: log 1 sequence# 14235 cannot be archived
ORA-19504: failed to create file ""
ORA-00312: online log 1 thread 1: '+DATA/xff/onlinelog/group_1.271.1025610215'
ORA-00312: online log 1 thread 1: '+ARCH/xff/onlinelog/group_1.279.1025610217'
Mon Sep 26 16:40:58 2022
Sweep [inc][180281]: completed
Sweep [inc2][180281]: completed
USER (ospid: 14732): terminating the instance due to error 16038
Mon Sep 26 16:40:59 2022
System state dump requested by (instance=1, osid=14732), summary=[abnormal instance termination].
Instance terminated by USER, pid = 14732

对于这类故障处理相对比较容易,通过patch asm,让data磁盘组稳定mount,然后open库,迁移数据,实现数据0丢失,完美恢复

ORA-01558: out of transaction ID’s in rollback segment SYSTEM

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:ORA-01558: out of transaction ID’s in rollback segment SYSTEM

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

有客户数据库启动报ORA-01558: out of transaction ID’s in rollback segment SYSTorcl无法open成功,而且对应的备库也无法正常open(报同样错误)
ORA-01558


对ORA-01558错误分析

[oracle@iZbp1hx0enix3hix1kvyrxZ ~]$  oerr ora 1558
01558, 00000, "out of transaction ID's in rollback segment %s"
// *Cause: All the available transaction id's have been used
// *Action: Shutdown the instance and restart using other rollback segment(s),
//          then drop the rollback segment that has no more transaction id's.

官方对于该错误的解释为由于回滚段的事务ID被使用完,无法再分配从而报ORA-01558,解决方法是重启库使用其他回滚段,并删除这个异常回滚段.对于此次故障的报错,这个方法原则上行不通,因为异常的是systorcl里面的rollback回滚段(也就是我们通常说的系统回滚段).通过alert日志分析客户这个故障的前因后果。
正常运行的库,突然报该错误

Fri Jun 03 19:37:36 2022
Thread 1 advanced to log sequence 2766146 (LGWR switch)
  Current log# 3 seq# 2766146 morcl# 0: /oradata/orclr/group_3.310.962631159
Fri Jun 03 19:37:36 2022
LNS: Standby redo logfile selected for thread 1 sequence 2766146 for destination LOG_ARCHIVE_DEST_2
Fri Jun 03 19:37:37 2022
Archived Log entry 3560157 added for thread 1 sequence 2766145 ID 0xefc56f40 dest 1:
Fri Jun 03 19:37:58 2022
Errors in file /u01/app/oracle/diag/rdbms/orclr/orclr/trace/orclr_smon_9974.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-01558: out of transaction ID's in rollback segment SYSTorcl
Errors in file /u01/app/oracle/diag/rdbms/orclr/orclr/trace/orclr_smon_9974.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-01558: out of transaction ID's in rollback segment SYSTorcl
Errors in file /u01/app/oracle/diag/rdbms/orclr/orclr/trace/orclr_smon_9974.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-01558: out of transaction ID's in rollback segment SYSTorcl
Errors in file /u01/app/oracle/diag/rdbms/orclr/orclr/trace/orclr_smon_9974.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-01558: out of transaction ID's in rollback segment SYSTorcl
Errors in file /u01/app/oracle/diag/rdbms/orclr/orclr/trace/orclr_smon_9974.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-01558: out of transaction ID's in rollback segment SYSTorcl
Fri Jun 03 19:38:13 2022
Errors in file /u01/app/oracle/diag/rdbms/orclr/orclr/trace/orclr_smon_9974.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-01558: out of transaction ID's in rollback segment SYSTorcl

这个错误一直报了很长时间,等到有时间窗口,客户安排重启操作

Mon Sep 26 14:10:12 2022
Shutting down instance (immediate)
Shutting down instance: further logons disabled
License high water mark = 205
All dispatchers and shared servers shutdown
Errors in file /u01/app/oracle/diag/rdbms/orclr/orclr/trace/orclr_smon_9974.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-01558: out of transaction ID's in rollback segment SYSTorcl
ALTER DATABASE CLOSE NORMAL
ORA-1558 signalled during: ALTER DATABASE CLOSE NORMAL...
Mon Sep 26 14:10:22 2022
Errors in file /u01/app/oracle/diag/rdbms/orclr/orclr/trace/orclr_smon_9974.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-01558: out of transaction ID's in rollback segment SYSTorcl
Mon Sep 26 14:10:33 2022
Errors in file /u01/app/oracle/diag/rdbms/orclr/orclr/trace/orclr_smon_9974.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-01558: out of transaction ID's in rollback segment SYSTorcl
Mon Sep 26 14:10:43 2022
Errors in file /u01/app/oracle/diag/rdbms/orclr/orclr/trace/orclr_smon_9974.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-01558: out of transaction ID's in rollback segment SYSTorcl
Mon Sep 26 14:10:53 2022
Errors in file /u01/app/oracle/diag/rdbms/orclr/orclr/trace/orclr_smon_9974.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-01558: out of transaction ID's in rollback segment SYSTorcl
Mon Sep 26 14:10:55 2022
Shutting down instance (abort)
License high water mark = 205
USER (ospid: 25049): terminating the instance
Instance terminated by USER, pid = 25049

正常shutdown immediate执行失败,直接abort方式关闭库,然后尝试重启库

Mon Sep 26 14:12:16 2022
ARC3 started with pid=44, OS id=25978 
ARC1: Archival started
ARC2: Archival started
ARC1: Becoming the 'no FAL' ARCH
ARC1: Becoming the 'no SRL' ARCH
ARC2: Becoming the heartbeat ARCH
ARC3: Archival started
ARC0: STARTING ARCH PROCESSES COMPLETE
Thread 1 opened at log sequence 2782827
  Current log# 1 seq# 2782827 morcl# 0: /oradata/orclr/group_1.296.962631151
Successful open of redo thread 1
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Mon Sep 26 14:12:16 2022
SMON: enabling cache recovery
Mon Sep 26 14:12:16 2022
NSA2 started with pid=46, OS id=25982 
Archived Log entry 3593518 added for thread 1 sequence 2782826 ID 0xefc56f40 dest 1:
ARC3: Standby redo logfile selected for thread 1 sequence 2782826 for destination LOG_ARCHIVE_DEST_2
Errors in file /u01/app/oracle/diag/rdbms/orclr/orclr/trace/orclr_ora_25879.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-01558: out of transaction ID's in rollback segment SYSTorcl
Errors in file /u01/app/oracle/diag/rdbms/orclr/orclr/trace/orclr_ora_25879.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-01558: out of transaction ID's in rollback segment SYSTorcl
Error 604 happened during db open, shutting down database
USER (ospid: 25879): terminating the instance due to error 604
Instance terminated by USER, pid = 25879
ORA-1092 signalled during: ALTER DATABASE OPEN...
opiodr aborting process unknown ospid (25879) as a result of ORA-1092
Mon Sep 26 14:12:17 2022
ORA-1092 : opitsk aborting process

数据库重启失败,报ORA-604和ORA-01558错误.
对于这种情况,对于数据库启动过程进行跟踪确认在UPDATE UNDO$的时候无法分配事务,导致无法继续

PARSING IN CURSOR #139696084476000 len=160 dep=1 uid=0 oct=6 lid=0 tim=1664353484223278 hv=1292341136 
ad='6ff1ffa8' sqlid='8vyjutx6hg3wh'
update /*+ rule */ undo$ set name=:2,file#=:3,block#=:4,status$=:5,user#=:6,undosqn=:7,xactsqn=:8,
scnbas=:9,scnwrp=:10,inst#=:11,ts#=:12,spare1=:13 where us#=:1
END OF STMT
PARSE #139696084476000:c=5998,e=10708,p=7,cr=53,cu=0,mis=1,r=0,dep=1,og=3,plh=0,tim=1664353484223277
BINDS #139696084476000:
 Bind#0
  oacdty=01 mxl=32(21) mxlc=00 mal=00 scl=00 pre=00
  oacflg=18 fl2=0001 frm=01 csi=852 siz=32 off=0
  kxsbbbfp=6ff376da  bln=32  avl=21  flg=09
  value="_SYSSMU14_3733658264$"
 Bind#1
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=7f0d877d7a68  bln=24  avl=02  flg=05
  value=15
 Bind#2
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=7f0d877d7a38  bln=24  avl=03  flg=05
  value=336
 Bind#3
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=7f0d877d7a00  bln=24  avl=02  flg=05
  value=5
 Bind#4
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=7f0d877d79d0  bln=24  avl=02  flg=05
  value=1
 Bind#5
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=7f0d877d79a0  bln=24  avl=03  flg=05
  value=5747
 Bind#6
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=7f0d877d7970  bln=24  avl=04  flg=05
  value=22103
 Bind#7
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=7f0d877d7938  bln=24  avl=06  flg=05
  value=1026171661
 Bind#8
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=7f0d877d7680  bln=24  avl=03  flg=05
  value=3399
 Bind#9
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=7f0d877d7650  bln=24  avl=02  flg=05
  value=2
 Bind#10
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=7f0d877d7620  bln=24  avl=02  flg=05
  value=5
 Bind#11
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=7f0d877d75f0  bln=24  avl=02  flg=05
  value=2
 Bind#12
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=7f0d877d7a98  bln=22  avl=02  flg=05
  value=14
EXEC #139696084476000:c=1000,e=1713,p=0,cr=1,cu=2,mis=1,r=0,dep=1,og=3,plh=3078630091,tim=1664353484225072
ERROR #139696084476000:err=1558 tim=1664353484225094
STAT #139696084476000 id=1 cnt=0 pid=0 pos=1 obj=0 op='UPDATE  UNDO$ (cr=0 pr=0 pw=0 time=5 us)'
STAT #139696084476000 id=2 cnt=1 pid=1 pos=1 obj=34 op='INDEX UNIQUE SCAN I_UNDO1 (cr=1 pr=0 pw=0 time=6 us)'
KQRCMT: Write failed with error=604 po=0x6ff375d0 cid=3
diagnostics : cid=3 hash=f2114ab9 flag=2a
ORA-00604: error occurred at recursive SQL level 1
ORA-01558: out of transaction ID's in rollback segment SYSTorcl
ORA-00604: error occurred at recursive SQL level 1
ORA-01558: out of transaction ID's in rollback segment SYSTorcl

既然报了事务ID不足,那对block进行分析,确实几乎都达到了数据库设计的理论最大值

  index  state cflags  wrap#    uel         scn            dba            parent-xid    nub     stmt_num
  ------------------------------------------------------------------------------------------------
   0x00    9    0x00  0xfffe  0x0026  0x111f.4d598aae  0x00400188  0x0000.000.00000000  0x00000001    0x00000000
   0x01    9    0x00  0xfffe  0x0054  0x111f.4d598a63  0x00400186  0x0000.000.00000000  0x00000001    0x00000000
   0x02    9    0x00  0xfffe  0x001a  0x111f.4d598a77  0x00400187  0x0000.000.00000000  0x00000001    0x00000000
   0x03    9    0x00  0xfffe  0x005c  0x111f.4d598a7d  0x00400187  0x0000.000.00000000  0x00000001    0x00000000
   0x04    9    0x00  0xfffe  0x003d  0x111f.4d598a4d  0x00400185  0x0000.000.00000000  0x00000001    0x00000000
   0x05    9    0x00  0xfffe  0x0061  0x111f.4d598a74  0x00400186  0x0000.000.00000000  0x00000001    0x00000000
   0x06    9    0x00  0xfffe  0x0002  0x111f.4d598a76  0x00400187  0x0000.000.00000000  0x00000001    0x00000000
   0x07    9    0x00  0xfffe  0x001b  0x111f.4d598a90  0x00400187  0x0000.000.00000000  0x00000001    0x00000000
   0x08    9    0x00  0xfffe  0x0011  0x111f.4d598a8c  0x00400187  0x0000.000.00000000  0x00000001    0x00000000
   0x09    9    0x00  0xfffe  0x0042  0x111f.4d598a6e  0x00400186  0x0000.000.00000000  0x00000001    0x00000000
   0x0a    9    0x00  0xfffe  0x0003  0x111f.4d598a7c  0x00400187  0x0000.000.00000000  0x00000001    0x00000000
   0x0b    9    0x00  0xfffe  0x000a  0x111f.4d598a7b  0x00400187  0x0000.000.00000000  0x00000001    0x00000000
   0x0c    9    0x00  0xfffe  0x0005  0x111f.4d598a73  0x00400186  0x0000.000.00000000  0x00000001    0x00000000
   0x0d    9    0x00  0xfffe  0x005d  0x111f.4d598a87  0x00400187  0x0000.000.00000000  0x00000001    0x00000000
   0x0e    9    0x00  0xfffe  0x0050  0x111f.4d598a68  0x00400186  0x0000.000.00000000  0x00000001    0x00000000
   0x0f    9    0x00  0xfffe  0x0047  0x111f.4d598a6a  0x00400186  0x0000.000.00000000  0x00000001    0x00000000
   0x10    9    0x00  0xfffe  0x0033  0x111f.4d598a95  0x00400187  0x0000.000.00000000  0x00000001    0x00000000
   0x11    9    0x00  0xfffe  0x0031  0x111f.4d598a8d  0x00400187  0x0000.000.00000000  0x00000001    0x00000000
   0x12    9    0x00  0xfffe  0x0020  0x111f.4d598a81  0x00400187  0x0000.000.00000000  0x00000001    0x00000000
   0x13    9    0x00  0xfffe  0x0014  0x111f.4d598a85  0x00400187  0x0000.000.00000000  0x00000001    0x00000000
   0x14    9    0x00  0xfffe  0x000d  0x111f.4d598a86  0x00400187  0x0000.000.00000000  0x00000001    0x00000000
   0x15    9    0x00  0xfffe  0x0013  0x111f.4d598a83  0x00400187  0x0000.000.00000000  0x00000001    0x00000000
   0x16    9    0x00  0xfffe  0x003b  0x111f.4d598aa4  0x00400188  0x0000.000.00000000  0x00000001    0x00000000
   0x17    9    0x00  0xfffe  0x005b  0x111f.4d598a70  0x00400186  0x0000.000.00000000  0x00000001    0x00000000
   0x18    9    0x00  0xfffe  0x001f  0x111f.4d598a8a  0x00400187  0x0000.000.00000000  0x00000001    0x00000000
   0x19    9    0x00  0xfffe  0x0010  0x111f.4d598a94  0x00400187  0x0000.000.00000000  0x00000001    0x00000000
   0x1a    9    0x00  0xfffe  0x001c  0x111f.4d598a79  0x00400187  0x0000.000.00000000  0x00000001    0x00000000
   0x1b    9    0x00  0xfffe  0x0027  0x111f.4d598a91  0x00400187  0x0000.000.00000000  0x00000001    0x00000000
   0x1c    9    0x00  0xfffe  0x000b  0x111f.4d598a7a  0x00400187  0x0000.000.00000000  0x00000001    0x00000000
   0x1d    9    0x00  0xfffe  0x0012  0x111f.4d598a80  0x00400187  0x0000.000.00000000  0x00000001    0x00000000
   0x1e    9    0x00  0xfffe  0x0045  0x111f.4d598aba  0x00400188  0x0000.000.00000000  0x00000001    0x00000000
   0x1f    9    0x00  0xfffe  0x0008  0x111f.4d598a8b  0x00400187  0x0000.000.00000000  0x00000001    0x00000000
   0x20    9    0x00  0xfffe  0x0015  0x111f.4d598a82  0x00400187  0x0000.000.00000000  0x00000001    0x00000000
   0x21    9    0x00  0xfffe  0x0038  0x111f.4d598ab2  0x00400188  0x0000.000.00000000  0x00000001    0x00000000
   0x22    9    0x00  0xfffd  0x003e  0x111f.4d598ab4  0x00400188  0x0000.000.00000000  0x00000001    0x00000000
   0x23    9    0x00  0xfffe  0x0028  0x111f.4d598a9e  0x00400188  0x0000.000.00000000  0x00000001    0x00000000
   0x24    9    0x00  0xfffd  0x0060  0x111f.4d598a5e  0x00400186  0x0000.000.00000000  0x00000001    0x00000000
   0x25    9    0x00  0xfffe  0x0021  0x111f.4d598ab0  0x00400188  0x0000.000.00000000  0x00000001    0x00000000
   0x26    9    0x00  0xfffe  0x0025  0x111f.4d598aaf  0x00400188  0x0000.000.00000000  0x00000001    0x00000000
   0x27    9    0x00  0xfffe  0x0019  0x111f.4d598a93  0x00400187  0x0000.000.00000000  0x00000001    0x00000000
   0x28    9    0x00  0xfffe  0x0046  0x111f.4d598a9f  0x00400188  0x0000.000.00000000  0x00000001    0x00000000
   0x29    9    0x00  0xfffe  0x0023  0x111f.4d598a9d  0x00400188  0x0000.000.00000000  0x00000001    0x00000000
   0x2a    9    0x00  0xfffe  0x002c  0x111f.4d598a98  0x00400187  0x0000.000.00000000  0x00000001    0x00000000
   0x2b    9    0x00  0xfffe  0x0040  0x111f.4d598aa7  0x00400188  0x0000.000.00000000  0x00000001    0x00000000
   0x2c    9    0x00  0xfffe  0x0030  0x111f.4d598a99  0x00400188  0x0000.000.00000000  0x00000001    0x00000000
   0x2d    9    0x00  0xfffe  0x003c  0x111f.4d598aab  0x00400188  0x0000.000.00000000  0x00000001    0x00000000
   0x2e    9    0x00  0xfffe  0x0056  0x111f.4d598abf  0x00400189  0x0000.000.00000000  0x00000001    0x00000000
   0x2f    9    0x00  0xfffe  0x0037  0x111f.4d598aa2  0x00400188  0x0000.000.00000000  0x00000001    0x00000000
   0x30    9    0x00  0xfffe  0x0034  0x111f.4d598a9a  0x00400188  0x0000.000.00000000  0x00000001    0x00000000
   0x31    9    0x00  0xfffe  0x0007  0x111f.4d598a8f  0x00400187  0x0000.000.00000000  0x00000001    0x00000000
   0x32    9    0x00  0xfffe  0x001e  0x111f.4d598ab9  0x00400188  0x0000.000.00000000  0x00000001    0x00000000
   0x33    9    0x00  0xfffe  0x002a  0x111f.4d598a97  0x00400187  0x0000.000.00000000  0x00000001    0x00000000
   0x34    9    0x00  0xfffe  0x0029  0x111f.4d598a9c  0x00400188  0x0000.000.00000000  0x00000001    0x00000000
   0x35    9    0x00  0xfffd  0x005f  0x111f.4d598a5b  0x00400186  0x0000.000.00000000  0x00000001    0x00000000
   0x36    9    0x00  0xfffe  0x0049  0x111f.4d598ac6  0x00400189  0x0000.000.00000000  0x00000001    0x00000000
   0x37    9    0x00  0xfffe  0x0016  0x111f.4d598aa3  0x00400188  0x0000.000.00000000  0x00000001    0x00000000
   0x38    9    0x00  0xfffe  0x0022  0x111f.4d598ab3  0x00400188  0x0000.000.00000000  0x00000001    0x00000000
   0x39    9    0x00  0xfffd  0x0058  0x111f.4d598a52  0x00400186  0x0000.000.00000000  0x00000001    0x00000000
   0x3a    9    0x00  0xfffd  0x0048  0x111f.4d598a57  0x00400186  0x0000.000.00000000  0x00000001    0x00000000
   0x3b    9    0x00  0xfffe  0x002b  0x111f.4d598aa5  0x00400188  0x0000.000.00000000  0x00000001    0x00000000
   0x3c    9    0x00  0xfffe  0x0000  0x111f.4d598aac  0x00400188  0x0000.000.00000000  0x00000001    0x00000000
   0x3d    9    0x00  0xfffd  0x0043  0x111f.4d598a4f  0x00400185  0x0000.000.00000000  0x00000001    0x00000000
   0x3e    9    0x00  0xfffe  0x003f  0x111f.4d598ab6  0x00400188  0x0000.000.00000000  0x00000001    0x00000000
   0x3f    9    0x00  0xfffe  0x0032  0x111f.4d598ab7  0x00400188  0x0000.000.00000000  0x00000001    0x00000000
   0x40    9    0x00  0xfffe  0x005a  0x111f.4d598aa8  0x00400188  0x0000.000.00000000  0x00000001    0x00000000
   0x41    9    0x00  0xfffe  0x004b  0x111f.4d598abc  0x00400188  0x0000.000.00000000  0x00000001    0x00000000
   0x42    9    0x00  0xfffd  0x0017  0x111f.4d598a6f  0x00400186  0x0000.000.00000000  0x00000001    0x00000000
   0x43    9    0x00  0xfffd  0x004d  0x111f.4d598a50  0x00400185  0x0000.000.00000000  0x00000001    0x00000000
   0x44    9    0x00  0xfffe  0x005e  0x111f.4d598ac2  0x00400189  0x0000.000.00000000  0x00000001    0x00000000
   0x45    9    0x00  0xfffe  0x0041  0x111f.4d598abb  0x00400188  0x0000.000.00000000  0x00000001    0x00000000
   0x46    9    0x00  0xfffe  0x002f  0x111f.4d598aa0  0x00400188  0x0000.000.00000000  0x00000001    0x00000000
   0x47    9    0x00  0xfffd  0x0059  0x111f.4d598a6b  0x00400186  0x0000.000.00000000  0x00000001    0x00000000
   0x48    9    0x00  0xfffd  0x0052  0x111f.4d598a58  0x00400186  0x0000.000.00000000  0x00000001    0x00000000
   0x49    9    0x00  0xfffe  0xffff  0x111f.4d598ac7  0x00400189  0x0000.000.00000000  0x00000001    0x00000000
   0x4a    9    0x00  0xfffd  0x003a  0x111f.4d598a56  0x00400186  0x0000.000.00000000  0x00000001    0x00000000
   0x4b    9    0x00  0xfffe  0x002e  0x111f.4d598abe  0x00400188  0x0000.000.00000000  0x00000001    0x00000000
   0x4c    9    0x00  0xfffd  0x0024  0x111f.4d598a5d  0x00400186  0x0000.000.00000000  0x00000001    0x00000000
   0x4d    9    0x00  0xfffd  0x0039  0x111f.4d598a51  0x00400185  0x0000.000.00000000  0x00000001    0x00000000
   0x4e    9    0x00  0xfffd  0x0001  0x111f.4d598a62  0x00400186  0x0000.000.00000000  0x00000001    0x00000000
   0x4f    9    0x00  0xfffd  0x000e  0x111f.4d598a66  0x00400186  0x0000.000.00000000  0x00000001    0x00000000
   0x50    9    0x00  0xfffd  0x000f  0x111f.4d598a69  0x00400186  0x0000.000.00000000  0x00000001    0x00000000
   0x51    9    0x00  0xfffe  0x0036  0x111f.4d598ac5  0x00400189  0x0000.000.00000000  0x00000001    0x00000000
   0x52    9    0x00  0xfffd  0x0035  0x111f.4d598a59  0x00400186  0x0000.000.00000000  0x00000001    0x00000000
   0x53    9    0x00  0xfffd  0x004a  0x111f.4d598a55  0x00400186  0x0000.000.00000000  0x00000001    0x00000000
   0x54    9    0x00  0xfffd  0x0055  0x111f.4d598a64  0x00400186  0x0000.000.00000000  0x00000001    0x00000000
   0x55    9    0x00  0xfffd  0x004f  0x111f.4d598a65  0x00400186  0x0000.000.00000000  0x00000001    0x00000000
   0x56    9    0x00  0xfffe  0x0044  0x111f.4d598ac1  0x00400189  0x0000.000.00000000  0x00000001    0x00000000
   0x57    9    0x00  0xfffd  0x004e  0x111f.4d598a60  0x00400186  0x0000.000.00000000  0x00000001    0x00000000
   0x58    9    0x00  0xfffd  0x0053  0x111f.4d598a53  0x00400186  0x0000.000.00000000  0x00000001    0x00000000
   0x59    9    0x00  0xfffd  0x0009  0x111f.4d598a6c  0x00400186  0x0000.000.00000000  0x00000001    0x00000000
   0x5a    9    0x00  0xfffd  0x002d  0x111f.4d598aa9  0x00400188  0x0000.000.00000000  0x00000001    0x00000000
   0x5b    9    0x00  0xfffd  0x000c  0x111f.4d598a71  0x00400186  0x0000.000.00000000  0x00000001    0x00000000
   0x5c    9    0x00  0xfffd  0x001d  0x111f.4d598a7f  0x00400187  0x0000.000.00000000  0x00000001    0x00000000
   0x5d    9    0x00  0xfffd  0x0018  0x111f.4d598a88  0x00400187  0x0000.000.00000000  0x00000001    0x00000000
   0x5e    9    0x00  0xfffe  0x0051  0x111f.4d598ac3  0x00400189  0x0000.000.00000000  0x00000001    0x00000000
   0x5f    9    0x00  0xfffd  0x004c  0x111f.4d598a5c  0x00400186  0x0000.000.00000000  0x00000001    0x00000000
   0x60    9    0x00  0xfffd  0x0057  0x111f.4d598a5f  0x00400186  0x0000.000.00000000  0x00000001    0x00000000
   0x61    9    0x00  0xfffd  0x0006  0x111f.4d598a75  0x00400187  0x0000.000.00000000  0x00000001    0x00000000

通过bbed修改相关值,规避掉此类问题

m /x 6c000000 offset 6513
m /x 7c000000 offset 6752
m /x 8c000000 offset 7048

启动数据库成功
20220928211705


ORA-600 3417故障处理

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:ORA-600 3417故障处理

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

数据库突然报ORA-600 3417错误

Mon Sep 26 06:42:51 2022
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl2\trace\orcl2_lgwr_7984.trc  (incident=176185):
ORA-00600: 内部错误代码, 参数: [3417], [3], [0], [1], [0], [3], [3], [], [], [], [], []
Incident details in: D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl2\incident\incdir_176185\orcl2_lgwr_7984_i176185.trc
Mon Sep 26 06:42:54 2022
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl2\trace\orcl2_lgwr_7984.trc:
ORA-00600: 内部错误代码, 参数: [3417], [3], [0], [1], [0], [3], [3], [], [], [], [], []
LGWR (ospid: 7984): terminating the instance due to error 470

节点2异常之后,节点1由于跨节点实例恢复导致异常

Mon Sep 26 06:44:26 2022
Instance recovery: looking for dead threads
Beginning instance recovery of 1 threads
 Submitted all GCS remote-cache requests
 Post SMON to start 1st pass IR
 Fix write in gcs resources
Reconfiguration complete
 parallel recovery started with 31 processes
Started redo scan
Completed redo scan
 read 887 KB redo, 348 data blocks need recovery
Started redo application at
 Thread 2: logseq 9907, block 1980
Recovery of Online Redo Log: Thread 2 Group 3 Seq 9907 Reading mem 0
  Mem# 0: +DATA/orcl/onlinelog/group_3.265.1078882689
  Mem# 1: +OCR/orcl/onlinelog/group_3.259.1078882689
Completed redo application of 0.32MB
Completed instance recovery at
 Thread 2: logseq 9907, block 3755, scn 231951271
 338 data blocks read, 348 data blocks written, 887 redo k-bytes read
Mon Sep 26 06:44:36 2022
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl1\trace\orcl1_smon_7972.trc  (incident=208205):
ORA-00600: 内部错误代码, 参数: [3417], [3], [0], [1], [0], [3], [3], [], [], [], [], []
Mon Sep 26 06:44:38 2022
Reconfiguration started (old inc 14, new inc 16)
List of instances:
 1 2 (myinst: 1) 
 Global Resource Directory frozen
 Communication channels reestablished
 Master broadcasted resource hash value bitmaps
 Non-local Process blocks cleaned out
Mon Sep 26 06:44:38 2022
 LMS 0: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
Mon Sep 26 06:44:38 2022
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Mon Sep 26 06:44:38 2022
 LMS 1: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
Mon Sep 26 06:44:38 2022
 LMS 2: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
 Set master node info 
 Submitted all remote-enqueue requests
 Dwn-cvts replayed, VALBLKs dubious
 All grantable enqueues granted
ORA-600 occurred during recovery, instance will be terminated
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl1\trace\orcl1_smon_7972.trc:
ORA-00600: 内部错误代码, 参数: [3417], [3], [0], [1], [0], [3], [3], [], [], [], [], []
System state dump requested by (instance=1,osid=7972 (SMON)),summary=[abnormal instance termination].
SMON (ospid: 7972): terminating the instance due to error 600
Mon Sep 26 06:44:43 2022
ORA-1092 : opitsk aborting process
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl1\trace\orcl1_diag_6956_20220926064442.trc:
ORA-00601: ??????
Mon Sep 26 06:44:46 2022
opiodr aborting process unknown ospid (6688) as a result of ORA-1092

再次重启实例无法正常启动报ORA-600 3417错误

Completed: ALTER DATABASE   MOUNT
Mon Sep 26 08:08:34 2022
alter database open
This instance was first to open
Picked broadcast on commit scheme to generate SCNs
Mon Sep 26 08:08:36 2022
LGWR: STARTING ARCH PROCESSES
Mon Sep 26 08:08:36 2022
ARC0 started with pid=39, OS id=5004 
ARC0: Archival started
LGWR: STARTING ARCH PROCESSES COMPLETE
ARC0: STARTING ARCH PROCESSES
Mon Sep 26 08:08:37 2022
ARC1 started with pid=38, OS id=3568 
Mon Sep 26 08:08:37 2022
ARC2 started with pid=41, OS id=3308 
Mon Sep 26 08:08:37 2022
ARC3 started with pid=42, OS id=8180 
Mon Sep 26 08:08:37 2022
ARC4 started with pid=43, OS id=7768 
Mon Sep 26 08:08:37 2022
ARC5 started with pid=44, OS id=4628 
Mon Sep 26 08:08:37 2022
ARC6 started with pid=45, OS id=6920 
Mon Sep 26 08:08:37 2022
ARC7 started with pid=46, OS id=7960 
ARC1: Archival started
ARC2: Archival started
ARC3: Archival started
ARC4: Archival started
ARC5: Archival started
ARC6: Archival started
ARC1: Becoming the 'no FAL' ARCH
ARC1: Becoming the 'no SRL' ARCH
ARC2: Becoming the heartbeat ARCH
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl2\trace\orcl2_lgwr_7924.trc(incident=400186):
ORA-00600: ??????, ??: [3417], [3], [0], [1], [0], [3], [3], [], [], [], [], []
ARC7: Archival started
ARC0: STARTING ARCH PROCESSES COMPLETE
Mon Sep 26 08:08:39 2022
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl2\trace\orcl2_lgwr_7924.trc:
ORA-00600: ??????, ??: [3417], [3], [0], [1], [0], [3], [3], [], [], [], [], []
LGWR (ospid: 7924): terminating the instance due to error 470
Mon Sep 26 08:08:44 2022
ORA-1092 : opitsk aborting process

故障比较明显,数据库两个节点故障之后,启动其中一个节点,已经完成了实例恢复,但是无法open,通过10046进行跟踪发下你open过程没有执行任何语句直接导致win服务异常,然后终止
111


通过进一步分析确认是redo组异常

SQL> select group#,sequence#,status,thread# from v$log;

    GROUP#  SEQUENCE# STATUS              THREAD#
---------- ---------- ---------------- ----------
         1      10837 CURRENT                   1
         2          0 UNUSED                    1
         3       9907 CURRENT                   2
         4          0 UNUSED                    2

进一步查询数据文件是否正常

SQL> set pages 10000
SQL> set numw 16
SQL> SELECT status,
  2  checkpoint_change#,
  3  checkpoint_time,last_change#,
  4  count(*) ROW_NUM
  5  FROM v$datafile
  6  GROUP BY status, checkpoint_change#, checkpoint_time,last_change#
  7  ORDER BY status, checkpoint_change#, checkpoint_time;

STATUS  CHECKPOINT_CHANGE# CHECKPOINT_TIME         LAST_CHANGE#          ROW_NUM

------- ------------------ ------------------- ---------------- ----------------

ONLINE           231971314 2022-09-26 06:44:37        231971314               44

SYSTEM           231971314 2022-09-26 06:44:37        231971314                1



SQL> set numw 16
SQL> col CHECKPOINT_TIME for a40
SQL> set lines 150
SQL> set pages 1000
SQL> SELECT status,
  2  to_char(checkpoint_time,'yyyy-mm-dd hh24:mi:ss') checkpoint_time,FUZZY,chec
kpoint_change#,
  3  count(*) ROW_NUM
  4  FROM v$datafile_header
  5  GROUP BY status, checkpoint_change#, to_char(checkpoint_time,'yyyy-mm-dd hh
24:mi:ss'),fuzzy
  6  ORDER BY status, checkpoint_change#, checkpoint_time;

STATUS  CHECKPOINT_TIME                          FUZ CHECKPOINT_CHANGE#
 ROW_NUM
------- ---------------------------------------- --- ------------------ --------
--------
ONLINE  2022-09-26 06:44:37                      NO           231971314
      45

基于上述情况,对于这个库,可以直接open库,实现数据0丢失

又一例asm disk 加入vg故障

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:又一例asm disk 加入vg故障

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

又一客户把asm disk加入到vg,并且扩容到lv中
asm-disk-vg


通过asm层面查看
20220914191351
20220914191418

ASMCMD> lsdsk
Path
/dev/asmdisk/asm-arch01
/dev/asmdisk/asm-data01
/dev/asmdisk/asm-ocr01
/dev/asmdisk/asm-ocr02
/dev/asmdisk/asm-ocr03
dbsrv2-> ls -l /dev/asm*
/dev/asm:
total 0

/dev/asmdisk:
total 0
lrwxrwxrwx 1 root root 6 Sep 14 17:39 asm-arch01 -> ../sdb
lrwxrwxrwx 1 root root 6 Sep 14 17:35 asm-data01 -> ../sda
lrwxrwxrwx 1 root root 6 Sep 11 09:11 asm-ocr01 -> ../sdc
lrwxrwxrwx 1 root root 6 Sep 11 09:11 asm-ocr02 -> ../sdd
lrwxrwxrwx 1 root root 6 Sep 11 09:11 asm-ocr03 -> ../sde

对于这类情况,由于客户的系统是ext4,根据这个文件系统特性,每隔2G会有一点破坏,最终数据恢复效果看运气,运气好直接通过元数据恢复出来所有数据文件,然后open库,然后不好可能需要底层碎片等,参见类似恢复:
asm disk被加入vg恢复
asm disk 磁盘部分被清空恢复
文件系统重新分区oracle恢复
删除分区 oracle asm disk 恢复
pvcreate asm disk导致asm磁盘组异常恢复
对于使用asm的客户,在对文件系统进行操作时,一定要注意asm disk,别弄错磁盘(把asm disk磁盘给误操作掉了),适当情况下linux平台可以考虑AFD(ASM FILTER DRIVER)