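Rescuing a Database After Logical Corruption on an Active-Active Storage System

Contact: phone/WeChat (+86 17813235971), QQ (107644445)

Author: 惜分飞 © All rights reserved. [Reproduction in any form without the author's consent is prohibited; the author reserves the right to pursue legal action.]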

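The night before my planned vacation, a friend called for help: the Oracle database behind the core HIS system at xx hospital was raising ORA-8103 on many tables, and business could not be processed normally.

A dbv check of the data files found runs of contiguous corrupt blocks.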
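
As a rough sketch of that check, the shell functions below run dbv over a list of data files and print how many pages each one reports as corrupt. They assume that dbv is on the PATH, that the block size is 8192 bytes (DBV_BLOCKSIZE is a hypothetical override), and that the dbv summary contains a "Total Pages Marked Corrupt" line. The function names are made up for illustration.

```shell
#!/bin/sh
# Extract the corrupt-page count from dbv output read on stdin.
# Assumes the summary line looks like "Total Pages Marked Corrupt   : 12".
corrupt_pages() {
    awk -F: '/Total Pages Marked Corrupt/ { gsub(/[ \t]/, "", $2); print $2 }'
}

# Run dbv against each file given as an argument and print "<file> <count>".
# DBV_BLOCKSIZE is a hypothetical override; 8192 is only an assumed default.
scan_datafiles() {
    blocksize=${DBV_BLOCKSIZE:-8192}
    for f in "$@"; do
        # dbv is assumed to write its report to stderr, so merge it into stdout.
        n=$(dbv file="$f" blocksize="$blocksize" 2>&1 | corrupt_pages)
        echo "$f ${n:-unknown}"
    done
}
```

A file that prints a non-zero count, or "unknown" when the summary line is missing, is a candidate for closer inspection.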

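In my experience, errors like these usually point to a problem in the underlying layers. The system log did show a large number of disk errors.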

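The timestamps of these errors roughly matched the time the application reported problems, so the initial suspicion was a hardware or OS fault. Because a large number of tables were raising ORA-8103 and some files had contiguous ranges zeroed out, the extent of the damage could not be determined precisely (zeroed blocks are physical corruption at the database level, whereas ORA-8103 is a logical error that cannot be detected until the table is accessed). We therefore reviewed the customer's hardware environment, backups, and disaster-recovery setup to choose the best approach.
Discussions with the customer and checks of the database turned up the following:
1) The storage uses vendor xx's active-active solution. A storage-level solution like this does not help with this fault: the damage is logical corruption at the LUN level, so the corrupted data is synchronized to both arrays.
2) Database disaster recovery relies on a vendor's CDP replication. The customer analyzed the CDP copy and found that replication had been failing, so this option was essentially unusable as well.
3) Database backups: the storage holding the backups had a failed battery and bad disks, which made write I/O very slow, so the customer had stopped the file-system RMAN backups 3 days earlier. There were also TSM tape-library backups, but on inspection not a single one had ever succeeded.
The fault spreads further
Node 2 clearly had problems, so the plan was to stop the database and cluster on node 2 and see whether node 1 improved. Instead, as soon as CRS on node 2 was stopped, the database on node 1 crashed. Analysis showed that the first few MB of one ASM disk had been zeroed (this had probably happened before CRS was shut down, but nothing had touched the disk header, so the problem had not surfaced). When a node shuts down, the disk header is written; ASM detected the damage and dismounted the disk group on node 1, which brought down the database there.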
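
One hedged way to spot this kind of damage from the OS side is to count the non-zero bytes in the first few MiB of a device. The sketch below uses only dd, tr, and wc. The function name and the 4 MiB default are assumptions for illustration, and a result of 0 only says that the range is zero-filled, not why.

```shell
#!/bin/sh
# Count the non-zero bytes in the first N MiB of a file or block device.
# Prints 0 when the whole range is zero-filled. Read-only: nothing is written.
nonzero_bytes_in_head() {
    dev=$1
    mib=${2:-4}    # assumed default of 4 MiB; adjust as needed
    dd if="$dev" bs=1048576 count="$mib" 2>/dev/null \
        | tr -d '\000' \
        | wc -c \
        | tr -d ' '
}

# Example (hypothetical device path):
#   nonzero_bytes_in_head /dev/mapper/asm_disk1 4
```

A healthy ASM disk carries header data at the start of the device, so a count of 0 for a disk that belongs to a disk group is a strong hint that the header area has been wiped.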

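Current situation:
1) The ASM disk group is damaged (the first few MB of one disk, including its header, are corrupt), so repairing the original database in place is essentially impossible.
2) The CDP data is abnormal and unusable.
3) A full backup taken 4 days earlier was found on one of the database servers.
Recovery approach:
1. The customer provisions new space, and the 4-day-old backup is restored directly to a local file system.
2. Low-level tools are used to analyze the ASM disk group with the damaged disk and try to recover the archived logs and redo (to recover as much data as possible).
3. The restored backup is combined with the recovered archived logs and redo to attempt a complete recovery.
4. Risk: even if the archived logs and redo can be extracted from the damaged ASM disk group, they may themselves be corrupt, in which case recovery to the latest point is impossible and data is lost.
What actually happened:
1. After the failure the previous evening, the customer had added some undo data files, so a plain restore database could not run (the control file listed more data files than the backup set contained).
2. Restoring the 10.2.0.4 RAC database to a single instance then hit ORA-600 kgeade_is_1.
3. After recovery completed, sqlplus worked normally, but PL/SQL Developer and the application got ORA-27092 when accessing the database.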

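In the end, with some luck and a fair amount of effort, the database opened successfully and the application could connect normally. The tables that were damaged in production no longer raise ORA-8103 when queried, and dbv reports the previously corrupt files as clean.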

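A reminder to everyone:
1) Are your database backups actually working? Run recovery drills regularly.
2) Choose a disaster-recovery solution that suits your database, and check or rehearse it regularly.
3) Active-active storage protects against hardware failure, but you still need a separate safeguard against logical corruption at the storage level.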

Additional SQL Statements That Hit ORA-1555 During Database Open

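In a 2015 summary article on ORA-01555 errors commonly encountered during database open (在数据库open过程中常遇到ORA-01555汇总), I listed the SQL statements that can raise ORA-01555 while Oracle opens a database. Two new ones turned up in recent recoveries and are added here.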
select rowcnt,blkcnt,empcnt,avgspc,chncnt,avgrln,nvl(degree,1), nvl(instances,1) from tab$ where obj# = :1

Thu May 09 02:10:27 2019
SMON: enabling cache recovery
ORA-01555 caused by SQL statement below (SQL ID: bqbdby3c400p7, SCN: 0x0000.3e785fc7):
select rowcnt,blkcnt,empcnt,avgspc,chncnt,avgrln,nvl(degree,1), nvl(instances,1) from tab$ where obj# = :1
Errors in file /home/u01/diag/rdbms/orcl/orcl/trace/orcl_ora_15929.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 1
ORA-01555: snapshot too old: rollback segment number 91 with name "_SYSSMU91_1360910548$" too small
Errors in file /home/u01/diag/rdbms/orcl/orcl/trace/orcl_ora_15929.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 1
ORA-01555: snapshot too old: rollback segment number 91 with name "_SYSSMU91_1360910548$" too small
Error 704 happened during db open, shutting down database
USER (ospid: 15929): terminating the instance due to error 704
Instance terminated by USER, pid = 15929
ORA-1092 signalled during: alter database open resetlogs...
opiodr aborting process unknown ospid (15929) as a result of ORA-1092
Thu May 09 02:10:28 2019
ORA-1092 : opitsk aborting process
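
The SCN in the ORA-01555 line is printed as two hex fields, wrap and base. Assuming the classic layout in which the decimal SCN equals wrap * 2^32 + base, a small shell helper (the name scn_to_decimal is made up for this sketch) converts it so that it can be compared with decimal values such as v$datafile_header.checkpoint_change#:

```shell
#!/bin/sh
# Convert an SCN printed as 0xWRAP.BASE (both fields hex) to decimal.
# Assumes SCN = wrap * 2^32 + base.
scn_to_decimal() {
    wrap=${1%%.*}    # text before the dot, e.g. 0x0000
    base=${1#*.}     # text after the dot, e.g. 3e785fc7 (no 0x prefix in the log)
    echo $(( ($wrap << 32) + 0x$base ))
}

# Example, using the value from the alert log above:
#   scn_to_decimal 0x0000.3e785fc7
```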

select obj#,type#,ctime,mtime,stime, status, dataobj#, flags, oid$, spare1, spare2 from obj$ where owner#=:1 and name=:2 and namespace=:3 and remoteowner is null and linkname is null and subname is null

NSA2 started with pid=41, OS id=32571518
ORA-01555 caused by SQL statement below (SQL ID: 3nkd3g3ju5ph1, Query Duration=0 sec, SCN: 0x0005.e4bea784):
select obj#,type#,ctime,mtime,stime, status, dataobj#, flags, oid$, spare1, spare2 from
    obj$ where owner#=:1 and name=:2 and namespace=:3 and remoteowner is null and linkname is null and subname is null
Errors in file /u01/app/oracle/diag/rdbms/xifenfei_std/xifenfei/trace/xifenfei_ora_18939904.trc:
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 2
ORA-01555: snapshot too old: rollback segment number 7 with name "_SYSSMU7_542380376$" too small
Errors in file /u01/app/oracle/diag/rdbms/xifenfei_std/xifenfei/trace/xifenfei_ora_18939904.trc:
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 2
ORA-01555: snapshot too old: rollback segment number 7 with name "_SYSSMU7_542380376$" too small
Error 704 happened during db open, shutting down database
USER (ospid: 18939904): terminating the instance due to error 704
Instance terminated by USER, pid = 18939904
ORA-1092 signalled during: alter database open RESETLOGS...
opiodr aborting process unknown ospid (18939904) as a result of ORA-1092
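
When going through a large alert log, it can help to list every statement that triggered ORA-01555 together with its SQL ID. The following is a minimal sketch that assumes the alert log format shown above, where the statement starts on the line after the "ORA-01555 caused by SQL statement below" marker. The function name is illustrative, and only the first line of a multi-line statement is printed.

```shell
#!/bin/sh
# Print "<sql_id>: <first line of statement>" for each ORA-01555 report.
# Reads the files given as arguments, or stdin when none are given.
ora1555_statements() {
    awk '
        /ORA-01555 caused by SQL statement below/ {
            # "SQL ID: " is 8 characters long; the ID follows it.
            if (match($0, /SQL ID: [0-9a-z]+/)) {
                id = substr($0, RSTART + 8, RLENGTH - 8)
            } else {
                id = "unknown"
            }
            want = 1
            next
        }
        want { print id ": " $0; want = 0 }
    ' "$@"
}

# Example (hypothetical path):
#   ora1555_statements /path/to/alert_orcl.log
```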

ORA-600 kcffo_online_pdb_check: fno_system and ORA-600 kcvfdb_pdb_set_clean_scn: cleanckpt Errors

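While simulating failures on 18c, after a good deal of tinkering I ran into two errors: ORA-600 kcffo_online_pdb_check: fno_system and ORA-600 kcvfdb_pdb_set_clean_scn: cleanckpt. Both are specific to PDBs, are mainly caused by bugs, and are unlikely to appear in a non-CDB environment. They also show that the PDB mechanism makes recovery from abnormal database states more complicated.
The 18c database fails to open with ORA-00603, ORA-01092, and ORA-00600: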

[oracle@ora11g tmp]$ ss
SQL*Plus: Release 18.0.0.0.0 - Production on Sat Apr 20 21:18:09 2019
Version 18.3.0.0.0
Copyright (c) 1982, 2018, Oracle.  All rights reserved.
Connected to:
Oracle Database 18c Enterprise Edition Release 18.0.0.0.0 - Production
Version 18.3.0.0.0
SQL> select * from v$version;
BANNER
--------------------------------------------------------------------------------
BANNER_FULL
--------------------------------------------------------------------------------
BANNER_LEGACY
--------------------------------------------------------------------------------
    CON_ID
----------
Oracle Database 18c Enterprise Edition Release 18.0.0.0.0 - Production
Oracle Database 18c Enterprise Edition Release 18.0.0.0.0 - Production
Version 18.3.0.0.0
Oracle Database 18c Enterprise Edition Release 18.0.0.0.0 - Production
         0
BANNER
--------------------------------------------------------------------------------
BANNER_FULL
--------------------------------------------------------------------------------
BANNER_LEGACY
--------------------------------------------------------------------------------
    CON_ID
----------
SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-00603: ORACLE server session terminated by fatal error
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [], [], [], [], [], [], [], [], [],
[], [], []
Process ID: 55775
Session ID: 135 Serial number: 49652

Alert log output:

Database Characterset is AL32UTF8
2019-04-20T18:20:54.256841+08:00
No Resource Manager plan active
2019-04-20T18:20:56.751241+08:00
replication_dependency_tracking turned off (no async multimaster replication found)
2019-04-20T18:20:57.862516+08:00
Starting background process AQPC
2019-04-20T18:20:58.341991+08:00
AQPC started with pid=45, OS id=55830
2019-04-20T18:21:01.476252+08:00
PDB$SEED(2):Autotune of undo retention is turned on.
2019-04-20T18:21:01.738732+08:00
Pdb PDB$SEED hit error 1157 during open read only (2) and will be closed.
2019-04-20T18:21:01.755310+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcl18c/orcl18c/trace/orcl18c_ora_55775.trc:
ORA-01157: cannot identify/lock data file 5 - see DBWR trace file
ORA-01110: data file 5: '/u01/app/oracle/oradata/ORCL18C/pdbseed/system01.dbf'
PDB$SEED(2):JIT: pid 55775 requesting stop
PDB$SEED(2):Buffer Cache flush deferred for PDB 2
Could not open PDB$SEED error=1157
2019-04-20T18:21:01.887601+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcl18c/orcl18c/trace/orcl18c_ora_55775.trc:
ORA-01157: cannot identify/lock data file 5 - see DBWR trace file
ORA-01110: data file 5: '/u01/app/oracle/oradata/ORCL18C/pdbseed/system01.dbf'
2019-04-20T18:21:03.385503+08:00
PDB1(3):Autotune of undo retention is turned on.
Errors in file /u01/app/oracle/diag/rdbms/orcl18c/orcl18c/trace/orcl18c_p000_55808.trc  (incident=66865) (PDBNAME=CDB$ROOT):
ORA-00600: internal error code, arguments: [kcffo_online_pdb_check: fno_system], [3], [], [], [], [], [], [], [], [], [], []
2019-04-20T18:21:03.682428+08:00
PDB2(4):Autotune of undo retention is turned on.
Incident details in: /u01/app/oracle/diag/rdbms/orcl18c/orcl18c/incident/incdir_66865/orcl18c_p000_55808_i66865.trc
2019-04-20T18:21:12.863880+08:00
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
2019-04-20T18:21:12.879921+08:00
Pdb PDB1 hit error 600 during open read write (5) and will be closed.
2019-04-20T18:21:12.880506+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcl18c/orcl18c/trace/orcl18c_p000_55808.trc:
ORA-00600: internal error code, arguments: [kcffo_online_pdb_check: fno_system], [3], [], [], [], [], [], [], [], [], [], []
PDB1(3):JIT: pid 55808 requesting stop
2019-04-20T18:21:12.915407+08:00
Dumping diagnostic data in directory=[cdmp_20190420182112], requested by (instance=1, osid=55808 (P000)), summary=[incident=66865].
2019-04-20T18:21:12.989890+08:00
PDB1(3):Buffer Cache flush deferred for PDB 3
2019-04-20T18:21:13.004575+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcl18c/orcl18c/trace/orcl18c_p001_55810.trc  (incident=66873) (PDBNAME=CDB$ROOT):
ORA-00600: internal error code, arguments: [kcffo_online_pdb_check: fno_system], [4], [], [], [], [], [], [], [], [], [], []
Incident details in: /u01/app/oracle/diag/rdbms/orcl18c/orcl18c/incident/incdir_66873/orcl18c_p001_55810_i66873.trc
2019-04-20T18:21:17.218642+08:00
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
2019-04-20T18:21:17.222057+08:00
Pdb PDB2 hit error 600 during open read write (5) and will be closed.
2019-04-20T18:21:17.222236+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcl18c/orcl18c/trace/orcl18c_p001_55810.trc:
ORA-00600: internal error code, arguments: [kcffo_online_pdb_check: fno_system], [4], [], [], [], [], [], [], [], [], [], []
2019-04-20T18:21:17.260941+08:00
Dumping diagnostic data in directory=[cdmp_20190420182117], requested by (instance=1, osid=55810 (P001)), summary=[incident=66873].
2019-04-20T18:21:17.262023+08:00
PDB2(4):JIT: pid 55810 requesting stop
PDB2(4):Buffer Cache flush deferred for PDB 4
2019-04-20T18:21:17.352939+08:00
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
2019-04-20T18:21:17.483695+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcl18c/orcl18c/trace/orcl18c_ora_55775.trc  (incident=66857) (PDBNAME=CDB$ROOT):
ORA-00600: internal error code, arguments: [kcffo_online_pdb_check: fno_system], [3], [], [], [], [], [], [], [], [], [], []
Incident details in: /u01/app/oracle/diag/rdbms/orcl18c/orcl18c/incident/incdir_66857/orcl18c_ora_55775_i66857.trc
2019-04-20T18:21:22.612339+08:00
*****************************************************************
An internal routine has requested a dump of selected redo.
This usually happens following a specific internal error, when
analysis of the redo logs will help Oracle Support with the
diagnosis.
It is recommended that you retain all the redo logs generated (by
all the instances) during the past 12 hours, in case additional
redo dumps are required to help with the diagnosis.
*****************************************************************
2019-04-20T18:21:26.062635+08:00
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file /u01/app/oracle/diag/rdbms/orcl18c/orcl18c/trace/orcl18c_ora_55775.trc  (incident=66858) (PDBNAME=CDB$ROOT):
ORA-00600: internal error code, arguments: [], [], [], [], [], [], [], [], [], [], [], []
Incident details in: /u01/app/oracle/diag/rdbms/orcl18c/orcl18c/incident/incdir_66858/orcl18c_ora_55775_i66858.trc
2019-04-20T18:21:26.506644+08:00
Dumping diagnostic data in directory=[cdmp_20190420182126], requested by (instance=1, osid=55775), summary=[incident=66857].
2019-04-20T18:21:30.119381+08:00
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
2019-04-20T18:21:30.119505+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcl18c/orcl18c/trace/orcl18c_ora_55775.trc:
ORA-00600: internal error code, arguments: [], [], [], [], [], [], [], [], [], [], [], []
2019-04-20T18:21:30.119629+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcl18c/orcl18c/trace/orcl18c_ora_55775.trc:
ORA-00600: internal error code, arguments: [], [], [], [], [], [], [], [], [], [], [], []
2019-04-20T18:21:30.119719+08:00
Error 600 happened during db open, shutting down database
Errors in file /u01/app/oracle/diag/rdbms/orcl18c/orcl18c/trace/orcl18c_ora_55775.trc  (incident=66859) (PDBNAME=CDB$ROOT):
ORA-00603: ORACLE server session terminated by fatal error
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [], [], [], [], [], [], [], [], [], [], [], []
Incident details in: /u01/app/oracle/diag/rdbms/orcl18c/orcl18c/incident/incdir_66859/orcl18c_ora_55775_i66859.trc
2019-04-20T18:21:30.346811+08:00
Dumping diagnostic data in directory=[cdmp_20190420182130], requested by (instance=1, osid=55775), summary=[incident=66858].
2019-04-20T18:21:34.551418+08:00
opiodr aborting process unknown ospid (55775) as a result of ORA-603
2019-04-20T18:21:34.720885+08:00
ORA-603 : opitsk aborting process
License high water mark = 4
2019-04-20T18:21:34.754766+08:00
USER (ospid: 55775): terminating the instance due to ORA error 600
2019-04-20T18:21:35.839992+08:00
Instance terminated by USER, pid = 55775

The alert log reports that the file cannot be identified, but the file does exist:

[root@ora11g ~]#
[root@ora11g ~]# ls -l /u01/app/oracle/oradata/ORCL18C/pdbseed/system01.dbf
-rw-r-----. 1 oracle oinstall 283123712 4月  13 23:59 /u01/app/oracle/oradata/ORCL18C/pdbseed/system01.dbf

The key error is ORA-600 kcffo_online_pdb_check: fno_system. A search on MOS shows that it is mainly caused by database bugs, and quite a few are listed.
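
With this many ORA-00600 lines in the alert log, a quick tally of the first argument makes the dominant internal error stand out. The sketch below is a hypothetical helper, not an Oracle tool. It assumes that each error line contains "ORA-00600:" followed by bracketed arguments, as in the log above.

```shell
#!/bin/sh
# Count ORA-00600 lines by their first bracketed argument, most frequent first.
# Reads the files given as arguments, or stdin when none are given.
ora600_summary() {
    awk '
        /ORA-00600:/ {
            # The first bracketed argument identifies the failing internal check.
            if (match($0, /\[[^]]*\]/)) {
                arg = substr($0, RSTART + 1, RLENGTH - 2)
                if (arg == "") arg = "(empty)"
                count[arg]++
            }
        }
        END { for (a in count) print count[a], a }
    ' "$@" | sort -rn
}

# Example (hypothetical path):
#   ora600_summary /path/to/alert_orcl18c.log
```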

Oracle's description indicates that the error is raised when the kcffo_online_pdb_check function fails:

Function kcffo_online_pdb_check  Check if it ok to online the files in a
pluggable database. An error is  signalled if it is not ok. This routine will
grab a file enqueue for the relevant files and save it in the NULL-terminated
array fenqsp. The caller must release these after kcffo_online_pdb() or if an
error occurs.

After manually changing the file status to get past this error, the CDB opened successfully, but the PDB still could not be opened:

SQL> alter database open;
Database altered.
SQL>  alter session set container=PDB1;
Session altered.
SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-00600: internal error code, arguments: [kcvfdb_pdb_set_clean_scn: cleanckpt], [3], [1739494], [38655308813], [2], [], [], [], [], [], [], []

Alert log:

PDB1(3):alter database open
PDB1(3):Autotune of undo retention is turned on.
Errors in file /u01/app/oracle/diag/rdbms/orcl18c/orcl18c/trace/orcl18c_ora_56178.trc  (incident=69185) (PDBNAME=CDB$ROOT):
ORA-00600: internal error code, arguments: [kcvfdb_pdb_set_clean_scn: cleanckpt], [3], [1739494], [38655308813], [2], [], [], [], [], [], [], []
Incident details in: /u01/app/oracle/diag/rdbms/orcl18c/orcl18c/incident/incdir_69185/orcl18c_ora_56178_i69185.trc
2019-04-20T18:31:41.761479+08:00
*****************************************************************
An internal routine has requested a dump of selected redo.
This usually happens following a specific internal error, when
analysis of the redo logs will help Oracle Support with the
diagnosis.
It is recommended that you retain all the redo logs generated (by
all the instances) during the past 12 hours, in case additional
redo dumps are required to help with the diagnosis.
*****************************************************************
2019-04-20T18:31:42.097465+08:00
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Pdb PDB1 hit error 600 during open read write (1) and will be closed.
2019-04-20T18:31:42.097847+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcl18c/orcl18c/trace/orcl18c_ora_56178.trc:
ORA-00600: internal error code, arguments: [kcvfdb_pdb_set_clean_scn: cleanckpt], [3], [1739494], [38655308813], [2], [], [], [], [], [], [], []
PDB1(3):JIT: pid 56178 requesting stop
2019-04-20T18:31:42.098808+08:00
Dumping diagnostic data in directory=[cdmp_20190420183142], requested by (instance=1, osid=56178), summary=[incident=69185].
2019-04-20T18:31:42.138818+08:00
PDB1(3):Buffer Cache flush deferred for PDB 3
PDB1(3):ORA-600 signalled during: alter database open...

The key error here is ORA-600 kcvfdb_pdb_set_clean_scn: cleanckpt. MOS again lists several bugs as likely causes.

Manually modifying the checkpoint SCN of the data files resolved the problem, and the PDB opened successfully.

ORACLE Instance XFF (pid = 18) – Error 600 encountered while recovering transaction

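This is a case in which a single damaged table caused a database to report errors like: ORACLE Instance XFF (pid = 18) – Error 600 encountered while recovering transaction.
A 10.2.0.4 database that had been running normally suddenly started logging the following errors: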

Sun Apr 07 11:07:12 2019
Thread 1 advanced to log sequence 602883 (LGWR switch)
  Current log# 3 seq# 602883 mem# 0: L:\ORADATA\XFF\REDO03.LOG
Sun Apr 07 11:10:38 2019
Thread 1 advanced to log sequence 602884 (LGWR switch)
  Current log# 1 seq# 602884 mem# 0: L:\ORADATA\XFF\REDO01.LOG
Sun Apr 07 11:11:56 2019
Errors in file c:\oracle\product\10.2.0\admin\XFF\udump\XFF_ora_22956.trc:
ORA-00600: 内部错误代码, 参数: [ktspgfb-1], [], [], [], [], [], [], []
Sun Apr 07 11:12:46 2019
Errors in file c:\oracle\product\10.2.0\admin\XFF\udump\XFF_ora_27408.trc:
ORA-00600: 内部错误代码, 参数: [kcbnew_3], [0], [1], [168354056], [], [], [], []
Sun Apr 07 11:13:57 2019
Errors in file c:\oracle\product\10.2.0\admin\XFF\udump\XFF_ora_6632.trc:
ORA-00600: 内部错误代码, 参数: [ktspgfb-1], [], [], [], [], [], [], []

Some time later the following was logged, and the instance crashed:

Tue Apr 09 07:47:35 2019
ORACLE Instance XFF (pid = 18) - Error 600 encountered while recovering transaction (1, 1) on object 113718002.
Tue Apr 09 07:47:35 2019
Errors in file c:\oracle\product\10.2.0\admin\XFF\bdump\XFF_smon_12948.trc:
ORA-00600: internal error code, arguments: [kcbgcur_3], [168454497], [8], [4], [0], [], [], []
Tue Apr 09 07:55:23 2019
Errors in file c:\oracle\product\10.2.0\admin\XFF\bdump\XFF_pmon_22652.trc:
ORA-00474: SMON process terminated with error
Tue Apr 09 07:55:24 2019
PMON: terminating instance due to error 474
Tue Apr 09 07:55:24 2019
Errors in file c:\oracle\product\10.2.0\admin\XFF\bdump\XFF_lgwr_28608.trc:
ORA-00474: SMON process terminated with error
Tue Apr 09 07:55:34 2019
Errors in file c:\oracle\product\10.2.0\admin\XFF\bdump\XFF_psp0_12544.trc:
ORA-00474: SMON process terminated with error
Tue Apr 09 07:55:34 2019
Errors in file c:\oracle\product\10.2.0\admin\XFF\bdump\XFF_j000_5216.trc:
ORA-00474: SMON process terminated with error
Tue Apr 09 07:55:35 2019
Errors in file c:\oracle\product\10.2.0\admin\XFF\bdump\XFF_ckpt_28204.trc:
ORA-00474: SMON process terminated with error
Tue Apr 09 07:55:36 2019
Errors in file c:\oracle\product\10.2.0\admin\XFF\bdump\XFF_mman_9320.trc:
ORA-00474: SMON process terminated with error
Tue Apr 09 07:55:44 2019
Errors in file c:\oracle\product\10.2.0\admin\XFF\bdump\XFF_q002_24384.trc:
ORA-00474: SMON process terminated with error
Tue Apr 09 07:55:53 2019
Errors in file c:\oracle\product\10.2.0\admin\XFF\bdump\XFF_reco_24124.trc:
ORA-00474: SMON process terminated with error

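Based on these errors, the crash initially appeared to be caused by damaged undo. Rebuilding undo cleared the bad undo segments, but the same problem came back once the application was running again. Further analysis confirmed that a damaged object was the real cause: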

SQL> create table XFF.T_XIFENFEI_xff as select * from XFF.T_XIFENFEI;
create table XFF.T_XIFENFEI_xff as select * from XFF.T_XIFENFEI
                                                           *
ERROR at line 1:
ORA-00600: internal error code, arguments: [kcbz_check_objd_typ], [0], [0], [1], [], [], [], []
After disabling the related block object check:
SQL> create table XFF.T_XIFENFEI_xff as select * from XFF.T_XIFENFEI;
create table XFF.T_XIFENFEI_xff as select * from XFF.T_XIFENFEI
                                                           *
ERROR at line 1:
ORA-00600: internal error code, arguments: [ktspScanInit-l1], [], [], [], [],[], [], []

The table clearly has logical corruption, so its rows were extracted with a rowid-based scan:

SQL> create table XFF.T_XIFENFEI_new
  2  as
  3  select * from XFF.T_XIFENFEI where 1=0;
Table created.
SQL> set serveroutput on
SQL> set concat off
SQL> DECLARE
  2   nrows number;
  3   rid rowid;
  4   dobj number;
  5   ROWSPERBLOCK number;
  6  BEGIN
  7   ROWSPERBLOCK:=1000;
  8   nrows:=0;
  9   select data_object_id  into dobj
 10   from dba_objects
 11   where owner = 'XFF'
 12   and object_name = 'T_XIFENFEI'
 13   ;
 14   for i in (select relative_fno, block_id, block_id+blocks-1 totblocks
 15             from dba_extents
 16             where owner = 'XFF'
 17               and segment_name = 'T_XIFENFEI'
 18            order by extent_id)
 19   loop
 20     for br in i.block_id..i.totblocks loop
 21      for j in 1..ROWSPERBLOCK loop
 22      begin
 23        rid := dbms_rowid.ROWID_CREATE(1,dobj,i.relative_fno, br , j-1);
 24        insert into XFF.T_XIFENFEI_NEW
 25        select /*+ ROWID(A) */ *
 26        from XFF.T_XIFENFEI A
 27        where rowid = rid;
 28        if sql%rowcount = 1 then nrows:=nrows+1; end if;
 29        if (mod(nrows,10000)=0) then commit; end if;
 30      exception when others then null;
 31      end;
 32      end loop;
 33    end loop;
 34   end loop;
 35   COMMIT;
 36   dbms_output.put_line('Total rows: '||to_char(nrows));
 37  END;
 38  /
Total rows: 227000
PL/SQL procedure successfully completed.

After that the database was observed to be running normally again, with no more crashes or errors. Recovery complete.

Database Failure Caused by Linux Resource Limits

A case in which a Linux resource limit prevented a database from starting.
Database startup fails with ORA-01157:

SQL> startup
ORACLE instance started.
Total System Global Area 3340451840 bytes
Fixed Size		    2217952 bytes
Variable Size		 1862273056 bytes
Database Buffers	 1459617792 bytes
Redo Buffers		   16343040 bytes
Database mounted.
ORA-01157: cannot identify/lock data file 5 - see DBWR trace file
ORA-01110: data file 5: '/home/oracle/oradata/XIFENFEI.dbf'

This error is usually caused by a missing file or an incorrect path.

The alert log shows:

Sun Apr 07 20:57:03 2019
ALTER DATABASE OPEN
Sun Apr 07 20:57:03 2019
Errors in file /dbdata/oracle/diag/rdbms/orcl/orcl/trace/orcl_dbw0_2681.trc:
ORA-01157: cannot identify/lock data file 5 - see DBWR trace file
ORA-01110: data file 5: '/home/oracle/oradata/XIFENFEI.dbf'
ORA-27092: size of file exceeds file size limit of the process
Additional information: 262144
Additional information: 262145
Errors in file /dbdata/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_2802.trc:
ORA-01157: cannot identify/lock data file 5 - see DBWR trace file
ORA-01110: data file 5: '/home/oracle/oradata/XIFENFEI.dbf'
ORA-1157 signalled during: ALTER DATABASE OPEN...
Sun Apr 07 20:57:04 2019
Errors in file /dbdata/oracle/diag/rdbms/orcl/orcl/trace/orcl_m000_2804.trc  (incident=38578):
ORA-00600: internal error code, arguments: [kcidr_io_check_common_6], [10],
     [/home/oracle/oradata/XIFENFEI.dbf], [8192], [2], [5], [], [], [], [], [], []
ORA-27092: size of file exceeds file size limit of the process

The log reports ORA-27092: size of file exceeds file size limit of the process.
Check the limits configured on the system:

[oracle@XFF ~]$ ulimit -a
core file size          (blocks, -c) unlimited
data seg size           (kbytes, -d) 640000
scheduling priority             (-e) 0
file size               (blocks, -f) 2097152
pending signals                 (-i) 128489
max locked memory       (kbytes, -l) unlimited
max memory size         (kbytes, -m) unlimited
open files                      (-n) 131072
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 10240
cpu time               (seconds, -t) unlimited
max user processes              (-u) 131072
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

The block unit here is normally 1 KB, so the file size limit is 2097152 KB (2 GB).
Check the file:

[oracle@XFF ~]$ ls -l /home/oracle/oradata/XIFENFEI.dbf
-rw-r-----. 1 oracle oinstall 2147491840 Apr  7 19:04 /home/oracle/oradata/XIFENFEI.dbf

The file is 2147491840 bytes, or 2097160 KB, which exceeds the 2097152 KB limit and causes the error.
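
The comparison can be reproduced with a few lines of shell arithmetic. This is only a sketch: check_fsize is a made-up helper, and it assumes the limit is expressed in 1 KB blocks as in the ulimit output above. The two "Additional information" values in the alert log (262144 and 262145) also appear consistent with the limit and the file size counted in 8192-byte blocks, since 262145 * 8192 = 2147491840.

```shell
#!/bin/sh
# Compare a file size in bytes with a file-size limit given in 1 KB blocks
# (the unit that `ulimit -f` is assumed to report here).
check_fsize() {
    size_bytes=$1
    limit_kb=$2
    limit_bytes=$(( limit_kb * 1024 ))
    if [ "$size_bytes" -gt "$limit_bytes" ]; then
        echo "exceeds limit by $(( size_bytes - limit_bytes )) bytes"
    else
        echo "within limit"
    fi
}

# Example, using the values from the ls and ulimit output above:
#   check_fsize 2147491840 2097152
```

With those values the file is over the limit by exactly one 8192-byte block.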

Raise the file size limit (here to 102400000 KB):

[root@XFF ~]# ulimit -f 102400000
[root@XFF ~]# su - oracle
[oracle@XFF ~]$ ulimit -a
core file size          (blocks, -c) unlimited
data seg size           (kbytes, -d) 640000
scheduling priority             (-e) 0
file size               (blocks, -f) 102400000
pending signals                 (-i) 128489
max locked memory       (kbytes, -l) unlimited
max memory size         (kbytes, -m) unlimited
open files                      (-n) 131072
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 10240
cpu time               (seconds, -t) unlimited
max user processes              (-u) 131072
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

Restart the database; it now opens successfully:

SQL> shutdown immediate;
ORA-01109: database not open
Database dismounted.
ORACLE instance shut down.
SQL> startup
ORACLE instance started.
Total System Global Area 3340451840 bytes
Fixed Size		    2217952 bytes
Variable Size		 1862273056 bytes
Database Buffers	 1459617792 bytes
Redo Buffers		   16343040 bytes
Database mounted.
Database opened.

A Recovery Case for a Database in NOARCHIVELOG Mode

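After a storage failure on a database running in NOARCHIVELOG mode, information collected with the Oracle Database Recovery Check script (Oracle数据库异常恢复检查脚本) confirmed that the redo was damaged.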

Thu Mar 28 11:36:13 2019
ALTER DATABASE RECOVER    CONTINUE DEFAULT
Media Recovery Log /u01/app/oracle/flash_recovery_area/ORCL/archivelog/2019_03_28/o1_mf_1_5397869_%u_.arc
Thu Mar 28 11:36:13 2019
Errors with log /u01/app/oracle/flash_recovery_area/ORCL/archivelog/2019_03_28/o1_mf_1_5397869_%u_.arc
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_pr00_17611.trc:
ORA-00308:cannot open archived log
    '/u01/app/oracle/flash_recovery_area/ORCL/archivelog/2019_03_28/o1_mf_1_5397869_%u_.arc'
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3
ORA-308 signalled during: ALTER DATABASE RECOVER    CONTINUE DEFAULT  ...
ALTER DATABASE RECOVER    CONTINUE DEFAULT
Media Recovery Log /u01/app/oracle/flash_recovery_area/ORCL/archivelog/2019_03_28/o1_mf_1_5397869_%u_.arc
Errors with log /u01/app/oracle/flash_recovery_area/ORCL/archivelog/2019_03_28/o1_mf_1_5397869_%u_.arc
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_pr00_17611.trc:
ORA-00308:cannot open archived log
   '/u01/app/oracle/flash_recovery_area/ORCL/archivelog/2019_03_28/o1_mf_1_5397869_%u_.arc'
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3
ORA-308 signalled during: ALTER DATABASE RECOVER    CONTINUE DEFAULT  ...
Thu Mar 28 11:38:44 2019
ALTER DATABASE RECOVER  datafile 5,6
Media Recovery Start
Serial Media Recovery started
Recovery of Online Redo Log: Thread 1 Group 3 Seq 5397870 Reading mem 0
  Mem# 0: /u01/app/oracle/oradata/orcl/redo03.log
ORA-279 signalled during: ALTER DATABASE RECOVER  datafile 5,6  ...
Thu Mar 28 11:39:08 2019
ALTER DATABASE RECOVER    CONTINUE DEFAULT
Media Recovery Log /u01/app/oracle/flash_recovery_area/ORCL/archivelog/2019_03_28/o1_mf_1_5397870_%u_.arc
Errors with log /u01/app/oracle/flash_recovery_area/ORCL/archivelog/2019_03_28/o1_mf_1_5397870_%u_.arc
ORA-308 signalled during: ALTER DATABASE RECOVER    CONTINUE DEFAULT  ...
ALTER DATABASE RECOVER CANCEL
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_17605.trc  (incident=365041):
ORA-00600: internal error code, arguments: [3051], [82], [], [], [], [], [], [], [], [], [], []
ORA-600 signalled during: ALTER DATABASE RECOVER CANCEL ...

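After the consistency checks were bypassed and the database was forced open, a kgegpa error occurred and the instance failed to start: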

Database Characterset is ZHS16GBK
No Resource Manager plan active
Exception[type:SIGSEGV, Address not mapped to object][ADDR:0x319C0CF3] [PC:0x2297740, kgegpa()+40] [flags: 0x0, count:1]
Exception[type:SIGSEGV, Address not mapped to object][ADDR:0x319C0CF3] [PC:0x229596B, kgebse()+279][flags: 0x2, count:2]
Exception[type:SIGSEGV, Address not mapped to object][ADDR:0x319C0CF3] [PC:0x229596B, kgebse()+279][flags: 0x2, count:2]
Thu Mar 28 11:43:15 2019
PMON (ospid: 17939): terminating the instance due to error 397
Instance terminated by PMON, pid = 17939

After the undo involved in the error above was dealt with, starting the database raised ORA-00600 [4193], ORA-00600 [4137], and ORA-00600 [6006]:

Thu Mar 28 11:50:37 2019
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_p001_18267.trc  (incident=373059):
ORA-00600: internal error code, arguments: [6006], [1], [], [], [], [], [], [], [], [], [], []
Incident details in: /u01/app/oracle/diag/rdbms/orcl/orcl/incident/incdir_373059/orcl_p001_18267_i373059.trc
Stopping background process MMON
Trace dumping is performing id=[cdmp_20190328115038]
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_smon_18247.trc  (incident=372995):
ORA-00600: internal error code, arguments: [6006], [1], [], [], [], [], [], [], [], [], [], []
Incident details in: /u01/app/oracle/diag/rdbms/orcl/orcl/incident/incdir_372995/orcl_smon_18247_i372995.trc
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_smon_18247.trc  (incident=372996):
ORA-00600: internal error code, arguments: [4137], [34.22.4206895], [0], [0], [], [], [], [], [], [], [], []
Incident details in: /u01/app/oracle/diag/rdbms/orcl/orcl/incident/incdir_372996/orcl_smon_18247_i372996.trc
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_18263.trc  (incident=373044):
ORA-00600: internal error code, arguments: [4193], [], [], [], [], [], [], [], [], [], [], []
Incident details in: /u01/app/oracle/diag/rdbms/orcl/orcl/incident/incdir_373044/orcl_ora_18263_i373044.trc
ORACLE Instance orcl (pid = 16) - Error 600 encountered while recovering transaction (34, 22).
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_smon_18247.trc:
ORA-00600: internal error code, arguments: [4137], [34.22.4206895], [0], [0], [], [], [], [], [], [], [], []

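After undo was rebuilt, these errors disappeared. The data was then scheduled for export so that the database could be rebuilt.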

Recovering from ORA-00322 and ORA-00312

The database fails to start, reporting ORA-00322 and ORA-00312:

Fri Mar 29 17:44:20 2019
ALTER DATABASE RECOVER  datafile 1
Media Recovery Start
Serial Media Recovery started
Media Recovery failed with error 19909
ORA-283 signalled during: ALTER DATABASE RECOVER  datafile 1  ...
Fri Mar 29 17:44:20 2019
Errors in file c:\app\administrator\diag\rdbms\xff\xff\trace\xff_m000_5392.trc:
ORA-00322: log 1 of thread 1 is not current copy
ORA-00312: online log 1 thread 1: 'D:\APP\ADMINISTRATOR\ORADATA\xff\REDO01.LOG'
Errors in file c:\app\administrator\diag\rdbms\xff\xff\trace\xff_m000_5392.trc:
ORA-00322: log 2 of thread 1 is not current copy
ORA-00312: online log 2 thread 1: 'D:\APP\ADMINISTRATOR\ORADATA\xff\REDO02.LOG'
Errors in file c:\app\administrator\diag\rdbms\xff\xff\trace\xff_m000_5392.trc:
ORA-00322: log 3 of thread 1 is not current copy
ORA-00312: online log 3 thread 1: 'D:\APP\ADMINISTRATOR\ORADATA\xff\REDO03.LOG'

Applying the redo manually by specifying the log file raises ORA-00600 [3051]:

Fri Mar 29 17:56:33 2019
ALTER DATABASE RECOVER  datafile 1
Media Recovery Start
Serial Media Recovery started
Recovery of Online Redo Log: Thread 1 Group 2 Seq 27542 Reading mem 0
  Mem# 0: D:\XIFENFEI\REDO02.LOG
ORA-279 signalled during: ALTER DATABASE RECOVER  datafile 1  ...
Fri Mar 29 17:56:49 2019
ALTER DATABASE RECOVER    LOGFILE 'D:\xifenfei\REDO02.log'
Media Recovery Log D:\xifenfei\REDO02.log
Errors with log D:\xifenfei\REDO02.log
ORA-363 signalled during: ALTER DATABASE RECOVER    LOGFILE 'D:\xifenfei\REDO02.log'  ...
ALTER DATABASE RECOVER CANCEL
Errors in file c:\app\administrator\diag\rdbms\xff\xff\trace\xff_ora_8532.trc  (incident=147928):
ORA-00600: internal error code, arguments: [3051], [82], [], [], [], [], [], [], [], [], [], []
Incident details in: c:\app\administrator\diag\rdbms\xff\xff\incident\incdir_147928\xff_ora_8532_i147928.trc

Clearly the redo cannot be applied normally, so the database was force-opened by bypassing the consistency checks:

Recovery of Online Redo Log: Thread 1 Group 1 Seq 1 Reading mem 0
  Mem# 0: D:\XIFENFEI\REDO01.LOG
Block recovery stopped at EOT rba 1.76.16
Block recovery completed at rba 1.76.16, scn 0.1073742057
Doing block recovery for file 3 block 272
Resuming block recovery (PMON) for file 3 block 272
Block recovery from logseq 1, block 72 to scn 1073742051
Recovery of Online Redo Log: Thread 1 Group 1 Seq 1 Reading mem 0
  Mem# 0: D:\XIFENFEI\REDO01.LOG
Block recovery completed at rba 1.72.16, scn 0.1073742052
Errors in file c:\app\administrator\diag\rdbms\xff\xff\trace\xff_smon_5144.trc:
ORA-01595: error freeing extent (16) of rollback segment (10))
ORA-00600: internal error code, arguments: [4194], [], [], [], [], [], [], [], [], [], [], []
Fri Mar 29 17:59:12 2019
Errors in file c:\app\administrator\diag\rdbms\xff\xff\trace\xff_mmon_13928.trc  (incident=149097):
ORA-00600: internal error code, arguments: [4194], [], [], [], [], [], [], [], [], [], [], []
Incident details in: c:\app\administrator\diag\rdbms\xff\xff\incident\incdir_149097\xff_mmon_13928_i149097.trc
Fri Mar 29 17:59:12 2019
Trace dumping is performing id=[cdmp_20190329175912]
Completed: alter database open resetlogs

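After undo was rebuilt, the database opened normally. The data was then exported and imported into a new database, which completed the recovery.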

DUL supports Oracle 19C

Author: 惜分飞 © All rights reserved [No reproduction in any form without my consent; otherwise I reserve the right to pursue legal liability.]

Oracle 19C

[oracle@localhost ~]$ ss
SQL*Plus: Release 19.0.0.0.0 - Production on Sat Mar 2 07:02:18 2019
Version 19.2.0.0.0
Copyright (c) 1982, 2018, Oracle.  All rights reserved.
Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.2.0.0.0
SQL> select name from v$datafile where rownum<3;
NAME
--------------------------------------------------------------------------------
/u01/app/oracle/oradata/ORA19C/system01.dbf
/u01/app/oracle/oradata/ORA19C/sysaux01.dbf

DUL supports 19C

[root@localhost dul]# ./dul
Data UnLoader: 11.3.0.0.2 - Internal Only - on Sat Mar  2 08:26:11 2019
with 64-bit io functions and the decompression option
Copyright (c) 1994 2018 Bernard van Duijnen All rights reserved.
 Strictly Oracle Internal Use Only
DUL: Warning: Recreating file "dul.log"
Found db_id = 1054612630
Found db_name = ORA19C
DUL>  bootstrap;
DUL: Warning: Recreating file "dict.ddl"
Generating dict.ddl for version 12
 OBJ$: segobjno 18, file 1 block 240
 TAB$: segobjno 2, tabno 1, file 1  block 144
 COL$: segobjno 2, tabno 5, file 1  block 144
 USER$: segobjno 10, tabno 1, file 1  block 208
 TABPART$: segobjno 814, file 1 block 5424
 INDPART$: segobjno 819, file 1 block 5464
 TABCOMPART$: segobjno 836, file 1 block 5600
 INDCOMPART$: segobjno 841, file 1 block 5640
 TABSUBPART$: segobjno 826, file 1 block 5520
 INDSUBPART$: segobjno 831, file 1 block 5560
 IND$: segobjno 2, tabno 3, file 1  block 144
 ICOL$: segobjno 2, tabno 4, file 1  block 144
 LOB$: segobjno 2, tabno 6, file 1  block 144
 COLTYPE$: segobjno 2, tabno 7, file 1  block 144
 TYPE$: segobjno 740, tabno 1, file 1  block 4888
 COLLECTION$: segobjno 740, tabno 2, file 1  block 4888
 ATTRIBUTE$: segobjno 740, tabno 3, file 1  block 4888
 LOBFRAG$: segobjno 847, file 1 block 5688
 LOBCOMPPART$: segobjno 850, file 1 block 5720
 UNDO$: segobjno 15, file 1 block 224
 TS$: segobjno 6, tabno 2, file 1  block 176
 PROPS$: segobjno 127, file 1 block 1320
Running generated file "@dict.ddl" to unload the dictionary tables
. unloading table                      OBJ$
DUL: Warning: Recreating file "OBJ.ctl"
   72388 rows unloaded
. unloading table                      TAB$
DUL: Warning: Recreating file "TAB.ctl"
    2218 rows unloaded
. unloading table                      COL$
DUL: Warning: Recreating file "COL.ctl"
  123175 rows unloaded
. unloading table                     USER$
DUL: Warning: Recreating file "USER.ctl"
     127 rows unloaded
. unloading table                  TABPART$
DUL: Warning: Recreating file "TABPART.ctl"
     294 rows unloaded
. unloading table                  INDPART$
DUL: Warning: Recreating file "INDPART.ctl"
     195 rows unloaded
. unloading table               TABCOMPART$
DUL: Warning: Recreating file "TABCOMPART.ctl"
       1 row  unloaded
. unloading table               INDCOMPART$
DUL: Warning: Recreating file "INDCOMPART.ctl"
       0 rows unloaded
. unloading table               TABSUBPART$
DUL: Warning: Recreating file "TABSUBPART.ctl"
      32 rows unloaded
. unloading table               INDSUBPART$
DUL: Warning: Recreating file "INDSUBPART.ctl"
       0 rows unloaded
. unloading table                      IND$
DUL: Warning: Recreating file "IND.ctl"
    2878 rows unloaded
. unloading table                     ICOL$
DUL: Warning: Recreating file "ICOL.ctl"
    4958 rows unloaded
. unloading table                      LOB$
DUL: Warning: Recreating file "LOB.ctl"
     678 rows unloaded
. unloading table                  COLTYPE$
DUL: Warning: Recreating file "COLTYPE.ctl"
    2999 rows unloaded
. unloading table                     TYPE$
DUL: Warning: Recreating file "TYPE.ctl"
    5895 rows unloaded
. unloading table               COLLECTION$
DUL: Warning: Recreating file "COLLECTION.ctl"
    1384 rows unloaded
. unloading table                ATTRIBUTE$
DUL: Warning: Recreating file "ATTRIBUTE.ctl"
   15365 rows unloaded
. unloading table                  LOBFRAG$
DUL: Warning: Recreating file "LOBFRAG.ctl"
      14 rows unloaded
. unloading table              LOBCOMPPART$
DUL: Warning: Recreating file "LOBCOMPPART.ctl"
       0 rows unloaded
. unloading table                     UNDO$
DUL: Warning: Recreating file "UNDO.ctl"
      21 rows unloaded
. unloading table                       TS$
DUL: Warning: Recreating file "TS.ctl"
       6 rows unloaded
. unloading table                    PROPS$
DUL: Warning: Recreating file "PROPS.ctl"
      42 rows unloaded
Reading USER.dat 127 entries loaded
Reading OBJ.dat 72388 entries loaded and sorted 72388 entries
Reading TAB.dat 2201 entries loaded
Reading COL.dat
DUL: Notice: Increased the size of DC_COLUMNS from 100000 to 132768 entries
 123148 entries loaded and sorted 123148 entries
Reading TABPART.dat 294 entries loaded and sorted 294 entries
Reading TABCOMPART.dat 1 entries loaded and sorted 1 entries
Reading TABSUBPART.dat 32 entries loaded and sorted 32 entries
Reading INDPART.dat 195 entries loaded and sorted 195 entries
Reading INDCOMPART.dat 0 entries loaded and sorted 0 entries
Reading INDSUBPART.dat 0 entries loaded and sorted 0 entries
Reading IND.dat 2878 entries loaded
Reading LOB.dat 678 entries loaded
Reading ICOL.dat 4958 entries loaded
Reading COLTYPE.dat 2999 entries loaded
Reading TYPE.dat 5895 entries loaded
Reading ATTRIBUTE.dat 15365 entries loaded
Reading COLLECTION.dat 1384 entries loaded
Reading BOOTSTRAP.dat 60 entries loaded
Reading LOBFRAG.dat 14 entries loaded and sorted 14 entries
Reading LOBCOMPPART.dat 0 entries loaded and sorted 0 entries
Reading UNDO.dat 21 entries loaded
Reading TS.dat 6 entries loaded
Reading PROPS.dat 42 entries loaded
DUL> desc sys.tab$;
Table SYS.TAB$
obj#= 4, dataobj#= 2, ts#= 0, file#= 1, block#=144
      tab#= 1, segcols= 45, clucols= 1
Column information:
icol# 01 segcol# 01         OBJ# len   22 type  2 NUMBER(0)
icol# 02 segcol# 02     DATAOBJ# len   22 type  2 NUMBER(0)
icol# 03 segcol# 03          TS# len   22 type  2 NUMBER(0)
icol# 04 segcol# 04        FILE# len   22 type  2 NUMBER(0)
icol# 05 segcol# 05       BLOCK# len   22 type  2 NUMBER(0)
icol# 06 segcol# 06        BOBJ# len   22 type  2 NUMBER(0)
icol# 07 segcol# 07         TAB# len   22 type  2 NUMBER(0)
icol# 08 segcol# 08         COLS len   22 type  2 NUMBER(0)
icol# 09 segcol# 09      CLUCOLS len   22 type  2 NUMBER(0)
icol# 10 segcol# 10     PCTFREE$ len   22 type  2 NUMBER(0)
icol# 11 segcol# 11     PCTUSED$ len   22 type  2 NUMBER(0)
icol# 12 segcol# 12     INITRANS len   22 type  2 NUMBER(0)
icol# 13 segcol# 13     MAXTRANS len   22 type  2 NUMBER(0)
icol# 14 segcol# 14        FLAGS len   22 type  2 NUMBER(0)
icol# 15 segcol# 15       AUDIT$ len   38 type  1 VARCHAR2 cs 852(ZHS16GBK)
icol# 16 segcol# 16       ROWCNT len   22 type  2 NUMBER(0)
icol# 17 segcol# 17       BLKCNT len   22 type  2 NUMBER(0)
icol# 18 segcol# 18       EMPCNT len   22 type  2 NUMBER(0)
icol# 19 segcol# 19       AVGSPC len   22 type  2 NUMBER(0)
icol# 20 segcol# 20       CHNCNT len   22 type  2 NUMBER(0)
icol# 21 segcol# 21       AVGRLN len   22 type  2 NUMBER(0)
icol# 22 segcol# 22   AVGSPC_FLB len   22 type  2 NUMBER(0)
icol# 23 segcol# 23       FLBCNT len   22 type  2 NUMBER(0)
icol# 24 segcol# 24  ANALYZETIME len    7 type 12 DATE
icol# 25 segcol# 25   SAMPLESIZE len   22 type  2 NUMBER(0)
icol# 26 segcol# 26       DEGREE len   22 type  2 NUMBER(0)
icol# 27 segcol# 27    INSTANCES len   22 type  2 NUMBER(0)
icol# 28 segcol# 28      INTCOLS len   22 type  2 NUMBER(0)
icol# 29 segcol# 29   KERNELCOLS len   22 type  2 NUMBER(0)
icol# 30 segcol# 30     PROPERTY len   22 type  2 NUMBER(0)
icol# 31 segcol# 31     TRIGFLAG len   22 type  2 NUMBER(0)
icol# 32 segcol# 32       SPARE1 len   22 type  2 NUMBER(0)
icol# 33 segcol# 33       SPARE2 len   22 type  2 NUMBER(0)
icol# 34 segcol# 34       SPARE3 len   22 type  2 NUMBER(0)
icol# 35 segcol# 35       SPARE4 len 1000 type  1 VARCHAR2 cs 852(ZHS16GBK)
icol# 36 segcol# 36       SPARE5 len 1000 type  1 VARCHAR2 cs 852(ZHS16GBK)
icol# 37 segcol# 37       SPARE6 len    7 type 12 DATE
icol# 38 segcol# 38       SPARE7 len   22 type  2 NUMBER(0)
icol# 39 segcol# 39       SPARE8 len   22 type  2 NUMBER(0)
icol# 40 segcol# 40       SPARE9 len 1000 type  1 VARCHAR2 cs 852(ZHS16GBK)
icol# 41 segcol# 41      SPARE10 len 1000 type  1 VARCHAR2 cs 852(ZHS16GBK)
icol# 42 segcol# 42    ACDRFLAGS len   22 type  2 NUMBER(0)
icol# 43 segcol# 43   ACDRTSOBJ# len   22 type  2 NUMBER(0)
icol# 44 segcol# 44 ACDRDEFAULTTIME len   11 type 180 TIMESTAMP(9)
icol# 45 segcol# 45 ACDRROWTSINTCOL# len   22 type  2 NUMBER(0)

Errors you may encounter

Reading TAB.dat
DUL: Error: string2ub8(618970019642690137449563136), Conversion to number (ub8) overflowed
DUL: Error: Number conversion error in file TAB.dat, line 22
DUL: Warning: Ignoring file TAB.dat cache
Reading COL.dat
DUL: Error: string2ub8(73786976294838206464), Conversion to number (ub8) overflowed
DUL: Error: Number conversion error in file COL.dat, line 114321
DUL: Warning: Ignoring file COL.dat cache
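
The two values in these messages can be checked against the range of an unsigned 8-byte integer. The sketch below is only an illustration in Python and assumes that `ub8` means an unsigned 8-byte integer, as the name suggests. The names `UB8_MAX`, `reported`, and `fits_ub8` are hypothetical.

```python
# Assumption: ub8 is an unsigned 8-byte integer, so its maximum is 2**64 - 1.
UB8_MAX = 2**64 - 1

# Values copied verbatim from the DUL error messages above.
reported = {
    "TAB.dat line 22": 618970019642690137449563136,
    "COL.dat line 114321": 73786976294838206464,
}


def fits_ub8(value):
    """True if value can be stored in an unsigned 8-byte integer."""
    return 0 <= value <= UB8_MAX


for where, value in reported.items():
    print(f"{where}: needs {value.bit_length()} bits, fits ub8: {fits_ub8(value)}")
```

Both numbers need more than 64 bits, so the overflow comes from the size of the stored values themselves; nothing in the messages suggests the `.dat` files are damaged.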

ORA-600 kcbzib_kcrsds_1 error

Author: 惜分飞 © All rights reserved [No reproduction in any form without my consent; otherwise I reserve the right to pursue legal liability.]

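Database version
12.2
After the customer's storage failure was repaired, several databases were force-opened with the hidden parameter _allow_resetlogs_corruption, and all of them hit a similar error: ORA-600 kcbzib_kcrsds_1. This error generally results from force-opening a database that is in an inconsistent state.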

2019-02-23T01:25:43.125621+08:00
 alter database open resetlogs
2019-02-23T01:25:43.231990+08:00
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
RESETLOGS after incomplete recovery UNTIL CHANGE 149251865354 time
Clearing online redo logfile 1 +DG_XFF/xifenfei/ONLINELOG/group_1.258.983824407
Clearing online redo logfile 2 +DG_XFF/xifenfei/ONLINELOG/group_2.259.983824409
Clearing online redo logfile 3 +DG_XFF/xifenfei/ONLINELOG/group_3.266.983825461
Clearing online redo logfile 4 +DG_XFF/xifenfei/ONLINELOG/group_4.267.983825461
Clearing online log 1 of thread 1 sequence number 20749
Clearing online log 2 of thread 1 sequence number 20750
Clearing online log 3 of thread 2 sequence number 1371
Clearing online log 4 of thread 2 sequence number 1372
2019-02-23T01:25:44.669890+08:00
ALTER SYSTEM SET remote_listener=' xifenfeidb-cluster-scan:1521' SCOPE=MEMORY SID='xifenfei2';
2019-02-23T01:25:44.671436+08:00
ALTER SYSTEM SET listener_networks='' SCOPE=MEMORY SID='xifenfei2';
2019-02-23T01:25:46.990077+08:00
Clearing online redo logfile 1 complete
Clearing online redo logfile 2 complete
Clearing online redo logfile 3 complete
Clearing online redo logfile 4 complete
Resetting resetlogs activation ID 3002369299 (0xb2f48513)
Online log +DG_XFF/xifenfei/ONLINELOG/group_1.258.983824407: Thread 1 Group 1 was previously cleared
Online log +DG_XFF/xifenfei/ONLINELOG/group_2.259.983824409: Thread 1 Group 2 was previously cleared
Online log +DG_XFF/xifenfei/ONLINELOG/group_3.266.983825461: Thread 2 Group 3 was previously cleared
Online log +DG_XFF/xifenfei/ONLINELOG/group_4.267.983825461: Thread 2 Group 4 was previously cleared
2019-02-23T01:25:47.137701+08:00
Setting recovery target incarnation to 2
2019-02-23T01:25:47.152393+08:00
This instance was first to open
Ping without log force is disabled:
  not an Exadata system.
Picked broadcast on commit scheme to generate SCNs
Endian type of dictionary set to little
2019-02-23T01:25:47.597502+08:00
Assigning activation ID 3019587675 (0xb3fb405b)
2019-02-23T01:25:47.625734+08:00
TT00: Gap Manager starting (PID:22467)
2019-02-23T01:25:47.910026+08:00
Thread 2 opened at log sequence 1
  Current log# 3 seq# 1 mem# 0: +DG_XFF/xifenfei/ONLINELOG/group_3.266.983825461
Successful open of redo thread 2
2019-02-23T01:25:47.911069+08:00
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
2019-02-23T01:25:47.971709+08:00
Sleep 5 seconds and then try to clear SRLs in 2 time(s)
2019-02-23T01:25:48.065008+08:00
start recovery: pdb 0, passed in flags x10 (domain enable 0)
2019-02-23T01:25:48.065177+08:00
Instance recovery: looking for dead threads
Instance recovery: lock domain invalid but no dead threads
validate pdb 0, flags x10, valid 0, pdb flags x84
* validated domain 0, flags = 0x80
Instance recovery complete: valid 1 (flags x10, recovery domain flags x80)
2019-02-23T01:25:48.803746+08:00
Errors in file /oracle/base/oracle/diag/rdbms/xifenfei/xifenfei2/trace/xifenfei2_ora_21292.trc  (incident=128552):
ORA-00600: internal error code, arguments: [kcbzib_kcrsds_1], [], [], [], [], [], [], [], [], [], [], []
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
2019-02-23T01:25:49.947062+08:00
*****************************************************************
An internal routine has requested a dump of selected redo.
This usually happens following a specific internal error, when
analysis of the redo logs will help Oracle Support with the
diagnosis.
It is recommended that you retain all the redo logs generated (by
all the instances) during the past 12 hours, in case additional
redo dumps are required to help with the diagnosis.
*****************************************************************
2019-02-23T01:25:50.334684+08:00
Errors in file /oracle/base/oracle/diag/rdbms/xifenfei/xifenfei2/trace/xifenfei2_ora_21292.trc:
ORA-00600: internal error code, arguments: [kcbzib_kcrsds_1], [], [], [], [], [], [], [], [], [], [], []
2019-02-23T01:25:50.334880+08:00
Errors in file /oracle/base/oracle/diag/rdbms/xifenfei/xifenfei2/trace/xifenfei2_ora_21292.trc:
ORA-00600: internal error code, arguments: [kcbzib_kcrsds_1], [], [], [], [], [], [], [], [], [], [], []
Error 600 happened during db open, shutting down database
2019-02-23T01:25:50.362808+08:00
Errors in file /oracle/base/oracle/diag/rdbms/xifenfei/xifenfei2/trace/xifenfei2_ora_21292.trc  (incident=128553):
ORA-00603: ORACLE server session terminated by fatal error
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [kcbzib_kcrsds_1], [], [], [], [], [], [], [], [], [], [], []
2019-02-23T01:25:51.521133+08:00
opiodr aborting process unknown ospid (21292) as a result of ORA-603

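In addition, once this error has occurred, a further recovery attempt reports errors like the following. They arise because the control file was damaged while the database was being opened.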

2019-02-23T22:52:39.390966+08:00
ALTER DATABASE RECOVER  database
2019-02-23T22:52:39.391125+08:00
Media Recovery Start
 Started logmerger process
2019-02-23T22:52:39.471904+08:00
Media Recovery failed with error 16433
2019-02-23T22:52:39.996235+08:00
Errors in file /oracle/base/oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_m000_1593.trc:
ORA-01110: data file 11: '+DG_XFF/xifenfei/DATAFILE/ls_zh.274.984235699'
2019-02-23T22:52:40.224440+08:00
Errors in file /oracle/base/oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_m000_1593.trc:
ORA-01110: data file 12: '+DG_XFF/xifenfei/DATAFILE/sf_zh.275.984235715'
2019-02-23T22:52:40.459606+08:00
Errors in file /oracle/base/oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_m000_1593.trc:
ORA-01110: data file 13: '+DG_XFF/xifenfei/DATAFILE/tj_gl.276.984235729'
2019-02-23T22:52:40.574563+08:00
Recovery Slave PR00 previously exited with exception 283
ORA-283 signalled during: ALTER DATABASE RECOVER  database  ...

ORA-01205: not a data file – type number in header is 0

Author: 惜分飞 © All rights reserved [No reproduction in any form without my consent; otherwise I reserve the right to pursue legal liability.]

The database opened successfully, but an abnormal undo file caused ORA-00376:

Mon Feb 04 07:43:18 中国标准时间 2019
Completed: alter database open
Mon Feb 04 07:43:18 中国标准时间 2019
ALTER SYSTEM disable restricted session;
Mon Feb 04 07:43:19 中国标准时间 2019
ORA-376 encountered when generating server alert SMG-4120
Mon Feb 04 07:43:19 中国标准时间 2019
Errors in file d:\oracle\xff\xff\background\xff_cjq0_5428.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-00376: file 2 cannot be read at this time
ORA-01110: data file 2: 'K:\ORACLE\SAP\SAPDATA1\UNDO_1\UNDO.DATA1'

Checking the file status
[Image: wrong-file-type]

Errors after recovery
The database cannot open normally and reports ORA-07445 _intel_fast_memset.A and ORA-01205:

Wed Feb 06 12:51:01 中国标准时间 2019
ALTER DATABASE RECOVER  datafile 2
Wed Feb 06 12:51:01 中国标准时间 2019
Media Recovery Start
Read of rdba: 0x00800001 (file 2, block 1) failed with ORA-01205.
Trying reread from disk.
Reread of rdba: 0x00800001 (file 2, block 1) failed with ORA-01205
Wed Feb 06 12:51:01 中国标准时间 2019
Errors in file d:\oracle\xff\xff\usertrace\xff_ora_6988.trc:
ORA-07445: exception encountered: core dump [ACCESS_VIOLATION] [_intel_fast_memset.A+44]
                                    [PC:0x356EA7C] [ADDR:0x7F20000] [UNABLE_TO_WRITE] []
ORA-01205: not a data file - type number in header is 0
Wed Feb 06 12:52:14 中国标准时间 2019
alter database open
Wed Feb 06 12:52:15 中国标准时间 2019
LGWR: STARTING ARCH PROCESSES
ARC0 started with pid=17, OS id=6124
Wed Feb 06 12:52:15 中国标准时间 2019
ARC0: Archival started
ARC1: Archival started
LGWR: STARTING ARCH PROCESSES COMPLETE
ARC1 started with pid=18, OS id=7044
Wed Feb 06 12:52:15 中国标准时间 2019
Thread 1 opened at log sequence 1370107
  Current log# 2 seq# 1370107 mem# 0: D:\ORACLE\xff\ORIGLOGB\LOG_G12M1.DBF
  Current log# 2 seq# 1370107 mem# 1: E:\ORACLE\xff\MIRRLOGB\LOG_G12M2.DBF
Successful open of redo thread 1
Wed Feb 06 12:52:15 中国标准时间 2019
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Wed Feb 06 12:52:15 中国标准时间 2019
ARC0: STARTING ARCH PROCESSES
Wed Feb 06 12:52:15 中国标准时间 2019
ARC1: Becoming the 'no FAL' ARCH
ARC1: Becoming the 'no SRL' ARCH
Wed Feb 06 12:52:15 中国标准时间 2019
ARC2: Archival started
ARC0: STARTING ARCH PROCESSES COMPLETE
ARC0: Becoming the heartbeat ARCH
ARC2 started with pid=19, OS id=3772
Wed Feb 06 12:52:15 中国标准时间 2019
SMON: enabling cache recovery
Wed Feb 06 12:52:15 中国标准时间 2019
Read of rdba: 0x00800001 (file 2, block 1) failed with ORA-01205.
Trying reread from disk.
Reread of rdba: 0x00800001 (file 2, block 1) failed with ORA-01205
Wed Feb 06 12:52:15 中国标准时间 2019
Errors in file d:\oracle\xff\xff\usertrace\xff_ora_5036.trc:
ORA-07445: exception encountered: core dump [ACCESS_VIOLATION] [_intel_fast_memset.A+44]
                                       [PC:0x356EA7C] [ADDR:0xD5F0000] [UNABLE_TO_WRITE] []
ORA-01205: not a data file - type number in header is 0
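
The `rdba` value in these messages already encodes the file and block that the log prints in parentheses. As a minimal sketch in Python, and assuming the usual smallfile layout of a 32-bit relative data block address (upper 10 bits for the relative file number, lower 22 bits for the block number), it can be decoded like this. The function name `decode_rdba` is hypothetical.

```python
def decode_rdba(rdba):
    """Split a 32-bit relative data block address into (relative file#, block#).

    Assumes the smallfile layout: upper 10 bits = file#, lower 22 bits = block#.
    """
    file_no = rdba >> 22           # upper 10 bits
    block_no = rdba & 0x3FFFFF     # lower 22 bits
    return file_no, block_no


file_no, block_no = decode_rdba(0x00800001)
print(f"file {file_no}, block {block_no}")
```

For `0x00800001` this yields file 2, block 1, which matches the log. Block 1 is the datafile header block, so the ORA-01205 here points at the header of the undo datafile.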

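Recovery succeeded
A backup existed, but the backup set was damaged and could not be restored normally. Fortunately, datafile 2 was restored through manual work and then recovered successfully, and the database opened normally.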

Sat Feb 09 10:09:40 中国标准时间 2019
Successfully onlined Undo Tablespace 1.
Sat Feb 09 10:09:40 中国标准时间 2019
SMON: enabling tx recovery
Sat Feb 09 10:09:40 中国标准时间 2019
Database Characterset is UTF8
Opening with internal Resource Manager plan
Starting background process QMNC
QMNC started with pid=36, OS id=3888
Sat Feb 09 10:09:42 中国标准时间 2019
Completed: alter database open