数据库启动报ORA-600 6711故障分析处理

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:数据库启动报ORA-600 6711故障分析处理

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

几个月以前的一个数据库故障,今天拿出来在win上重新分析,数据库启动报ORA-600 6711错

C:\Users\XFF>SQLPLUS / AS SYSDBA

SQL*Plus: Release 12.1.0.2.0 Production on 星期日 7月 14 16:17:32 2024

Copyright (c) 1982, 2014, Oracle.  All rights reserved.

已连接到空闲例程。

SQL> startup mount pfile='d:/pfile.txt'
ORACLE 例程已经启动。

Total System Global Area 6442450944 bytes
Fixed Size                  6205768 bytes
Variable Size            1493175992 bytes
Database Buffers         4932501504 bytes
Redo Buffers               10567680 bytes
数据库装载完毕。
SQL> alter database open;
alter database open
*
第 1 行出现错误:
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [6711], [4436379], [1], [4436389],
[0], [], [], [], [], [], [], []
进程 ID: 44144
会话 ID: 67 序列号: 39084

根据经验该报错为:ORA-600 [6711] “Cluster Key Chain corruption”,也就是说很可能是cluster相关对象异常导致该问题.


对启动过程进行跟踪

PARSING IN CURSOR #17695456 len=189 dep=4  tim=233428646426 hv=186852205 ad='7ffda1eea168' sqlid='2tkw12w5k68vd'
select user#,password,datats#,tempts#,type#,defrole,resource$, ptime,
decode(defschclass,NULL,'DEFAULT_CONSUMER_GROUP',defschclass),
spare1,spare4,ext_username,spare2 from user$ where name=:1
END OF STMT
PARSE #17695456:c=0,e=168,p=0,cr=0,cu=0,mis=1,r=0,dep=4,og=4,plh=0,tim=233428646426
BINDS #17695456:
 Bind#0
  oacdty=01 mxl=32(03) mxlc=00 mal=00 scl=00 pre=00
  oacflg=18 fl2=0001 frm=01 csi=871 siz=32 off=0
  kxsbbbfp=010b2df0  bln=32  avl=03  flg=05
  value="SYS"
EXEC #17695456:c=0,e=418,p=0,cr=0,cu=0,mis=1,r=0,dep=4,og=4,plh=1457651150,tim=233428646901
WAIT #17695456: nam='db file sequential read' ela= 126 file#=1 block#=417 blocks=1 obj#=46 tim=233428647046
FETCH #17695456:c=0,e=153,p=1,cr=2,cu=0,mis=0,r=1,dep=4,og=4,plh=1457651150,tim=233428647069
STAT #17695456 id=1 cnt=1 pid=0 pos=1 obj=22 op='TABLE ACCESS BY INDEX ROWID USER$ 
  (cr=2 pr=1 pw=0 time=151 us cost=1 size=139 card=1)'
STAT #17695456 id=2 cnt=1 pid=1 pos=1 obj=46 op='INDEX UNIQUE SCAN I_USER1 (cr=1 pr=1 pw=0 time=149 us)'
CLOSE #17695456:c=0,e=2,dep=4,type=0,tim=233428647111
Incident 2601 created, dump file: C:\APP\XFF\diag\rdbms\ecp\ecp\incident\incdir_2601\ecp_ora_40516_i2601.trc
ORA-00600: 内部错误代码, 参数: [6711], [4436379], [1], [4436389], [0], [], [], [], [], [], [], []

FETCH #15289752:c=2062500,e=2544215,p=13,cr=65626,cu=28,mis=0,r=0,dep=3,og=3,plh=3312420081,tim=233431176536
=====================
PARSE ERROR #387363008:len=50 dep=1 uid=0 oct=3 lid=0 tim=233431176680 err=600
select cost from resource_cost$ where resource#=:1
ORA-00600: 内部错误代码, 参数: [6711], [4436379], [1], [4436389], [0], [], [], [], [], [], [], []
ORA-00600: 内部错误代码, 参数: [6711], [4436379], [1], [4436389], [0], [], [], [], [], [], [], []

这个操作触发了递归查询

PARSING IN CURSOR #387319440 len=151 dep=5 lid=0 tim=233428641503 hv=2507062328 ad='7ffd9ffa23a8' sqlid='7u49y06aqxg1s'
select /*+ rule */ bucket, endpoint, col#, epvalue, epvalue_raw, ep_repeat_count from histgrm$ 
where obj#=:1 and intcol#=:2 and row#=:3 order by bucket
END OF STMT
PARSE #387319440:c=0,e=11,p=0,cr=0,cu=0,mis=0,r=0,dep=5,og=3,plh=3312420081,tim=233428641503
BINDS #387319440:
 Bind#0
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=00 fl2=1000001 frm=00 csi=00 siz=72 off=0
  kxsbbbfp=00eb2be0  bln=22  avl=02  flg=05
  value=22
 Bind#1
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=00 fl2=1000001 frm=00 csi=00 siz=0 off=24
  kxsbbbfp=00eb2bf8  bln=22  avl=02  flg=01
  value=2
 Bind#2
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=00 fl2=1000001 frm=00 csi=00 siz=0 off=48
  kxsbbbfp=00eb2c10  bln=22  avl=01  flg=01
  value=0
EXEC #387319440:c=0,e=105,p=0,cr=0,cu=0,mis=0,r=0,dep=5,og=3,plh=3312420081,tim=233428641652
WAIT #387319440: nam='db file sequential read' ela= 124 file#=1 block#=45660 blocks=1 obj#=66 tim=233428641792
FETCH #387319440:c=0,e=173,p=1,cr=3,cu=0,mis=0,r=20,dep=5,og=3,plh=3312420081,tim=233428641834
STAT #387319440 id=1 cnt=20 pid=0 pos=1 obj=0 op='SORT ORDER BY (cr=3 pr=1 pw=0 time=169 us cost=0 size=0 card=0)'
STAT #387319440 id=2 cnt=20 pid=1 pos=1 obj=66 op='TABLE ACCESS CLUSTER HISTGRM$ (cr=3 pr=1 pw=0 time=148 us)'
STAT #387319440 id=3 cnt=1 pid=2 pos=1 obj=65 op='INDEX UNIQUE SCAN I_OBJ#_INTCOL# (cr=2 pr=0 pw=0 time=2 us)'
CLOSE #387319440:c=0,e=36,dep=5,type=3,tim=233428641886

查看对应的trace文件

[TOC00000]
Jump to table of contents
Dump continued from file: C:\APP\XFF\diag\rdbms\ecp\ecp\trace\ecp_ora_40516.trc
[TOC00001]
ORA-00600: 内部错误代码, 参数: [6711], [4436379], [1], [4436389], [0], [], [], [], [], [], [], []

[TOC00001-END]
[TOC00002]
========= Dump for incident 2601 (ORA 600 [6711]) ========
[TOC00003]
----- Beginning of Customized Incident Dump(s) -----
kdsDumpState: cdb: 0 dspdb: 0 type: 3
*** ENTER: kds state dump ***
            row 0x0043b1a5.28 continuation at: 0x0043b1a5.0 file# 1 block# 242085 slot 0 (dscnt: 0)
KDSTABN_GET: 1 ..... ntab: 2
curSlot: 0 ..... nrows: 40
Dumping kcb descriptor:
kcbds 0x0000000017100DF0 : tsn 0, rdba 0x0043b1a5, afn 1, objd 64, cls 1, tidflg 0x0 0x0 0x0
    dsflg 0x00100000, dsflg2 0x00004000, lobid 00000000:00000000, cnt 0, addr 0x00007FFD55D1C014 dx 0x0000000000000000
    env [0x0000000017178C7C]: (scn: 0x0000.54290647   xid: 0x0000.000.00000000  uba: 0x00000000.0000.00  
    statement num=0  parent xid:  0x0000.000.00000000  st-scn: 0x0000.00000000  
    hi-scn: 0x0000.00000000  ma-scn: 0x0000.00000000  flg: 0x00000660)
kcb_dw_scan_dumpctx: not in DW scan
kdsgrp1_dump database not fully open
*** EXIT: kds state dump ***
----- End of Customized Incident Dump(s) -----
[TOC00003-END]

通过对相关rdba进行dump分析,确认对象id为64和trace中报的信息匹配

DUL> rdba 0x0043b1a5

  rdba   : 0x0043b1a5=4436389
  rfile# : 1
  block# : 242085

DUL> dump datafile 1 block 242085 header
Block Header:
block type=0x06 (table/index/cluster segment data block)
block format=0xa2 (oracle 10)
block rdba=0x0043b1a5 (file#=1, block#=242085)
scn=0x0000.438d4a86, seq=1, tail=0x4a860601
block checksum value=0xd591=54673, flag=6
Data Block Header Dump:
 Object id on Block? Y
 seg/obj: 0x40=64  csc: 0x00.438d4a80  itc: 2  flg: -  typ: 1 (data)
     fsl: 0  fnx: 0x0 ver: 0x01

 Itl           Xid                  Uba         Flag  Lck        Scn/Fsc
0x01   0x0002.01f.00014b92  0x00c01897.6e20.07  C---    0  scn 0x0000.438c5fca
0x02   0x000a.01a.0011bb8e  0x00c0292c.0317.42  --U-   22  fsc 0x0000.438d4a86
Data Block Dump:
================
flag=0x0 --------
ntab=2
nrow=41
frre=23
fsbo=0x68
ffeo=0xb90
avsp=0x1ce1
tosp=0x1ce1

进一步分析该id为什么对象,使用dul unload obj$
c_obj#_intcol#


确认对对象为cluster C_OBJ#_INTCOL#,对应的表为HISTGRM$(统计信息中存储直方图信息表),明白这一些,处理起来就比较容易了,open数据库过程中绕过该对象访问,然后对该表进行处理即可

RMAN-06207 RMAN-06208报错解决

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:RMAN-06207 RMAN-06208报错解决

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

数据库在rman备份时执行delete noprompt expired backup报RMAN-06207和RMAN-06208错误

RMAN>delete noprompt expired backup;
RMAN retention policy will be applied to the command
RMAN retention policy is set to recovery window of 1 days
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=1364 instance=hisdb1 device type=DISK
allocated channel: ORA_DISK_2
channel ORA_DISK_2: SID=3102 instance=hisdb1 device type=DISK
allocated channel: ORA_DISK_3
channel ORA_DISK_3: SID=4534 instance=hisdb1 device type=DISK
allocated channel: ORA_DISK_4
channel ORA_DISK_4: SID=5054 instance=hisdb1 device type=DISK
Deleting the following obsolete backups and copies:
Type                 Key    Completion Time    Filename/Handle
-------------------- ------ ------------------ --------------------
Datafile Copy        3      15-JUN-18          +DATA/hisdb/datafile/xifenfei.282.978867569
Datafile Copy        4      15-JUN-18          +DATA/hisdb/datafile/xifenfei.283.978867589
Backup Set           54704  14-JUN-24
  Backup Piece       54704  14-JUN-24          /oraback/rmanfile/ctl_20240614_m22ta31a_1_1.rman

RMAN-06207: WARNING: 3 objects could not be deleted for DISK channel(s) due
RMAN-06208:          to mismatched status.  Use CROSSCHECK command to fix status
RMAN-06210: List of Mismatched objects
RMAN-06211: ==========================
RMAN-06212:   Object Type   Filename/Handle
RMAN-06213: --------------- ---------------------------------------------------
RMAN-06214: Datafile Copy   +DATA/hisdb/datafile/xifenfei.282.978867569
RMAN-06214: Datafile Copy   +DATA/hisdb/datafile/xifenfei.283.978867589
RMAN-06214: Backup Piece    /oraback/rmanfile/ctl_20240614_m22ta31a_1_1.rman

处理方法crosscheck copy

RMAN> crosscheck copy;

using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=1364 instance=hisdb1 device type=DISK
allocated channel: ORA_DISK_2
channel ORA_DISK_2: SID=3102 instance=hisdb1 device type=DISK
allocated channel: ORA_DISK_3
channel ORA_DISK_3: SID=4534 instance=hisdb1 device type=DISK
allocated channel: ORA_DISK_4
channel ORA_DISK_4: SID=5070 instance=hisdb1 device type=DISK
specification does not match any control file copy in the repository
validation succeeded for archived log
archived log file name=+FRA/hisdb/archivelog/2024_06_16/thread_2_seq_149520.1675.1171833177 
     RECID=589464 STAMP=1171833177
Crosschecked 1 objects

validation failed for datafile copy
datafile copy file name=+DATA/hisdb/datafile/xifenfei.282.978867569 RECID=3 STAMP=978867571
validation failed for datafile copy
datafile copy file name=+DATA/hisdb/datafile/xifenfei.283.978867589 RECID=4 STAMP=978867590
Crosschecked 2 objects

再次验证删除不再提示错误

RMAN>   DELETE noprompt OBSOLETE;

RMAN retention policy will be applied to the command
RMAN retention policy is set to recovery window of 1 days
using channel ORA_DISK_1
using channel ORA_DISK_2
using channel ORA_DISK_3
using channel ORA_DISK_4
no obsolete backups found

O/S-Error: (OS 23) 数据错误(循环冗余检查)—故障处理

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:O/S-Error: (OS 23) 数据错误(循环冗余检查)—故障处理

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

有客户由于磁盘坏道导致数据文件访问报ORA-27070 OSD-04016 O/S-Error等相关错误
OSD-04016


rman 尝试读取88号文件

RMAN> backup datafile 88 format 'e:\rman\%d_%T_%I.%s%p';

启动 backup 于 05-6月 -24
使用目标数据库控制文件替代恢复目录
分配的通道: ORA_DISK_1
通道 ORA_DISK_1: SID=2246 设备类型=DISK
通道 ORA_DISK_1: 正在启动全部数据文件备份集
通道 ORA_DISK_1: 正在指定备份集内的数据文件
输入数据文件: 文件号=00088 名称=E:\APP\ADMINISTRATOR\ORADATA\XFF\XIFENFEI.ORA
通道 ORA_DISK_1: 正在启动段 1 于 05-6月 -24
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03009: backup 命令 (ORA_DISK_1 通道上, 在 06/05/2024 18:33:43 上) 失败
ORA-19501: 文件 "E:\APP\ADMINISTRATOR\ORADATA\XFF\XIFENFEI.ORA", 块编号 322944 (块大小=8192) 上出现读取错误
ORA-27070: 异步读取/写入失败
OSD-04016: 异步 I/O 请求排队时出错。
O/S-Error: (OS 23) 数据错误(循环冗余检查)。

检查系统日志,发现有报:设备 \Device\Harddisk0\DR0 有一个不正确的区块
disk-error


到目前为止基本上可以判断是文件系统或者磁盘层面出现坏道导致该问题(磁盘坏道概率更大),使用工具对损坏数据文件进行强制拷贝,提示少量扇区数据无法拷贝
force-copy

通过dbv检查恢复出来文件效果

E:\check_db>DBV FILE=D:/XIFENFEI.ORA

DBVERIFY: Release 11.2.0.3.0 - Production on 星期日 6月 9 22:17:28 2024

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

DBVERIFY - 开始验证: FILE = D:\XIFENFEI.ORA
页 323055 标记为损坏
Corrupt block relative dba: 0x1604edef (file 88, block 323055)
Bad header found during dbv:
Data in bad block:
 type: 229 format: 5 rdba: 0xe5e5e5e5
 last change scn: 0xe5e5.e5e5e5e5 seq: 0xe5 flg: 0xe5
 spare1: 0xe5 spare2: 0xe5 spare3: 0xe5e5
 consistency value in tail: 0x4c390601
 check value in block header: 0xe5e5
 computed block checksum: 0x5003



DBVERIFY - 验证完成

检查的页总数: 524288
处理的页总数 (数据): 204510
失败的页总数 (数据): 0
处理的页总数 (索引): 127485
失败的页总数 (索引): 0
处理的页总数 (其他): 3030
处理的总页数 (段)  : 0
失败的总页数 (段)  : 0
空的页总数: 189262
标记为损坏的总页数: 1
流入的页总数: 0
加密的总页数        : 0
最高块 SCN            : 184063522 (3470.184063522)

运气不错,就一个数据库block异常,通过dba_extents查询坏块所属对象,运气不太好是一个表数据

SQL> SELECT OWNER, SEGMENT_NAME, SEGMENT_TYPE, TABLESPACE_NAME, A.PARTITION_NAME
  2    FROM DBA_EXTENTS A
  3   WHERE FILE_ID = &FILE_ID
  4     AND &BLOCK_ID BETWEEN BLOCK_ID AND BLOCK_ID + BLOCKS - 1;
输入 file_id 的值:  88
原值    3:  WHERE FILE_ID = &FILE_ID
新值    3:  WHERE FILE_ID = 88
输入 block_id 的值:  323055
原值    4:    AND &BLOCK_ID BETWEEN BLOCK_ID AND BLOCK_ID + BLOCKS - 1
新值    4:    AND 323055 BETWEEN BLOCK_ID AND BLOCK_ID + BLOCKS - 1

OWNER
------------------------------
SEGMENT_NAME
--------------------------------------------------------------------------------
SEGMENT_TYPE       TABLESPACE_NAME                PARTITION_NAME
------------------ ------------------------------ ------------------------------
XFFUSER
XFFTABLE
TABLE              XFFTBS_TAB2

设置跳过坏块导出数据,然后重命名原表导入数据,完成本次恢复,以前有过类似恢复:

一次侥幸的OSD-04016 O/S-Error异常恢复
O/S-Error: (OS 23) 数据错误(循环冗余检查) 数据库恢复

异常断电数据库恢复-从ORA-600 2131到ORA-08102: 未找到索引关键字, 对象号 39

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:异常断电数据库恢复-从ORA-600 2131到ORA-08102: 未找到索引关键字, 对象号 39

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

数据库启动报ORA-600 2131,以前遇到过类似问题:ORA-600 2131故障处理

SQL> alter database mount;
alter database mount
*
第 1 行出现错误:
ORA-00600: ??????, ??: [2131], [9], [8], [], [], [], [], [], [], [], [], []
Tue Jun 04 14:12:18 2024
RECO started with pid=15, OS id=3244 
Tue Jun 04 14:12:18 2024
MMON started with pid=16, OS id=3256 
Tue Jun 04 14:12:18 2024
MMNL started with pid=17, OS id=3432 
starting up 1 dispatcher(s) for network address '(ADDRESS=(PARTIAL=YES)(PROTOCOL=TCP))'...
starting up 1 shared server(s) ...
ORACLE_BASE from environment = E:\app\Administrator
Tue Jun 04 14:12:22 2024
alter database mount exclusive
Errors in file E:\APP\ADMINISTRATOR\diag\rdbms\XFF\XFF\trace\XFF_ora_2536.trc  (incident=427583):
ORA-00600: ??????, ??: [2131], [9], [8], [], [], [], [], [], [], [], [], []
Tue Jun 04 14:12:28 2024
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
ORA-600 signalled during: alter database mount exclusive...

重建ctl,然后重试recover 数据库,报ORA-600 kdourp_inorder2ORA-600 3020错误,这些错误本质都是由于redo信息和block信息不匹配导致

SQL> recover datafile 1;
ORA-00283: 恢复会话因错误而取消
ORA-10562: Error occurred while applying redo to data block (file# 1, block# 74805)
ORA-10564: tablespace SYSTEM
ORA-01110: 数据文件 1: 'E:\ORADATA\XFF\SYSTEM01.DBF'
ORA-10561: block type 'TRANSACTION MANAGED DATA BLOCK', data object# 8
ORA-00600: 内部错误代码, 参数: [kdourp_inorder2], [16], [3], [0], [108], [], [], [], [], [], [], []


SQL> recover datafile 7;
ORA-00283: 恢复会话因错误而取消
ORA-00600: 内部错误代码, 参数: [3020], [7], [385], [29360513], [], [], [], [], [], [], [], []
ORA-10567: Redo is inconsistent with data block (file# 7, block# 385, file offset is 3153920 bytes)
ORA-10564: tablespace UNDOTBS2
ORA-01110: 数据文件 7: 'E:\ORADATA\XFF\UNDOTBS2.DBF'
ORA-10560: block type 'KTU UNDO BLOCK'

通过屏蔽一致性,修改文件头scn,强制打开数据库

SQL> recover database until cancel;
ORA-00279: 更改 56782359 (在 06/04/2024 14:00:36 生成) 对于线程 1 是必需的
ORA-00289: 建议: E:\APP\ARCHIVELOG\ARC0000005415_1165094245.0001
ORA-00280: 更改 56782359 (用于线程 1) 在序列 #5415 中


指定日志: {<RET>=suggested | filename | AUTO | CANCEL}
cancel
ORA-01547: 警告: RECOVER 成功但 OPEN RESETLOGS 将出现如下错误
ORA-01194: 文件 1 需要更多的恢复来保持一致性
ORA-01110: 数据文件 1: 'E:\ORADATA\XFF\SYSTEM01.DBF'


ORA-01112: 未启动介质恢复


SQL> alter database open resetlogs;

数据库已更改。

尝试导出数据报ORA-08102,导致数据库无法正常导出

C:\Users\Administrator>expdp "'/ as sysdba'" full=y dumpfile=full_20240604_%U.dmp DIRECTORY=expdp_dir 
logfile=full_20240604.log parallel=2 EXCLUDE=STATISTICS,AUDIT

Export: Release 11.2.0.4.0 - Production on 星期二 6月 4 18:40:26 2024

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

连接到: Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
ORA-31626: 作业不存在
ORA-31633: 无法创建主表 "SYS.SYS_EXPORT_FULL_05"
ORA-06512: 在 "SYS.DBMS_SYS_ERROR", line 95
ORA-06512: 在 "SYS.KUPV$FT", line 1038
ORA-08102: 未找到索引关键字, 对象号 39, 文件 1, 块 97540 (2)

obj 39 为OBJ$的I_OBJ4对象报ORA-08102

SQL> select owner,object_name,object_type from dba_objects where object_id=39
  2  /

OWNER                          OBJECT_NAME                    OBJECT_TYPE
------------------------------ ------------------------------ -------------------
SYS                            I_OBJ4                         INDEX

该对象属于bootstrap$中核心对象,无法直接rebuild,参考下面文章处理,然后再尝试导出数据
分享I_OBJ4 ORA-8102故障恢复案例
使用bbed 修复I_OBJ4 index 报ORA-8102错误
bootstrap$核心index(I_OBJ1,I_USER1,I_FILE#_BLOCK#,I_IND1,I_TS#,I_CDEF1等)异常恢复—ORA-00701错误解决

C:\Users\Administrator>expdp "'/ as sysdba'" full=y dumpfile=full_20240604_%U.dmp DIRECTORY=expdp_dir 
logfile=full_20240604.log parallel=2 EXCLUDE=STATISTICS,AUDIT

Export: Release 11.2.0.4.0 - Production on 星期二 6月 4 18:43:47 2024

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

连接到: Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
ORA-31626: 作业不存在
ORA-31637: 无法创建作业 SYS_EXPORT_FULL_01 (用户 SYS)
ORA-06512: 在 "SYS.DBMS_SYS_ERROR", line 95
ORA-06512: 在 "SYS.KUPV$FT_INT", line 798
ORA-39080: 无法为数据泵作业创建队列 "KUPC$C_1_20240604184348" 和 "KUPC$S_1_20240604184348"
ORA-06512: 在 "SYS.DBMS_SYS_ERROR", line 95
ORA-06512: 在 "SYS.KUPC$QUE_INT", line 1534
ORA-08102: 未找到索引关键字, 对象号 53, 文件 1, 块 97715 (2)

通过类似方法分析确认为CDEF$的I_CDEF1 index,处理方法和I_OBJ4一样,然后导出数据成功,导入到新库中,在这个迁移过程中遭遇Wrapped 加密的package body无效的问题,具体参见:数据泵迁移Wrapped PLSQL之后报PLS-00753

数据库open报ORA-600 kcratr_scan_lastbwr故障处理

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:数据库open报ORA-600 kcratr_scan_lastbwr故障处理

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

由于断电,导致数据库正常open报ORA-600 kcratr_scan_lastbwr错误

Wed Jan 17 18:23:26 2024
ALTER DATABASE   MOUNT
Successful mount of redo thread 1, with mount id 1028618590
Database mounted in Exclusive Mode
Lost write protection disabled
Completed: ALTER DATABASE   MOUNT
Wed Jan 17 18:23:30 2024
ALTER DATABASE OPEN
Beginning crash recovery of 1 threads
 parallel recovery started with 32 processes
Started redo scan
Hex dump of (file 3, block 144) in trace file /database/oracle/app/oracle/diag/rdbms/xff/xff/trace/xff_ora_66361.trc
Reading datafile '/database/oracle/app/oracle/oradata/xff/datafile/o1_mf_undotbs1_hct7001s_.dbf' 
  for corruption at rdba: 0x00c00090 (file 3, block 144)
Reread (file 3, block 144) found same corrupt data (logically corrupt)
Write verification failed for File 3 Block 144 (rdba 0xc00090)
Errors in file /database/oracle/app/oracle/diag/rdbms/xff/xff/trace/xff_ora_66361.trc  (incident=672241):
ORA-00600: internal error code, arguments: [kcratr_scan_lastbwr], [], [], [], [], []
Incident details in: /database/oracle/app/oracle/diag/rdbms/xff/xff/incident/incdir_672241/xff_ora_66361_i672241.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Aborting crash recovery due to error 600
Errors in file /database/oracle/app/oracle/diag/rdbms/xff/xff/trace/xff_ora_66361.trc:
ORA-00600: internal error code, arguments: [kcratr_scan_lastbwr], [], [], [], [], []
Errors in file /database/oracle/app/oracle/diag/rdbms/xff/xff/trace/xff_ora_66361.trc:
ORA-00600: internal error code, arguments: [kcratr_scan_lastbwr], [], [], [], [], []
ORA-600 signalled during: ALTER DATABASE OPEN...

尝试recover 数据库报ORA-600 3020错误

Wed Jan 17 18:28:38 2024
ALTER DATABASE RECOVER  database  
Media Recovery Start
 started logmerger process
Parallel Media Recovery started with 96 slaves
Wed Jan 17 18:28:41 2024
Recovery of Online Redo Log: Thread 1 Group 2 Seq 410864 Reading mem 0
  Mem# 0: /database/oracle/app/oracle/oradata/xff/onlinelog/o1_mf_2_hct740hq_.log
Wed Jan 17 18:28:42 2024
ORA-00600: internal error code, arguments: [3020], [3], [240], [12583152], [], []
ORA-10567: Redo is inconsistent with data block (file# 3, block# 240, file offset is 1966080 bytes)
ORA-10564: tablespace UNDOTBS1
ORA-01110: data file 3: '/database/oracle/app/oracle/oradata/xff/datafile/o1_mf_undotbs1_hct7001s_.dbf'
ORA-10560: block type 'KTU SMU HEADER BLOCK'
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Slave exiting with ORA-600 exception
Errors in file /database/oracle/app/oracle/diag/rdbms/xff/xff/trace/xff_pr19_68212.trc:
ORA-00600: internal error code, arguments: [3020], [3], [240], [12583152], [], []
ORA-10567: Redo is inconsistent with data block (file# 3, block# 240, file offset is 1966080 bytes)
ORA-10564: tablespace UNDOTBS1
ORA-01110: data file 3: '/database/oracle/app/oracle/oradata/xff/datafile/o1_mf_undotbs1_hct7001s_.dbf'
ORA-10560: block type 'KTU SMU HEADER BLOCK'
Errors in file /database/oracle/app/oracle/diag/rdbms/xff/xff/trace/xff_ora_68038.trc  (incident=672243):
ORA-00600: internal error code, arguments: [3020], [3], [240], [12583152], [], []
ORA-10567: Redo is inconsistent with data block (file# 3, block# 240, file offset is 1966080 bytes)
ORA-10564: tablespace UNDOTBS1
ORA-01110: data file 3: '/database/oracle/app/oracle/oradata/xff/datafile/o1_mf_undotbs1_hct7001s_.dbf'
ORA-10560: block type 'KTU SMU HEADER BLOCK'
Wed Jan 17 18:28:43 2024
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Recovery Slave PR19 previously exited with exception 600
Wed Jan 17 18:28:43 2024
Sweep [inc][672865]: completed
Media Recovery failed with error 448
Errors in file /database/oracle/app/oracle/diag/rdbms/xff/xff/trace/xff_pr00_68115.trc:
ORA-00283: recovery session canceled due to errors
ORA-00448: normal completion of background process
ORA-600 signalled during: ALTER DATABASE RECOVER  database  ...

加上隐含参数尝试强制拉库

alter database open resetlogs
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
RESETLOGS after incomplete recovery UNTIL CHANGE 6165467436
Clearing online redo logfile 1 /database/oracle/app/oracle/oradata/xff/onlinelog/o1_mf_1_hct740fp_.log
Clearing online log 1 of thread 1 sequence number 0
Clearing online redo logfile 1 complete
Clearing online redo logfile 2 /database/oracle/app/oracle/oradata/xff/onlinelog/o1_mf_2_hct740hq_.log
Clearing online log 2 of thread 1 sequence number 0
Clearing online redo logfile 2 complete
Clearing online redo logfile 3 /database/oracle/app/oracle/oradata/xff/onlinelog/o1_mf_3_hct740k7_.log
Clearing online log 3 of thread 1 sequence number 0
Clearing online redo logfile 3 complete
Online log /database/oracle/app/oracle/oradata/xff/onlinelog/o1_mf_1_hct740fp_.log: Thread 1 Group 1 was previously cleared
Online log /database/oracle/app/oracle/oradata/xff/onlinelog/o1_mf_2_hct740hq_.log: Thread 1 Group 2 was previously cleared
Online log /database/oracle/app/oracle/oradata/xff/onlinelog/o1_mf_3_hct740k7_.log: Thread 1 Group 3 was previously cleared
Fri Jan 19 09:24:59 2024
Setting recovery target incarnation to 2
Initializing SCN for created control file
Database SCN compatibility initialized to 3
Warning - High Database SCN: Current SCN value is 6165467439, threshold SCN value is 0
If you have not previously reported this warning on this database,
  please notify Oracle Support so that additional diagnosis can be performed.
Fri Jan 19 09:24:59 2024
Assigning activation ID 1028784413 (0x3d52011d)
Thread 1 opened at log sequence 1
  Current log# 1 seq# 1 mem# 0: /database/oracle/app/oracle/oradata/xff/onlinelog/o1_mf_1_hct740fp_.log
Successful open of redo thread 1
Fri Jan 19 09:24:59 2024
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Fri Jan 19 09:24:59 2024
SMON: enabling cache recovery
Errors in file /database/oracle/app/oracle/diag/rdbms/xff/xff/trace/xff_ora_84860.trc  (incident=1344255):
ORA-00600: internal error code, arguments: [2662], [1], [1870500147], [1], [1870515285], [12583040]
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file /database/oracle/app/oracle/diag/rdbms/xff/xff/trace/xff_ora_84860.trc:
ORA-00600: internal error code, arguments: [2662], [1], [1870500147], [1], [1870515285], [12583040]
Errors in file /database/oracle/app/oracle/diag/rdbms/xff/xff/trace/xff_ora_84860.trc:
ORA-00600: internal error code, arguments: [2662], [1], [1870500147], [1], [1870515285], [12583040]
Error 600 happened during db open, shutting down database
USER (ospid: 84860): terminating the instance due to error 600
Instance terminated by USER, pid = 84860
ORA-1092 signalled during: alter database open resetlogs...

客户自行恢复到这一步,后面无法处理,接手之后进行恢复,其实后面比较简单了,就是修改下数据库scn,数据库就可以open起来,然后处理异常的undo和对象即可,可以参考以前类似文章:
ORA-600 2662快速恢复之Patch scn工具
硬件故障导致ORA-600 2662错误处理
ORA-00600 [2662]和ORA-00600 [4194]恢复
更多参考:惜分飞blog中ORA-600 2662文章

resetlogs强制拉库失败并使用备份system文件还原数据库故障处理

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:resetlogs强制拉库失败并使用备份system文件还原数据库故障处理

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

接手一个库,在open的过程中遭遇到ORA-600 2662错误

Sun May 26 10:15:54 2024
alter database open RESETLOGS
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
RESETLOGS after incomplete recovery UNTIL CHANGE 84303583
Clearing online redo logfile 1 /data/OracleData/xff/redo01.log
Clearing online log 1 of thread 1 sequence number 8330
Clearing online redo logfile 1 complete
Clearing online redo logfile 2 /data/OracleData/xff/redo02.log
Clearing online log 2 of thread 1 sequence number 8327
Clearing online redo logfile 2 complete
Clearing online redo logfile 3 /data/OracleData/xff/redo03.log
Clearing online log 3 of thread 1 sequence number 8329
Clearing online redo logfile 3 complete
Clearing online redo logfile 4 /data/OracleData/xff/redo04.log
Clearing online log 4 of thread 1 sequence number 8328
Clearing online redo logfile 4 complete
Resetting resetlogs activation ID 1431370398 (0x5550fa9e)
Online log /data/OracleData/xff/redo01.log: Thread 1 Group 1 was previously cleared
Online log /data/OracleData/xff/redo02.log: Thread 1 Group 2 was previously cleared
Online log /data/OracleData/xff/redo03.log: Thread 1 Group 3 was previously cleared
Online log /data/OracleData/xff/redo04.log: Thread 1 Group 4 was previously cleared
Sun May 26 10:15:59 2024
Setting recovery target incarnation to 3
Sun May 26 10:15:59 2024
Read of datafile '/data/OracleData/xff/temp01.dbf' (fno 201) header failed with ORA-01200
Rereading datafile 201 header failed with ORA-01200
Errors in file /data/u01/app/oracle/diag/rdbms/xff/xff/trace/xff_dbw0_1563.trc:
ORA-01186: file 201 failed verification tests
ORA-01122: database file 201 failed verification check
ORA-01110: data file 201: '/data/OracleData/xff/temp01.dbf'
ORA-01200: actual file size of 3711 is smaller than correct size of 3712 blocks
File 201 not verified due to error ORA-01122
Sun May 26 10:15:59 2024
Assigning activation ID 1509069065 (0x59f29109)
Thread 1 opened at log sequence 1
  Current log# 1 seq# 1 mem# 0: /data/OracleData/xff/redo01.log
Successful open of redo thread 1
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Sun May 26 10:15:59 2024
SMON: enabling cache recovery
Errors in file /data/u01/app/oracle/diag/rdbms/xff/xff/trace/xff_ora_1590.trc  (incident=163897):
ORA-00600: internal error code, arguments: [2662], [0], [84303590], [0], [84314659], [12583040] 
Incident details in:/data/u01/app/oracle/diag/rdbms/xff/xff/incident/incdir_163897/xff_ora_1590_i163897.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file /data/u01/app/oracle/diag/rdbms/xff/xff/trace/xff_ora_1590.trc:
ORA-00600: internal error code, arguments: [2662], [0], [84303590], [0], [84314659], [12583040] 
Errors in file /data/u01/app/oracle/diag/rdbms/xff/xff/trace/xff_ora_1590.trc:
ORA-00600: internal error code, arguments: [2662], [0], [84303590], [0], [84314659], [12583040] 
Error 600 happened during db open, shutting down database
USER (ospid: 1590): terminating the instance due to error 600

然后客户使用备份的system01.dbf文件替换了被resetlogs之后文件,导致数据库后续操作无法继续

SQL> recover database;
ORA-00283: recovery session canceled due to errors
ORA-19909: datafile 1 belongs to an orphan incarnation
ORA-01110: data file 1: '/data/OracleData/xff/system01.dbf'

这个问题比较简单,通过bbed或者Oracle Recovery Tools修改文件头相关信息,然后open数据库成功
重建控制文件丢失数据文件导致悲剧
Oracle Recovery Tools快速恢复ORA-19909

SQL> recover datafile 1;
Media recovery complete.
SQL> recover database;
Media recovery complete.
SQL> alter database open;

Database altered.

但是由于system文件有大量坏块导致数据库无法正常登录和导出

[oracle@et-dbserver ~]$ exp "'/ as sysdba'" owner=USERNAME  file=/tmp/2user.dmp log=/tmp/2user.log 

Export: Release 11.2.0.4.0 - Production on Sun May 26 13:00:50 2024

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

EXP-00056: ORACLE error 604 encountered
ORA-00604: error occurred at recursive SQL level 1
ORA-01578: ORACLE data block corrupted (file # 1, block # 86500)
ORA-01110: data file 1: '/data/OracleData/xff/system01.dbf'
Username: / as sysdba

EXP-00056: ORACLE error 604 encountered
ORA-00604: error occurred at recursive SQL level 1
ORA-01578: ORACLE data block corrupted (file # 1, block # 86500)
ORA-01110: data file 1: '/data/OracleData/xff/system01.dbf'
Username:
Password:

EXP-00056: ORACLE error 604 encountered
ORA-00604: error occurred at recursive SQL level 1
ORA-01578: ORACLE data block corrupted (file # 1, block # 86500)
ORA-01110: data file 1: '/data/OracleData/xff/system01.dbf'
ORA-01017: invalid username/password; logon denied
EXP-00005: all allowable logon attempts failed
EXP-00000: Export terminated unsuccessfully

通过dbv检查system数据文件

DBVERIFY: Release 11.2.0.4.0 - Production on Sun May 26 12:33:28 2024

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.


DBVERIFY - Verification starting : FILE = /data/OracleData/xff/system01.dbf
Page 1044 is influx - most likely media corrupt
Corrupt block relative dba: 0x00400414 (file 1, block 1044)
Fractured block found during dbv: 
Data in bad block:
 type: 0 format: 2 rdba: 0x00400414
 last change scn: 0x0000.00000000 seq: 0x1 flg: 0x05
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x1d7f550b
 check value in block header: 0xa354
 computed block checksum: 0x6830

Page 1103 is marked corrupt
Corrupt block relative dba: 0x0040044f (file 1, block 1103)
Bad header found during dbv: 
Data in bad block:
 type: 0 format: 0 rdba: 0x00000000
 last change scn: 0x508f.5f74492e seq: 0x53 flg: 0x0c
 spare1: 0xc spare2: 0xa6 spare3: 0xc757
 consistency value in tail: 0x00000001
 check value in block header: 0x8925
 computed block checksum: 0x5d3b

Page 1143 is marked corrupt
Corrupt block relative dba: 0x00400477 (file 1, block 1143)
Bad header found during dbv: 
Data in bad block:
 type: 0 format: 0 rdba: 0x00000001
 last change scn: 0x65c4.52eb202e seq: 0x28 flg: 0x0e
 spare1: 0xe spare2: 0xe2 spare3: 0xfa46
 consistency value in tail: 0x00000001
 check value in block header: 0x6405
 computed block checksum: 0x28b1

………………

Page 124805 is influx - most likely media corrupt
Corrupt block relative dba: 0x0041e785 (file 1, block 124805)
Fractured block found during dbv: 
Data in bad block:
 type: 6 format: 2 rdba: 0x0041e785
 last change scn: 0x0000.0434fc6c seq: 0x2 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x1991255b
 check value in block header: 0x6386
 computed block checksum: 0x1384


DBVERIFY - Verification complete

Total Pages Examined         : 130560
Total Pages Processed (Data) : 95634
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 14949
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 1774
Total Pages Processed (Seg)  : 1669
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 16251
Total Pages Marked Corrupt   : 283
Total Pages Influx           : 149
Total Pages Encrypted        : 0
Highest block SCN            : 84314727 (0.84314727)

对于这样问题,通过Oracle Recovery Tools实战批量坏块修复,实现数据库可以完美导出数据

ORA-600 2131故障处理

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:ORA-600 2131故障处理

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

数据库启动报ORA-600 2131错误,查看alert日志发现是在mount过程报错

Fri May 17 20:58:28 2024
Starting ORACLE instance (normal)
LICENSE_MAX_SESSION = 0
LICENSE_SESSIONS_WARNING = 0
Initial number of CPU is 16
Number of processor cores in the system is 8
Number of processor sockets in the system is 1
Picked latch-free SCN scheme 3
Autotune of undo retention is turned on. 
IMODE=BR
ILAT =249
LICENSE_MAX_USERS = 0
SYS auditing is disabled
Starting up:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options.
Windows NT Version V6.2  
CPU                 : 16 - type 8664, 8 Physical Cores
Process Affinity    : 0x0x0000000000000000
Memory (Avail/Total): Ph:93799M/97925M, Ph+PgF:78891M/112261M 
Using parameter settings in server-side spfile E:\APP\ADMINISTRATOR\PRODUCT\11.2.0\DBHOME_1\DATABASE\SPFILEXFF.ORA
System parameters with non-default values:
  processes                = 1500
  sessions                 = 2272
  nls_language             = "SIMPLIFIED CHINESE"
  nls_territory            = "CHINA"
  sga_target               = 29440M
  control_files            = "E:\ORADATA\xff\CONTROL01.CTL"
  db_block_size            = 8192
  compatible               = "11.2.0.4.0"
  log_archive_dest_1       = "LOCATION=e:\app\archivelog\"
  log_archive_format       = "ARC%S_%R.%T"
  undo_tablespace          = "UNDOTBS2"
  sec_case_sensitive_logon = FALSE
  remote_login_passwordfile= "EXCLUSIVE"
  db_domain                = ""
  dispatchers              = "(PROTOCOL=TCP) (SERVICE=xffXDB)"
  audit_file_dest          = "E:\APP\ADMINISTRATOR\ADMIN\xff\ADUMP"
  audit_trail              = "NONE"
  db_name                  = "xff"
  open_cursors             = 300
  pga_aggregate_target     = 9792M
  diagnostic_dest          = "E:\APP\ADMINISTRATOR"
Fri May 17 20:58:29 2024
PMON started with pid=2, OS id=6696 
Fri May 17 20:58:29 2024
PSP0 started with pid=3, OS id=2424 
Fri May 17 20:58:30 2024
VKTM started with pid=4, OS id=5472 at elevated priority
VKTM running at (10)millisec precision with DBRM quantum (100)ms
Fri May 17 20:58:30 2024
GEN0 started with pid=5, OS id=5764 
Fri May 17 20:58:30 2024
DIAG started with pid=6, OS id=372 
Fri May 17 20:58:30 2024
DBRM started with pid=7, OS id=2992 
Fri May 17 20:58:30 2024
DIA0 started with pid=8, OS id=4960 
Fri May 17 20:58:30 2024
MMAN started with pid=9, OS id=6036 
Fri May 17 20:58:30 2024
DBW0 started with pid=10, OS id=4724 
Fri May 17 20:58:30 2024
DBW1 started with pid=11, OS id=2652 
Fri May 17 20:58:30 2024
LGWR started with pid=12, OS id=5320 
Fri May 17 20:58:30 2024
CKPT started with pid=13, OS id=5732 
Fri May 17 20:58:30 2024
SMON started with pid=14, OS id=936 
Fri May 17 20:58:30 2024
RECO started with pid=15, OS id=2192 
Fri May 17 20:58:30 2024
MMON started with pid=16, OS id=5576 
Fri May 17 20:58:30 2024
MMNL started with pid=17, OS id=5748 
starting up 1 dispatcher(s) for network address '(ADDRESS=(PARTIAL=YES)(PROTOCOL=TCP))'...
starting up 1 shared server(s) ...
ORACLE_BASE from environment = E:\app\Administrator
Fri May 17 20:58:31 2024
ALTER DATABASE   MOUNT
Errors in file E:\APP\ADMINISTRATOR\diag\rdbms\xff\xff\trace\xff_ora_5452.trc  (incident=403399):
ORA-00600: ??????, ??: [2131], [9], [8], [], [], [], [], [], [], [], [], []
Incident details in: E:\APP\ADMINISTRATOR\diag\rdbms\xff\xff\incident\incdir_403399\xff_ora_5452_i403399.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
ORA-600 signalled during: ALTER DATABASE   MOUNT...

这个错误是由于controlfile损坏导致,有这个库以前部署过rman备份,解决起来比较简单,使用rman还原控制文件,并尝试recover

RMAN> restore controlfile from 'E:\rmanback\rmanfile\CTL_20240517_A62R067K_1_1.RMAN';

启动 restore 于 17-5月 -24
使用通道 ORA_DISK_1

通道 ORA_DISK_1: 正在还原控制文件
通道 ORA_DISK_1: 还原完成, 用时: 00:00:01
输出文件名=E:\ORADATA\XFF\CONTROL01.CTL
完成 restore 于 17-5月 -24

RMAN>

RMAN>

RMAN> alter database mount;

数据库已装载
释放的通道: ORA_DISK_1

RMAN> recover database;

启动 recover 于 17-5月 -24
分配的通道: ORA_DISK_1
通道 ORA_DISK_1: SID=996 设备类型=DISK

正在开始介质的恢复

线程 1 序列 4100 的归档日志已作为文件 E:\ORADATA\XFF\REDO02.LOG 存在于磁盘上
线程 1 序列 4101 的归档日志已作为文件 E:\ORADATA\XFF\REDO03.LOG 存在于磁盘上
线程 1 序列 4102 的归档日志已作为文件 E:\ORADATA\XFF\REDO01.LOG 存在于磁盘上
归档日志文件名=E:\APP\ARCHIVELOG\ARC0000004025_1165094245.0001 线程=1 序列=4025
归档日志文件名=E:\APP\ARCHIVELOG\ARC0000004026_1165094245.0001 线程=1 序列=4026
…………
归档日志文件名=E:\APP\ARCHIVELOG\ARC0000004099_1165094245.0001 线程=1 序列=4099
归档日志文件名=E:\ORADATA\XFF\REDO02.LOG 线程=1 序列=4100
归档日志文件名=E:\ORADATA\XFF\REDO03.LOG 线程=1 序列=4101
归档日志文件名=E:\ORADATA\XFF\REDO01.LOG 线程=1 序列=4102
介质恢复完成, 用时: 00:00:22
完成 recover 于 17-5月 -24

RMAN> exit


恢复管理器完成。

E:\oradata\XFF>

这种恢复情况下,如果现在要打开库,需要resetlogs方式,考虑通过创建ctl直接打开(不想用resetlogs)

SQL> shutdown immediate;
ORA-01109: 数据库未打开


已经卸载数据库。
ORACLE 例程已经关闭。
SQL> startup nomount;
ORACLE 例程已经启动。

Total System Global Area 3.0732E+10 bytes
Fixed Size                  2296264 bytes
Variable Size            3825206840 bytes
Database Buffers         2.6844E+10 bytes
Redo Buffers               61206528 bytes
SQL> CREATE CONTROLFILE REUSE DATABASE "XFF" NORESETLOGS  ARCHIVELOG
  2      MAXLOGFILES 16
  3      MAXLOGMEMBERS 3
  4      MAXDATAFILES 100
  5      MAXINSTANCES 8
  6      MAXLOGHISTORY 876
  7  LOGFILE
  8    GROUP 1 'E:\ORADATA\XFF\REDO01.LOG'  SIZE 50M BLOCKSIZE 512,
  9    GROUP 2 'E:\ORADATA\XFF\REDO02.LOG'  SIZE 50M BLOCKSIZE 512,
 10    GROUP 3 'E:\ORADATA\XFF\REDO03.LOG'  SIZE 50M BLOCKSIZE 512
 11  -- STANDBY LOGFILE
 12  DATAFILE
 13    'E:\ORADATA\XFF\SYSTEM01.DBF',
 14    'E:\ORADATA\XFF\SYSAUX01.DBF',
 15    'E:\ORADATA\XFF\USERS01.DBF',
 16    'E:\ORADATA\XFF\XFF_DATA01.DBF',
 17    'E:\ORADATA\XFF\XFF_INDEX01.DBF',
 18    'E:\ORADATA\XFF\UNDOTBS2.DBF'
 19  CHARACTER SET ZHS16GBK
 20  ;

控制文件已创建。

SQL> recover database;
完成介质恢复。
SQL> alter database open;

数据库已更改。

SQL> ALTER TABLESPACE TEMP ADD TEMPFILE 'E:\ORADATA\XFF\TEMP01.DBF' REUSE;

表空间已更改。

至此本次恢复晚上,由于arch,redo和数据文件没有损坏,恢复非常完美,参考以前类似说明:ORA-600 2131故障说明

分布式存储故障导致数据库无法启动故障处理

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:分布式存储故障导致数据库无法启动故障处理

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

国内xx医院使用了国外医疗行业龙头的pacs系统,由于是一个历史库,存放在分布式存储中,由于存储同时多个节点故障,导致数据库多个文件异常,数据库无法启动,三方维护人员尝试通通过rman归档进行应用日志,结果发现日志有损坏报ORA-00354 ORA-00353,无法记录恢复,希望我们给予支持
ORA-00353

Mon Apr 29 13:28:40 2024
Media Recovery failed with error 354
Mon Apr 29 13:28:40 2024
Errors in file F:\XXXXXX_DB\ORACLE\ADMIN\diag\rdbms\xxx\msxxx1\trace\xxx_pr00_4568.trc:
ORA-00283: recovery session canceled due to errors
ORA-00354: corrupt redo log block header
ORA-00353: log corruption near block 487424 change 8737273868 time 04/01/2024 01:38:25
ORA-00334: archived log: 'F:\XXXXXX_DB\ORADATA\XXX\LOGARC0000052184_0922116268.0001'
ORA-283 signalled during: alter database recover logfile 'F:\XXXXXX_DB\ORADATA\XXX\LOGARC0000052184_0922116268.0001'...

接手故障之后,通过尝试恢复发现除该错误之外,还有ORA-600 4552之类错误
ORA-600-4552


跳过这些异常文件恢复,最终确认异常文件有如下部分
20240511223413

这些文件由于日志无法正常应用(有日志损坏无法应用,有日志和数据文件block不匹配导致无法应用),这样的情况直接通过自研的Oracle Recovery Tools小工具直接修改文件头信息
orarecovery

然后尝试OPEN数据库结果报ORA-1207
ORA-1207

对于这个故障可以通过rectl或者using backup ctl方式处理,然后open数据库成功
20240510224845

由于该系统是历史库,不会有新业务写入,通过对异常表和索引进行处理之后,客户测试业务可以正常访问,完成本次恢复

存储故障后oracle报—ORA-01122/ORA-01207故障处理

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:存储故障后oracle报—ORA-01122/ORA-01207故障处理

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

客户存储异常,通过硬件恢复解决存储故障之后,oracle数据库无法正常启动(存储cache丢失),尝试recover数据库报ORA-00283 ORA-01122 ORA-01110 ORA-01207错误
以前处理过比较类似的存储故障case:
又一起存储故障导致ORA-00333 ORA-00312恢复
存储故障,强制拉库报ORA-600 kcbzib_kcrsds_1处理

SQL> recover database until cancel;
ORA-00283: 恢复会话因错误而取消
ORA-01122: 数据库文件 536 验证失败
ORA-01110: 数据文件 536: '+DATA/orcl/dt_img_dat511.ora'
ORA-01207: 文件比控制文件更新 - 旧的控制文件
Sun May 05 00:09:03 2024
ALTER DATABASE RECOVER  database until cancel  
Media Recovery Start
 started logmerger process
Sun May 05 00:09:10 2024
SUCCESS: diskgroup FRA was mounted
Sun May 05 00:09:10 2024
NOTE: dependency between database orcl and diskgroup resource ora.FRA.dg is established
Sun May 05 00:09:14 2024
WARNING! Recovering data file 1 from a fuzzy backup. It might be an online
backup taken without entering the begin backup command.
Media Recovery failed with error 1122
Slave exiting with ORA-283 exception
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl1\trace\orcl1_pr00_8048.trc:
ORA-00283: 恢复会话因错误而取消
ORA-01122: 数据库文件 536 验证失败
ORA-01110: 数据文件 536: '+DATA/orcl/dt_img_dat511.ora'
ORA-01207: 文件比控制文件更新 - 旧的控制文件
Sun May 05 00:09:16 2024
Recovery Slave PR00 previously exited with exception 283
ORA-283 signalled during: ALTER DATABASE RECOVER  database until cancel  ...

using backup controlfile进行恢复

SQL> recover database using backup controlfile until cancel;
ORA-00279: 更改 18646239951 (在 04/25/2024 17:14:50 生成) 对于线程 1 是必需的
ORA-00289: 建议:
+FRA/orcl/archivelog/2024_04_25/thread_1_seq_1003886.199435.1167240505
ORA-00280: 更改 18646239951 (用于线程 1) 在序列 #1003886 中


指定日志: {<RET>=suggested | filename | AUTO | CANCEL}
auto
ORA-00279: 更改 18646239951 (在 04/25/2024 17:11:40 生成) 对于线程 2 是必需的
ORA-00289: 建议:
+FRA/orcl/archivelog/2024_04_25/thread_2_seq_677876.199531.1167239807
ORA-00280: 更改 18646239951 (用于线程 2) 在序列 #677876 中


ORA-00279: 更改 18646255791 (在 04/25/2024 17:16:46 生成) 对于线程 2 是必需的
ORA-00289: 建议:
+FRA/orcl/archivelog/2024_04_25/thread_2_seq_677877.199472.1167240099
ORA-00280: 更改 18646255791 (用于线程 2) 在序列 #677877 中
ORA-00278: 此恢复不再需要日志文件
'+FRA/orcl/archivelog/2024_04_25/thread_2_seq_677876.199531.1167239807'


ORA-00279: 更改 18646295647 (在 04/25/2024 17:21:38 生成) 对于线程 2 是必需的
ORA-00289: 建议:
+FRA/orcl/archivelog/2024_04_25/thread_2_seq_677878.199379.1167240623
ORA-00280: 更改 18646295647 (用于线程 2) 在序列 #677878 中
ORA-00278: 此恢复不再需要日志文件
'+FRA/orcl/archivelog/2024_04_25/thread_2_seq_677877.199472.1167240099'


ORA-00279: 更改 18646331784 (在 04/25/2024 17:28:25 生成) 对于线程 1 是必需的
ORA-00289: 建议:
+FRA/orcl/archivelog/2024_04_25/thread_1_seq_1003887.199320.1167241507
ORA-00280: 更改 18646331784 (用于线程 1) 在序列 #1003887 中
ORA-00278: 此恢复不再需要日志文件
'+FRA/orcl/archivelog/2024_04_25/thread_1_seq_1003886.199435.1167240505'


ORA-00308: 无法打开归档日志
'+FRA/orcl/archivelog/2024_04_25/thread_1_seq_1003887.199320.1167241507'
ORA-17503: ksfdopn: 2 未能打开文件
+FRA/orcl/archivelog/2024_04_25/thread_1_seq_1003887.199320.1167241507
ORA-15012: ASM file
'+FRA/orcl/archivelog/2024_04_25/thread_1_seq_1003887.199320.1167241507' does not exist


ORA-10879: error signaled in parallel recovery slave
ORA-01547: 警告: RECOVER 成功但 OPEN RESETLOGS 将出现如下错误
ORA-01194: 文件 1 需要更多的恢复来保持一致性
ORA-01110: 数据文件 1: '+DATA/orcl/system01.dbf'

通过分析,确认由于cache丢失导致thread_1_seq_1003887这个日志丢失(而且redo已经被覆盖)
20240506-2


20240506-1

数据库无法通过正常recover的思路解决.通过屏蔽一致性,强制打开数据库,alert日志报ORA-600 2662错误

Sat May 04 17:23:00 2024
Redo thread 2 internally disabled at seq 1 (CKPT)
ARC1: Archiving disabled thread 2 sequence 1
Archived Log entry 2 added for thread 2 sequence 1 ID 0x0 dest 1:
ARC3: Archival started
ARC0: STARTING ARCH PROCESSES COMPLETE
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl1\trace\orcl1_ora_3684.trc  (incident=47066):
ORA-00600: ??????, ??: [2662], [4], [1466533588], [4], [1466584862], [12583040], [], [], [], [], [], []
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl1\trace\orcl1_ora_3684.trc:
ORA-00600: ??????, ??: [2662], [4], [1466533588], [4], [1466584862], [12583040], [], [], [], [], [], []
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl1\trace\orcl1_ora_3684.trc:
ORA-00600: ??????, ??: [2662], [4], [1466533588], [4], [1466584862], [12583040], [], [], [], [], [], []
Error 600 happened during db open, shutting down database
USER (ospid: 3684): terminating the instance due to error 600
Instance terminated by USER, pid = 3684
ORA-1092 signalled during: alter database open resetlogs...

通过修改数据库scn,open数据库报ORA-600 4137

Sun May 05 00:12:41 2024
replication_dependency_tracking turned off (no async multimaster replication found)
LOGSTDBY: Validating controlfile with logical metadata
LOGSTDBY: Validation complete
Completed: alter database open resetlogs
Sun May 05 00:12:56 2024
Trace dumping is performing id=[cdmp_20240505001256]
Sun May 05 00:12:56 2024
ORACLE Instance orcl1 (pid = 22) - Error 600 encountered while recovering transaction (28, 21).
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl1\trace\orcl1_smon_5896.trc:
ORA-00600: ??????, ??: [4137], [28.21.42965783], [0], [0], [], [], [], [], [], [], [], []

这个错误比较明显,由于28号回滚段异常导致,对异常回滚段进行处理,重建undo,数据库恢复主要工作完成

Oracle 23ai rm redo*.log恢复

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:Oracle 23ai rm redo*.log恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

在oracle 23ai的pdb中创建用户和表,并且插入数据(不提交),在另外一个会话中abort库,并从os层面rm删除掉redo文件,模拟数据库当前redo丢失,数据库恢复
创建用户和表并插入数据

[oracle@xifenfei ~]$ sqlplus sys/oracle@free as sysdba

SQL*Plus: Release 23.0.0.0.0 - Production on Fri May 3 15:40:55 2024
Version 23.4.0.24.05

Copyright (c) 1982, 2024, Oracle.  All rights reserved.


Connected to:
Oracle Database 23ai Free Release 23.0.0.0.0 - Develop, Learn, and Run for Free
Version 23.4.0.24.05

SQL> 
SQL> 
SQL> show pdbs;

    CON_ID CON_NAME                       OPEN MODE  RESTRICTED
---------- ------------------------------ ---------- ----------
         2 PDB$SEED                       READ ONLY  NO
         3 FREEPDB1                       READ WRITE NO

SQL> alter session set container=FREEPDB1;

Session altered.

SQL> create user xff identified by oracle;

User created.

SQL> grant dba to xff ;

Grant succeeded.

SQL> conn xff/oracle@FREEPDB1
Connected.
SQL> create table t1 as select * from dba_objects;

Table created.


SQL> insert into t1 select *from t1;

75877 rows created.

SQL> /

151754 rows created.

另外一个会话中abort库

[oracle@xifenfei ~]$ sqlplus sys/oracle@free as sysdba

SQL*Plus: Release 23.0.0.0.0 - Production on Fri May 3 15:43:30 2024
Version 23.4.0.24.05

Copyright (c) 1982, 2024, Oracle.  All rights reserved.


Connected to:
Oracle Database 23ai Free Release 23.0.0.0.0 - Develop, Learn, and Run for Free
Version 23.4.0.24.05

SQL> shutdown abort;
ORACLE instance shut down.
SQL> 
SQL> 
SQL> 
SQL> exit
Disconnected from Oracle Database 23ai Free Release 23.0.0.0.0 - Develop, Learn, and Run for Free
Version 23.4.0.24.05

操作系统层面rm -rf 删除redo

[oracle@xifenfei ~]$ cd $ORACLE_BASE/oradata
[oracle@xifenfei oradata]$ ls
FREE
[oracle@xifenfei oradata]$ cd FREE/
[oracle@192 FREE]$ ls
control01.ctl  FREEPDB1  redo01.log  redo03.log    system01.dbf  undotbs2.dbf
control02.ctl  pdbseed   redo02.log  sysaux01.dbf  temp01.dbf    users01.dbf
[oracle@xifenfei FREE]$ ls -ltr
total 2441036
drwxr-x---. 2 oracle     1000         85 May  1 16:49 pdbseed
-rw-r-----. 1 oracle oinstall   20979712 May  1 16:51 temp01.dbf
drwxr-x---. 2 oracle     1000        104 May  1 16:55 FREEPDB1
-rw-r-----. 1 oracle oinstall  209715712 May  3 15:23 redo01.log
-rw-r-----. 1 oracle oinstall  209715712 May  3 15:23 redo02.log
-rw-r-----. 1 oracle oinstall    7348224 May  3 15:23 users01.dbf
-rw-r-----. 1 oracle oinstall 1080041472 May  3 15:43 system01.dbf
-rw-r-----. 1 oracle oinstall  692068352 May  3 15:43 sysaux01.dbf
-rw-rw----. 1 oracle oinstall   52436992 May  3 15:43 undotbs2.dbf
-rw-r-----. 1 oracle oinstall  209715712 May  3 15:43 redo03.log
-rw-r-----. 1 oracle oinstall   18759680 May  3 15:43 control01.ctl
-rw-r-----. 1 oracle oinstall   18759680 May  3 15:43 control02.ctl
[oracle@xifenfei FREE]$ rm -rf redo0*
[oracle@192 FREE]$ ls -l redo*
ls: cannot access 'redo*': No such file or directory
[oracle@xifenfei FREE]$ 

尝试启动数据库,报ora-00313,ora-00312,ora-27037等错误

[oracle@xifenfei ~]$ sqlplus sys/oracle@free as sysdba

SQL*Plus: Release 23.0.0.0.0 - Production on Fri May 3 15:44:17 2024
Version 23.4.0.24.05

Copyright (c) 1982, 2024, Oracle.  All rights reserved.

Connected to an idle instance.

SQL> startup mount;
ORACLE instance started.

Total System Global Area 1603726344 bytes
Fixed Size                  5360648 bytes
Variable Size             671088640 bytes
Database Buffers          922746880 bytes
Redo Buffers                4530176 bytes
Database mounted.
SQL> recover database;
ORA-00283: recovery session canceled due to errors
ORA-00313: open failed for members of log group 3 of thread 1
ORA-00312: online log 3 thread 1: '/opt/oracle/oradata/FREE/redo03.log'
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory
Additional information: 7

由于所有redo均被删除(包含当前redo),因此只能强制resetlogs方式打开库,尝试打开数据库

SQL> shutdown immediate;
ORA-01109: database not open

Database dismounted.
ORACLE instance shut down.
SQL> startup mount pfile='/tmp/pfile'
ORACLE instance started.

Total System Global Area 1603726344 bytes
Fixed Size                  5360648 bytes
Variable Size             671088640 bytes
Database Buffers          922746880 bytes
Redo Buffers                4530176 bytes
Database mounted.
SQL> recover database until cancel;
ORA-00279: change 4244588 generated at 05/03/2024 15:23:22 needed for thread 1
ORA-00289: suggestion :
/opt/oracle/product/23ai/dbhomeFree/dbs/arch1_6_1167842962.dbf
ORA-00280: change 4244588 for thread 1 is in sequence #6


Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
cancel
ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1: '/opt/oracle/oradata/FREE/system01.dbf'


ORA-01112: media recovery not started

SQL> alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-00603: ORACLE server session terminated by irrecoverable error
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [kcbzib_kcrsds_1], [], [], [], [],
[], [], [], [], [], [], []
Process ID: 5596
Session ID: 29 Serial number: 63204

数据库遇到常见的ORA-600 kcbzib_kcrsds_1错误,这类错误类似oracle在12c之前版本中的ORA-600 2662错误
ORA-600 kcbzib_kcrsds_1报错
存储故障,强制拉库报ORA-600 kcbzib_kcrsds_1处理
redo异常强制拉库报ORA-600 kcbzib_kcrsds_1修复
ORA-00603 ORA-01092 ORA-600 kcbzib_kcrsds_1
这种一般就是scn问题,通过修改数据库scn,数据库启动报ORA-16433错误

[oracle@xifenfei ~]$ sqlplus sys/oracle@free as sysdba

SQL*Plus: Release 23.0.0.0.0 - Production on Fri May 3 15:49:00 2024
Version 23.4.0.24.05

Copyright (c) 1982, 2024, Oracle.  All rights reserved.

Connected to an idle instance.

SQL> startup nomount pfile='/tmp/pfile';
ORACLE instance started.

Total System Global Area 1603726344 bytes
Fixed Size                  5360648 bytes
Variable Size             671088640 bytes
Database Buffers          922746880 bytes
Redo Buffers                4530176 bytes
SQL> alter database mount;

Database altered.

SQL> recover database;
ORA-00283: recovery session canceled due to errors
ORA-16433: The database or pluggable database must be opened in read/write mode.


SQL> shutdown abort;
ORACLE instance shut down.

处理ctl,open数据库成功

SQL> startup nomount pfile='/tmp/pfile'
ORACLE instance started.

Total System Global Area 1603726344 bytes
Fixed Size                  5360648 bytes
Variable Size             671088640 bytes
Database Buffers          922746880 bytes
Redo Buffers                4530176 bytes
SQL> CREATE CONTROLFILE REUSE DATABASE "FREE" NORESETLOGS  NOARCHIVELOG
  2      MAXLOGFILES 16
  3      MAXLOGMEMBERS 3
  4      MAXDATAFILES 1024
  5      MAXINSTANCES 8
  6      MAXLOGHISTORY 292
  7  LOGFILE
  8    GROUP 1 '/opt/oracle/oradata/FREE/redo01.log'  SIZE 200M BLOCKSIZE 512,
  9    GROUP 2 '/opt/oracle/oradata/FREE/redo02.log'  SIZE 200M BLOCKSIZE 512,
 10    GROUP 3 '/opt/oracle/oradata/FREE/redo03.log'  SIZE 200M BLOCKSIZE 512
 11  -- STANDBY LOGFILE
 12  DATAFILE
 13    '/opt/oracle/oradata/FREE/system01.dbf',
 14    '/opt/oracle/oradata/FREE/pdbseed/system01.dbf',
 15    '/opt/oracle/oradata/FREE/sysaux01.dbf',
 16    '/opt/oracle/oradata/FREE/pdbseed/sysaux01.dbf',
 17    '/opt/oracle/oradata/FREE/users01.dbf',
 18    '/opt/oracle/oradata/FREE/pdbseed/undotbs01.dbf',
 19    '/opt/oracle/oradata/FREE/FREEPDB1/system01.dbf',
 20    '/opt/oracle/oradata/FREE/FREEPDB1/sysaux01.dbf',
 21    '/opt/oracle/oradata/FREE/FREEPDB1/undotbs01.dbf',
 22    '/opt/oracle/oradata/FREE/FREEPDB1/users01.dbf',
 23    '/opt/oracle/oradata/FREE/undotbs2.dbf'
 24  CHARACTER SET AL32UTF8
 25  ;

Control file created.

SQL> recover database;
Media recovery complete.
SQL> alter database open;

Database altered.

SQL> show pdbs;

          CON_ID CON_NAME                       OPEN MODE  RESTRICTED
---------------- ------------------------------ ---------- ----------
               2 PDB$SEED                       READ ONLY  NO
               3 FREEPDB1                       READ WRITE NO

至此当前数据库redo故障模拟和恢复基本完成