CDM备份缺少归档打开数据库报ORA-600 kcbzib_kcrsds_1故障处理

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:CDM备份缺少归档打开数据库报ORA-600 kcbzib_kcrsds_1故障处理

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

有客户联系我们,说一个19c的库,由于产生归档较多在备份过程中把部分归档删除而导致没有备份成功,从而使得他们那边一个误操作需要恢复使用备份做不完全恢复,备份厂商进行恢复并尝试强制打开库报ORA-600 kcbzib_kcrsds_1错误
ORA-600-kcbzibz_kcrsds_1


由于客户那边使用的是CDM方式备份,可以较快的准备出来一个新环境,观察客户在应用日志过程中,文件头的scn一直不变,怀疑文件头由于begin backup冻结,对其进行dump,发现确实做了hot backup操作,而且没有end backup
scn
begin-backup

由于缺少归档,不完全恢复没有成功,begin backup 也无法正常结束,对于这种情况,先尝试调整到文件头最大的scn值,尝试打开库

SQL> alter database open resetlogs ;
alter database open resetlogs 
*
ERROR at line 1:
ORA-00603: ORACLE server session terminated by fatal error
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [kcbzib_kcrsds_1], [], [], [], [],
[], [], [], [], [], [], []
Process ID: 3949615
Session ID: 5111 Serial number: 30040


--重启到mount状态
SQL> set numw 16
col CHECKPOINT_TIME for a40
set lines 150
set pages 1000
SELECT status,
to_char(checkpoint_time,'yyyy-mm-dd hh24:mi:ss') checkpoint_time,FUZZY,checkpoint_change#,
count(*) ROW_NUM
FROM v$datafile_header
GROUP BY status, checkpoint_change#, to_char(checkpoint_time,'yyyy-mm-dd hh24:mi:ss'),fuzzy
ORDER BY status, checkpoint_change#, checkpoint_time;SQL> SQL> SQL> SQL>   2    3    4    5    6  

STATUS  CHECKPOINT_TIME                          FUZ CHECKPOINT_CHANGE#          ROW_NUM
------- ---------------------------------------- --- ------------------ ----------------
ONLINE  2025-02-08 22:43:01                      YES     15626238353558               56

打开库失败,只能找出来数据库最大的block中最大scn,然后调整文件头scn的值,实现数据库open

SQL> oradebug setmypid
Statement processed.
SQL> oradebug DUMPvar SGA kcsgscn_
SQL> kscn8 kcsgscn_ [060017E98, 060017EA0) = 00000000 00000000
SQL> oradebug DUMPvar SGA kcsgscn_
kscn8 kcsgscn_ [060017E98, 060017EA0) = 8CD9C896 00000E4D
SQL> 
SQL> 
SQL> 
SQL> alter database open ;
alter database open 
*
ERROR at line 1:
ORA-01113: file 1 needs media recovery
ORA-01110: data file 1: '/cdmbak/db/xifenfei/ob_data_D-XIFENFEI-SYSTEM_FNO-1_t13930nd'

SQL> recover database;
Media recovery complete.
SQL> alter database open;

Database altered.

SQL> 

使用exp导出客户需要的表,完成本次恢复任务
对于ORA-600 kcbzib_kcrsds_1恢复的情况,以前有过大量恢复案例,修改数据库scn即可
kcbzib_kcrsds_1报错汇总
12C数据库报ORA-600 kcbzib_kcrsds_1故障处理
存储故障,强制拉库报ORA-600 kcbzib_kcrsds_1处理
Patch SCN工具一键恢复ORA-600 kcbzib_kcrsds_1
此类故障处理太多,不一一列举,解决这个错误之后,数据库open成功,然后安排逻辑迁移即可

ORA-07445: exception encountered: core dump [expgod()+43] [IN_PAGE_ERROR]

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:ORA-07445: exception encountered: core dump [expgod()+43] [IN_PAGE_ERROR]

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

数据库在运行过程中报O/S-Error: (OS 23) 数据错误(循环冗余检查)错误

Thu Jan 30 22:00:02 2025
Begin automatic SQL Tuning Advisor run for special tuning task  "SYS_AUTO_SQL_TUNING_TASK"
Thu Jan 30 22:00:04 2025
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_j000_12576.trc:
ORA-12012: error on auto execute of job 155962
ORA-01115: IO error reading block from file  (block # )
ORA-01110: data file 1: 'D:\APP\ADMINISTRATOR\ORADATA\ORCL\SYSTEM01.DBF'
ORA-27070: async read/write failed
OSD-04006: ReadFile() 失败, 无法读取文件
O/S-Error: (OS 23) 数据错误(循环冗余检查)。
ORA-06512: at "SYS.DBMS_STATS", line 25836
ORA-06512: at "SYS.DBMS_STATS", line 26171
End automatic SQL Tuning Advisor run for special tuning task  "SYS_AUTO_SQL_TUNING_TASK"
Fri Jan 31 02:00:00 2025
Clearing Resource Manager plan via parameter
Fri Jan 31 08:15:46 2025
Thread 1 advanced to log sequence 4420 (LGWR switch)
  Current log# 1 seq# 4420 mem# 0: D:\APP\ADMINISTRATOR\ORADATA\ORCL\REDO01.LOG
Fri Jan 31 10:53:57 2025
Process J000 died, see its trace file
kkjcre1p: unable to spawn jobq slave process 
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_cjq0_1140.trc:
Fri Jan 31 10:53:57 2025
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_j000_7916.trc:
ORA-27102: out of memory
OSD-00043: 附加错误信息
O/S-Error: (OS 1455) 页面文件太小,无法完成操作。
Process J000 died, see its trace file
kkjcre1p: unable to spawn jobq slave process 
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_cjq0_1140.trc:
Fri Jan 31 10:54:03 2025
Exception [type: ACCESS_VIOLATION, UNABLE_TO_READ] [ADDR:0x18] [PC:0xB778B02, clsdcxini()+90]
ERROR: Unable to normalize symbol name for the following short stack (at offset 199):
dbgexProcessError()+193<-dbgeExecuteForError()+65<-dbgePostErrorKGE()+1726<-dbkePostKGE_kgsf()+75
<-kgeade()+560<-kgerev()+125<-kgerec5()+60<-sss_xcpt_EvalFilterEx()+1869<-sss_xcpt_EvalFilter()+174
<-.1.4_5+59<-00007FFD0245F306<-00007FFD024735AF<-00007FFD023D4AAF<-00007FFD0247231E<-clsdcxini()+90
<-clsdinit()+124<-ksdnfy()+225<-kscnfy()+778<-opirip()+86<-opidrv()+909<-sou2o()+98<-opimai_real()+299
<-opimai()+191<-BackgroundThreadStart()+693<-00007FFD020E7E94<-00007FFD02437AD1
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_j000_9920.trc  (incident=39621):
ORA-07445: exception encountered: core dump [clsdcxini()+90][ACCESS_VIOLATION][ADDR:0x18][PC:0xB778B02][UNABLE_TO_READ]
Incident details in: d:\app\administrator\diag\rdbms\orcl\orcl\incident\incdir_39621\orcl_j000_9920_i39621.trc

然后无法正常启动,报Exception [type: IN_PAGE_ERROR, ] [] [PC:0x2C9C015, expgod()+43]错误

Wed Feb 05 09:43:51 2025
Sweep [inc][39621]: completed
Successful mount of redo thread 1, with mount id 1720066005
Database mounted in Exclusive Mode
Lost write protection disabled
Completed: alter database mount exclusive
alter database open
Beginning crash recovery of 1 threads
 parallel recovery started with 3 processes
Started redo scan
Completed redo scan
 read 140 KB redo, 62 data blocks need recovery
Started redo application at
 Thread 1: logseq 4420, block 42375
Wed Feb 05 09:44:00 2025
Recovery of Online Redo Log: Thread 1 Group 1 Seq 4420 Reading mem 0
  Mem# 0: D:\APP\ADMINISTRATOR\ORADATA\ORCL\REDO01.LOG
Completed redo application of 0.09MB
Completed crash recovery at
 Thread 1: logseq 4420, block 42656, scn 94456019
 62 data blocks read, 62 data blocks written, 140 redo k-bytes read
Wed Feb 05 09:44:01 2025
Exception [type: IN_PAGE_ERROR, ] [] [PC:0x2C9C015, expgod()+43]
ERROR: Unable to normalize symbol name for the following short stack (at offset 199):
dbgexProcessError()+193<-dbgeExecuteForError()+65<-dbgePostErrorKGE()+1726
<-dbkePostKGE_kgsf()+75<-kgeade()+560<-kgerev()+125<-kgerec5()+60<-sss_xcpt_EvalFilterEx()+1869
<-sss_xcpt_EvalFilter()+174<-.1.4_5+59<-00007FFD0245F306<-00007FFD024735AF<-00007FFD023D4AAF
<-00007FFD0247231E<-expgod()+43<-xtyopncb()+241<-qctcopn()+613<-qctcopn()+392<-qctcpqb()+290
<-qctcpqbl()+52<-xtydrv()+148<-opitca()+1091<-kksLoadChild()+9008<-kxsGetRuntimeLock()+2320
<-kksfbc()+15225<-kkspsc0()+2117<-kksParseCursor()+181<-opiosq0()+2538<-opiosq()+23<-opiodr()+1662
<-rpidrus()+862<-rpidru()+154<-rpiswu2()+2757<-rpidrv()+6105<-rpisplu()+1607<-kqldFixedTableLoadCols()+345
<-kqldcor()+2534<-kglslod()+352<-kqlslod()+52<-PGOSF455_kqlsublod()+125<-kqllod()+7284<-kglobld()+1354
<-kglobpn()+1900<-kglpim()+336<-qcdlgtd()+260<-qcsfplob()+166<-qcsprfro()+903<-qcsprfro_tree()
+292<-qcsprfro_tree()+373<-qcspafq()+96
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_mmon_15468.trc  (incident=39749):
ORA-07445: exception encountered: core dump [expgod()+43] [IN_PAGE_ERROR] [] [PC:0x2C9C015] [] []
Incident details in: d:\app\administrator\diag\rdbms\orcl\orcl\incident\incdir_39749\orcl_mmon_15468_i39749.trc
Wed Feb 05 09:44:02 2025
Thread 1 advanced to log sequence 4421 (thread open)
Thread 1 opened at log sequence 4421
  Current log# 2 seq# 4421 mem# 0: D:\APP\ADMINISTRATOR\ORADATA\ORCL\REDO02.LOG
Successful open of redo thread 1
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Wed Feb 05 09:44:02 2025
SMON: enabling cache recovery
Wed Feb 05 09:44:11 2025
Exception [type: IN_PAGE_ERROR, ] [] [PC:0x2C9C015, expgod()+43]
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_ora_9308.trc  (incident=39781):
ORA-07445: ??????: ???? [expgod()+43] [IN_PAGE_ERROR] [] [PC:0x2C9C015] [] []
Incident details in: d:\app\administrator\diag\rdbms\orcl\orcl\incident\incdir_39781\orcl_ora_9308_i39781.trc
Wed Feb 05 09:44:19 2025
PMON (ospid: 12376): terminating the instance due to error 397
Instance terminated by PMON, pid = 12376

基于上述的Exception [type: IN_PAGE_ERROR, ] [] [PC:0x2C9C015, expgod()+43]错误,第一反应就是可能由于底层损坏导致数据块损坏,dbv检查文件是否报错
dbv-system


检查系统日志确认异常
20250208203059

尝试拷贝文件也报错
QQ20250208-203152

已经比较明确由于底层问题,解决给问题之前,需要先对文件系统进行处理,然后再对恢复出来的数据文件恢复数据

2025年第一起ORA-600 16703故障恢复

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:2025年第一起ORA-600 16703故障恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

又有一个客户数据库启动报ORA-600 16703错误
ora-600-16703


查看alert日志信息

Successful open of redo thread 1
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
SMON: enabling cache recovery
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\xff\xff\trace\xff_ora_8784.trc  (incident=20617):
ORA-00600: 内部错误代码, 参数: [16703], [1403], [20], [], [], [], [], [], [], [], [], []
Incident details in: D:\APP\ADMINISTRATOR\diag\rdbms\xff\xff\incident\incdir_20617\xff_ora_8784_i20617.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\xff\xff\trace\xff_ora_8784.trc:
ORA-00704: 引导程序进程失败
ORA-00704: 引导程序进程失败
ORA-00600: 内部错误代码, 参数: [16703], [1403], [20], [], [], [], [], [], [], [], [], []
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\xff\xff\trace\xff_ora_8784.trc:
ORA-00704: 引导程序进程失败
ORA-00704: 引导程序进程失败
ORA-00600: 内部错误代码, 参数: [16703], [1403], [20], [], [], [], [], [], [], [], [], []
Error 704 happened during db open, shutting down database
USER (ospid: 8784): terminating the instance due to error 704
Instance terminated by USER, pid = 8784
ORA-1092 signalled during: ALTER DATABASE OPEN...
opiodr aborting process unknown ospid (8784) as a result of ORA-1092

这个故障一般是由于oracle安装介质被注入了恶意脚本,在数据库运行300天之后重启数据库就会遭遇该问题:
警告:互联网中有oracle介质被注入恶意程序导致—ORA-600 16703
对于这个问题,我们这边经过多年的恢复经验积累和研究,可以实现直接open库并且不用做逻辑迁移,完美恢复之后业务直接使用,最大限度恢复数据和最小限度减少停机时间
以往类似恢复case汇总:
tab$恢复错误汇总
ORA-600 16703故障再现
tab$异常被处理之后报ORA-600 13304故障处理
ORA-600 16703直接把orachk备份表插入到tab$恢复
最近遇到几起ORA-600 16703故障(tab$被清空),请引起重视
ORA-00600: internal error code, arguments: [16703], [1403], [32]
aix平台tab$被删除可能出现ORA-600 [16703], [1403], [28]错误
ORA-00600: internal error code, arguments: [16703], [1403], [4] 原因
ORA-00600: internal error code, arguments: [16703], [1403], [4] 故障处理

ORA-600 16703故障,客户找人恢复数据库,数据库被进一步恶意破坏—ORA-00704 ORA-00922

ORA-600 ktuPopDictI_1恢复

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:ORA-600 ktuPopDictI_1恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

数据库启动报ORA-00600: internal error code, arguments: [ktuPopDictI_1], [4]错误

[oracle@ora19c:/home/oracle]$ sqlplus / as sysdba

SQL*Plus: Release 19.0.0.0.0 - Production on Mon Jan 20 21:42:28 2025
Version 19.24.0.0.0

Copyright (c) 1982, 2024, Oracle.  All rights reserved.

Connected to an idle instance.

sys@ORA19C 21:38:22> startup
ORACLE instance started.

Total System Global Area  763359928 bytes
Fixed Size                  9183928 bytes
Variable Size             457179136 bytes
Database Buffers          289406976 bytes
Redo Buffers                7589888 bytes
Database mounted.
ORA-00603: ORACLE server session terminated by fatal error
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [ktuPopDictI_1], [4], [], [], [], [], [], [], [], [], [], []
Process ID: 3254475
Session ID: 410 Serial number: 22754

数据库alert日志报错

2025-01-20T21:38:30.411924+08:00
ALTER DATABASE OPEN
2025-01-20T21:38:30.437769+08:00
Smart fusion block transfer is disabled:
  instance mounted in exclusive mode.
2025-01-20T21:38:30.445071+08:00
Crash Recovery excluding pdb 2 which was cleanly closed.
2025-01-20T21:38:30.445125+08:00
Crash Recovery excluding pdb 3 which was cleanly closed.
2025-01-20T21:38:30.445172+08:00
Crash Recovery excluding pdb 4 which was cleanly closed.
Endian type of dictionary set to little
2025-01-20T21:38:30.459107+08:00
LGWR (PID:3254425): STARTING ARCH PROCESSES
2025-01-20T21:38:30.466458+08:00
TT00 (PID:3254477): Gap Manager starting
Starting background process ARC0
2025-01-20T21:38:30.474126+08:00
ARC0 started with pid=39, OS id=3254479 
2025-01-20T21:38:30.484228+08:00
LGWR (PID:3254425): ARC0: Archival started
LGWR (PID:3254425): STARTING ARCH PROCESSES COMPLETE
2025-01-20T21:38:30.484325+08:00
ARC0 (PID:3254479): Becoming a 'no FAL' ARCH
ARC0 (PID:3254479): Becoming the 'no SRL' ARCH
2025-01-20T21:38:30.495886+08:00
Redo log for group 3, sequence 447 is not located on DAX storage
Thread 1 opened at log sequence 447
  Current log# 3 seq# 447 mem# 0: /data/oradata/ORA19C/redo03.log
Successful open of redo thread 1
2025-01-20T21:38:30.512816+08:00
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Stopping change tracking
Undo initialization recovery: Parallel FPTR complete: start:2947972792 end:2947972831 diff:39 ms (0.0 seconds)
Undo initialization recovery: err:0 start: 2947972791 end: 2947972831 diff: 40 ms (0.0 seconds)
[3254475] Successfully onlined Undo Tablespace 2.
Undo initialization online undo segments: err:0 start: 2947972831 end: 2947972933 diff: 102 ms (0.1 seconds)
ORA-00600: internal error code, arguments: [ktuPopDictI_1], [4], [], [], [], [], [], [], [], [], [], []
Incident details in: /data/app/oracle/diag/rdbms/ora19c/ora19c/incident/incdir_38705/ora19c_ora_3254475_i38705.trc
2025-01-20T21:38:31.482586+08:00
TMON (PID:3254467): STARTING ARCH PROCESSES
Starting background process ARC1
2025-01-20T21:38:31.491744+08:00
ARC1 started with pid=41, OS id=3254483 
Starting background process ARC2
2025-01-20T21:38:31.500274+08:00
ARC2 started with pid=42, OS id=3254485 
Starting background process ARC3
2025-01-20T21:38:31.508426+08:00
ARC3 started with pid=43, OS id=3254487 
TMON (PID:3254467): ARC1: Archival started
TMON (PID:3254467): ARC2: Archival started
TMON (PID:3254467): ARC3: Archival started
TMON (PID:3254467): STARTING ARCH PROCESSES COMPLETE
2025-01-20T21:38:31.715480+08:00
*****************************************************************
An internal routine has requested a dump of selected redo.
This usually happens following a specific internal error, when
analysis of the redo logs will help Oracle Support with the
diagnosis.
It is recommended that you retain all the redo logs generated (by
all the instances) during the past 12 hours, in case additional
redo dumps are required to help with the diagnosis.
*****************************************************************
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
2025-01-20T21:38:31.798619+08:00
Errors in file /data/app/oracle/diag/rdbms/ora19c/ora19c/trace/ora19c_ora_3254475.trc:
ORA-00600: internal error code, arguments: [ktuPopDictI_1], [4], [], [], [], [], [], [], [], [], [], []
2025-01-20T21:38:31.798666+08:00
Errors in file /data/app/oracle/diag/rdbms/ora19c/ora19c/trace/ora19c_ora_3254475.trc:
ORA-00600: internal error code, arguments: [ktuPopDictI_1], [4], [], [], [], [], [], [], [], [], [], []
Error 600 happened during db open, shutting down database
ORA-00603: ORACLE server session terminated by fatal error
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [ktuPopDictI_1], [4], [], [], [], [], [], [], [], [], [], []
Incident details in: /data/app/oracle/diag/rdbms/ora19c/ora19c/incident/incdir_38706/ora19c_ora_3254475_i38706.trc
opiodr aborting process unknown ospid (3254475) as a result of ORA-603
2025-01-20T21:38:32.356148+08:00
ORA-603 : opitsk aborting process
License high water mark = 1
USER(prelim) (ospid: 3254475): terminating the instance due to ORA error 600

分析trace信息

[TOC00001]
ORA-00603: ORACLE server session terminated by fatal error
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [ktuPopDictI_1], [4], [], [], [], [], [], [], [], [], [], []

[TOC00001-END]
[TOC00002]
========= Dump for incident 38706 (ORA 603) ========

*** 2025-01-20T21:38:31.823904+08:00
dbkedDefDump(): Starting incident default dumps (flags=0x2, level=3, mask=0x0)
[TOC00003]
----- Current SQL Statement for this session (sql_id=1h50ks4ncswfn) -----
ALTER DATABASE OPEN
[TOC00003-END]

[TOC00004]
----- Call Stack Trace -----
calling              call     entry                argument values in hex
location             type     point                (? means dubious value)
-------------------- -------- -------------------- ----------------------------
ksedst1()+95         call     kgdsdst()            7FFEA4EB7360 000000002
                                                   7FFEA4EB15B0 ? 7FFEA4EB16C8 ?
                                                   000000000 000000000
ksedst()+58          call     ksedst1()            000000000 000000001
                                                   7FFEA4EB15B0 ? 7FFEA4EB16C8 ?
                                                   000000000 ? 000000000 ?
dbkedDefDump()+2434  call     ksedst()             000000000 000000001 ?
7                                                  7FFEA4EB15B0 ? 7FFEA4EB16C8 ?
                                                   000000000 ? 000000000 ?
ksedmp()+577         call     dbkedDefDump()       000000003 000000002
                                                   7FFEA4EB15B0 ? 7FFEA4EB16C8 ?
                                                   000000000 ? 000000000 ?
dbgexPhaseII()+2092  call     ksedmp()             0000003EB 000000002 ?
                                                   7FFEA4EB15B0 ? 7FFEA4EB16C8 ?
                                                   000000000 ? 000000000 ?
dbgexProcessError()  call     dbgexPhaseII()       7F2A059ED6D8 7F2A002BF148
+1871                                              7FFEA4EB8E00 7FFEA4EB16C8 ?
                                                   000000000 ? 000000000 ?
dbgePostErrorKGE()+  call     dbgexProcessError()  7F2A059ED6D8 7F2A002BF148
1851                                               000000001 000000000
                                                   000000000 ? 000000000 ?
dbkePostKGE_kgsf()+  call     dbgePostErrorKGE()   7F2A05A2D9C0 7F2A058D0050
71                                                 00000025B 000000000 ?
                                                   000000000 ? 000000000 ?
kgeade()+339         call     dbkePostKGE_kgsf()   7F2A05A2D9C0 7F2A058D0050
                                                   00000025B 000000000 ?
                                                   000000000 ? 000000000 ?
kgefecl()+184        call     kgeade()             7F2A05A2D9C0 ? 7F2A05A2DC08 ?
                                                   7F2A058D0050 ? 00000025B ?
                                                   000000000 000000000
adbdrv_options()+48  call     kgefecl()            7F2A05A2D9C0 7F2A058D0050
548                                                000000444 000000001 ?
                                                   014971F9C ? 014973858 ?
opiexe()+31984       call     adbdrv_options()     000000000 7F2A058D0050 ?
                                                   000000444 ? 000000001 ?
                                                   014971F9C ? 014973858 ?
opiosq0()+4560       call     opiexe()             000000004 7F2A058D0050 ?
                                                   7FFEA4EC16D0 000000001 ?
                                                   014971F9C ? 014973858 ?
kpooprx()+287        call     opiosq0()            000000003 7F2A058D0050 ?
                                                   7F2A05A2D9C0 ? 0000000A4
                                                   000000000 000000023
kpoal8()+838         call     kpooprx()            7FFEA4EC5824 7FFEA4EC2F40
                                                   000000013 000000001 000000000
                                                   0000000A4
opiodr()+1253        call     kpoal8()             00000005E 000000026
                                                   7FFEA4EC5820 000000001 ?
                                                   000000000 ? 0000000A4 ?
ttcpip()+1216        call     opiodr()             00000005E 000000026
                                                   7FFEA4EC5820 ? 000000000
                                                   000000000 ? 0000000A4 ?
opitsk()+1916        call     ttcpip()             7F2A05A57B30 ? 000000026 ?
                                                   7FFEA4EC5820 000000000 ?
                                                   7FFEA4EC5280 7FFEA4EC5A80 ?
opiino()+936         call     opitsk()             000000000 000000000
                                                   7FFEA4EC5820 ? 000000000 ?
                                                   7FFEA4EC5280 ? 7FFEA4EC5A80 ?
opiodr()+1253        call     opiino()             00000003C 000000004
                                                   7FFEA4EC7418 000000000 ?
                                                   7FFEA4EC5280 ? 7FFEA4EC5A80 ?
opidrv()+1067        call     opiodr()             00000003C 000000004
                                                   7FFEA4EC7418 ? 000000000
                                                   7FFEA4EC5280 ? 7FFEA4EC5A80 ?
sou2o()+165          call     opidrv()             00000003C 000000004
                                                   7FFEA4EC7418 000000000 ?
                                                   7FFEA4EC5280 ? 7FFEA4EC5A80 ?
opimai_real()+422    call     sou2o()              7FFEA4EC73F0 00000003C
                                                   000000004 7FFEA4EC7418
                                                   7FFEA4EC5280 ? 7FFEA4EC5A80 ?
ssthrdmain()+417     call     opimai_real()        000000000 7FFEA4EC7C08
                                                   000000004 ? 7FFEA4EC7418 ?
                                                   7FFEA4EC5280 ? 7FFEA4EC5A80 ?
main()+256           call     ssthrdmain()         000000000 000000002
                                                   7FFEA4EC7C08 000000001
                                                   000000000 7FFEA4EC5A80 ?
__libc_start_main()  call     main()               000000002 7FFEA4EC7E58
+243                                               7FFEA4EC7C08 ? 000000001 ?
                                                   000000000 ? 7FFEA4EC5A80 ?
_start()+46          call     __libc_start_main()  000DFEF50 000000002
                                                   7FFEA4EC7E58 00746DD60 ?
                                                   000000000 ? 7FFEA4EC5A80 ?
[TOC00004-END]


[TOC00005]
--------------------- Binary Stack Dump ---------------------

对启动过程进行跟踪,确认报错具体位置

PARSING IN CURSOR #140457326129448 len=45 dep=1 tim=11537999627952 hv=2164165332 ad='69329ae8'sqlid='8su8qaa0gx2qn'
select dataobj# from obj$ where name like :1
END OF STMT
PARSE #140457326129448:c=15,e=15,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=4,plh=1478545678,tim=11537999627952
BINDS #140457326129448:

 Bind#0
  oacdty=01 mxl=32(07) mxlc=00 mal=00 scl=00 pre=00
  oacflg=20 fl2=0000 frm=01 csi=873 siz=32 off=0
  kxsbbbfp=7fbec507eb20  bln=32  avl=07  flg=05
  value="I_UNDO2"
EXEC #140457326129448:c=51,e=51,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=4,plh=1478545678,tim=11537999628044
FETCH #140457326129448:c=175,e=175,p=0,cr=11,cu=0,mis=0,r=1,dep=1,og=4,plh=1478545678,tim=11537999628226
STAT #140457326129448 id=1 cnt=1 pid=0 pos=1 obj=18 op='TABLE ACCESS BY INDEX ROWID BATCHED OBJ$ (cr=11  card=1)'
STAT #140457326129448 id=2 cnt=1 pid=1 pos=1 obj=37 op='INDEX SKIP SCAN I_OBJ2 (cr=10 time=177 us cost=26 size=0 card=1)'
CLOSE #140457326129448:c=40,e=40,dep=1,type=1,tim=11537999628290
PARSE #140457326129448:c=5,e=5,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=4,plh=1478545678,tim=11537999628302
BINDS #140457326129448:

 Bind#0
  oacdty=01 mxl=32(09) mxlc=00 mal=00 scl=00 pre=00
  oacflg=20 fl2=0000 frm=01 csi=873 siz=32 off=0
  kxsbbbfp=7fbec507eb20  bln=32  avl=09  flg=05
  value="UNDOHIST$"
EXEC #140457326129448:c=31,e=31,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=4,plh=1478545678,tim=11537999628342
WAIT #140457326129448: nam='db file sequential read' ela= 6 file#=1 block#=243 blocks=1 obj#=18 tim=11537999628370
FETCH #140457326129448:c=33,e=33,p=1,cr=5,cu=0,mis=0,r=1,dep=1,og=4,plh=1478545678,tim=11537999628381
CLOSE #140457326129448:c=4,e=4,dep=1,type=3,tim=11537999628409
PARSE #140457326129448:c=6,e=6,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=4,plh=1478545678,tim=11537999628430
BINDS #140457326129448:

 Bind#0
  oacdty=01 mxl=32(11) mxlc=00 mal=00 scl=00 pre=00
  oacflg=20 fl2=0000 frm=01 csi=873 siz=32 off=0
  kxsbbbfp=7fbec507eb20  bln=32  avl=11  flg=05
  value="I_UNDOHIST1"
EXEC #140457326129448:c=39,e=39,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=4,plh=1478545678,tim=11537999628485
WAIT #140457326129448: nam='db file sequential read' ela= 649 file#=1 block#=78527 blocks=1 obj#=37 tim=11537999629346
WAIT #140457326129448: nam='db file sequential read' ela= 747 file#=1 block#=78505 blocks=1 obj#=37 tim=11537999630122
WAIT #140457326129448: nam='db file sequential read' ela= 628 file#=1 block#=23459 blocks=1 obj#=37 tim=11537999630775
WAIT #140457326129448: nam='db file sequential read' ela= 470 file#=1 block#=78552 blocks=1 obj#=37 tim=11537999631268
WAIT #140457326129448: nam='db file sequential read' ela= 502 file#=1 block#=92776 blocks=1 obj#=37 tim=11537999631797
WAIT #140457326129448: nam='db file sequential read' ela= 512 file#=1 block#=78585 blocks=1 obj#=37 tim=11537999632331
WAIT #140457326129448: nam='db file sequential read' ela= 538 file#=1 block#=78557 blocks=1 obj#=37 tim=11537999632892
WAIT #140457326129448: nam='db file sequential read' ela= 520 file#=1 block#=19641 blocks=1 obj#=37 tim=11537999633440
WAIT #140457326129448: nam='db file sequential read' ela= 502 file#=1 block#=23486 blocks=1 obj#=37 tim=11537999633971
WAIT #140457326129448: nam='db file sequential read' ela= 465 file#=1 block#=23504 blocks=1 obj#=37 tim=11537999634461
WAIT #140457326129448: nam='db file sequential read' ela= 656 file#=1 block#=23488 blocks=1 obj#=37 tim=11537999635141
WAIT #140457326129448: nam='db file sequential read' ela= 429 file#=1 block#=23501 blocks=1 obj#=37 tim=11537999635596
WAIT #140457326129448: nam='db file sequential read' ela= 621 file#=1 block#=92686 blocks=1 obj#=37 tim=11537999636240
WAIT #140457326129448: nam='db file sequential read' ela= 1027 file#=1 block#=92678 blocks=1 obj#=37 tim=11537999637307
WAIT #140457326129448: nam='db file sequential read' ela= 494 file#=1 block#=92680 blocks=1 obj#=37 tim=11537999637831
WAIT #140457326129448: nam='db file sequential read' ela= 515 file#=1 block#=92682 blocks=1 obj#=37 tim=11537999638377
WAIT #140457326129448: nam='db file sequential read' ela= 559 file#=1 block#=92684 blocks=1 obj#=37 tim=11537999638975
WAIT #140457326129448: nam='db file sequential read' ela= 478 file#=1 block#=92683 blocks=1 obj#=37 tim=11537999639502
WAIT #140457326129448: nam='db file sequential read' ela= 402 file#=1 block#=92687 blocks=1 obj#=37 tim=11537999639951
WAIT #140457326129448: nam='db file sequential read' ela= 465 file#=1 block#=92691 blocks=1 obj#=37 tim=11537999640453
WAIT #140457326129448: nam='db file sequential read' ela= 629 file#=1 block#=92694 blocks=1 obj#=37 tim=11537999641112
WAIT #140457326129448: nam='db file sequential read' ela= 507 file#=1 block#=109829 blocks=1 obj#=37 tim=11537999641651
WAIT #140457326129448: nam='db file sequential read' ela= 467 file#=1 block#=109831 blocks=1 obj#=37 tim=11537999642150
WAIT #140457326129448: nam='db file sequential read' ela= 525 file#=1 block#=109833 blocks=1 obj#=37 tim=11537999642695
WAIT #140457326129448: nam='db file sequential read' ela= 823 file#=1 block#=109837 blocks=1 obj#=37 tim=11537999643540
WAIT #140457326129448: nam='db file sequential read' ela= 553 file#=1 block#=109834 blocks=1 obj#=37 tim=11537999644111
WAIT #140457326129448: nam='db file sequential read' ela= 509 file#=1 block#=109835 blocks=1 obj#=37 tim=11537999644650
FETCH #140457326129448:c=1777,e=16184,p=27,cr=38,cu=0,mis=0,r=0,dep=1,og=4,plh=1478545678,tim=11537999644675
2025-01-20T21:40:02.951778+08:00
ORA-00600: internal error code, arguments: [ktuPopDictI_1], [4], [], [], [], [], [], [], [], [], [], []

<error barrier> at 0x7ffe9566b3c0 placed dbsdrv.c@5141
ORA-00600: internal error code, arguments: [ktuPopDictI_1], [4], [], [], [], [], [], [], [], [], [], []
<error barrier> at 0x7ffe9566b3c0 placed dbsdrv.c@5141
ORA-00600: internal error code, arguments: [ktuPopDictI_1], [4], [], [], [], [], [], [], [], [], [], []
2025-01-20T21:40:04.951374+08:00
ORA-00603: ORACLE server session terminated by fatal error
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [ktuPopDictI_1], [4], [], [], [], [], [], [], [], [], [], []

通过上述定位确认是select dataobj# from obj$ where name like :1这个sql在查询记录时报错,通过一些技巧绕过该sql,实现数据库正常open

19c非归档数据库断电导致ORA-00742故障恢复

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:19c非归档数据库断电导致ORA-00742故障恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

客户一套运行在win平台,非归档的19c数据库,由于异常断电导致数据库启动报ORA-01113,进行recover操作之后报ORA-00742
ora-00742


open之时alert日志报错信息

2025-01-18T00:17:52.205669+08:00
alter database open
2025-01-18T00:17:53.417839+08:00
Ping without log force is disabled:
  instance mounted in exclusive mode.
2025-01-18T00:17:53.436858+08:00
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\his\his\trace\his_ora_4428.trc:
ORA-01113: 文件 1 需要介质恢复
ORA-01110: 数据文件 1: 'D:\APP\ADMINISTRATOR\ORADATA\HIS\SYSTEM01.DBF'
2025-01-18T00:17:53.442863+08:00
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\his\his\trace\his_ora_4428.trc:
ORA-01113: 文件 1 需要介质恢复
ORA-01110: 数据文件 1: 'D:\APP\ADMINISTRATOR\ORADATA\HIS\SYSTEM01.DBF'
ORA-1113 signalled during: alter database open...

recover database 数据库的alert日志报错Media Recovery failed with error 742和
ORA-01110 ORA-01208等错误

2025-01-18T00:20:10.196227+08:00
ALTER DATABASE RECOVER  database  
2025-01-18T00:20:10.221244+08:00
Media Recovery Start
 Started logmerger process
2025-01-18T00:20:10.459413+08:00
WARNING! Recovering data file 1 from a fuzzy backup. It might be an online
backup taken without entering the begin backup command.
WARNING! Recovering data file 3 from a fuzzy backup. It might be an online
backup taken without entering the begin backup command.
WARNING! Recovering data file 4 from a fuzzy backup. It might be an online
backup taken without entering the begin backup command.
WARNING! Recovering data file 7 from a fuzzy backup. It might be an online
backup taken without entering the begin backup command.
WARNING! Recovering data file 60 from a fuzzy backup. It might be an online
backup taken without entering the begin backup command.
WARNING! Recovering data file 64 from a fuzzy backup. It might be an online
backup taken without entering the begin backup command.
WARNING! Recovering data file 65 from a fuzzy backup. It might be an online
backup taken without entering the begin backup command.
WARNING! Recovering data file 66 from a fuzzy backup. It might be an online
backup taken without entering the begin backup command.
WARNING! Recovering data file 67 from a fuzzy backup. It might be an online
backup taken without entering the begin backup command.
2025-01-18T00:20:10.599512+08:00
Parallel Media Recovery started with 12 slaves
2025-01-18T00:20:10.664559+08:00
Recovery of Online Redo Log: Thread 1 Group 3 Seq 2097 Reading mem 0
  Mem# 0: D:\APP\ADMINISTRATOR\ORADATA\HIS\REDO03.LOG
2025-01-18T00:20:12.644962+08:00
Media Recovery failed with error 742
2025-01-18T00:20:12.759043+08:00
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\his\his\trace\his_mz00_4036.trc:
ORA-01110: 数据文件 1: 'D:\APP\ADMINISTRATOR\ORADATA\HIS\SYSTEM01.DBF'
ORA-01208: 数据文件是旧的版本 - 不能访问当前版本
2025-01-18T00:20:13.135309+08:00
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\his\his\trace\his_mz00_4036.trc:
ORA-01110: 数据文件 3: 'D:\APP\ADMINISTRATOR\ORADATA\HIS\SYSAUX01.DBF'
ORA-01208: 数据文件是旧的版本 - 不能访问当前版本
2025-01-18T00:20:13.455536+08:00
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\his\his\trace\his_mz00_4036.trc:
ORA-01110: 数据文件 4: 'D:\APP\ADMINISTRATOR\ORADATA\HIS\UNDOTBS01.DBF'
ORA-01208: 数据文件是旧的版本 - 不能访问当前版本
2025-01-18T00:20:14.408212+08:00
ORA-283 signalled during: ALTER DATABASE RECOVER  database  ...

尝试recover datafile 1

SQL> RECOVER DATAFILE 1;
ORA-00283: 恢复会话因错误而取消
ORA-00742: 日志读取在线程 1 序列 2097 块 296728 中检测到写入丢失情况
ORA-00312: 联机日志 3 线程 1: 'D:\APP\ADMINISTRATOR\ORADATA\HIS\REDO03.LOG'

对于这种情况,比较明显是redo文件有写丢失,导致数据库无法正常的应用redo日志进行恢复,从而无法正常open.这种情况,只能只能选择屏蔽一致性,尝试强制打开数据库

C:\Users\Administrator\Desktop\check_db>sqlplus / as sysdba

SQL*Plus: Release 19.0.0.0.0 - Production on Sat Jan 18 00:59:28 2025
Version 19.3.0.0.0

Copyright (c) 1982, 2019, Oracle.  All rights reserved.


Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.3.0.0.0

SQL> recover database using backup controlfile until cancel;
ORA-00279: ?? 10103231167 (? 01/17/2025 09:20:12 ??) ???? 1 ????
ORA-00289: ??:
D:\APP\ADMINISTRATOR\PRODUCT\19.0.0\DBHOME_1\RDBMS\ARC0000002097_1079211060.0001
ORA-00280: ?? 10103231167 (???? 1) ??? #2097 ?


指定日志: {<RET>=suggested | filename | AUTO | CANCEL}
D:\APP\ADMINISTRATOR\ORADATA\HIS\REDO03.LOG
ORA-00283: ??????????????????????????????
ORA-00742: ????????????????????? 1 ?????? 2097 ??? 296960
??????????????????????????????
ORA-00334: ????????????: 'D:\APP\ADMINISTRATOR\ORADATA\HIS\REDO03.LOG'


ORA-01112: ???????


SQL> alter database open resetlogs;

Database altered.

alert日志对应的信息

2025-01-18T01:00:15.756033+08:00
alter database open resetlogs
2025-01-18T01:00:15.903141+08:00
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
RESETLOGS after incomplete recovery UNTIL CHANGE 10103247614 time 
.... (PID:4824): Clearing online redo logfile 1 D:\APP\ADMINISTRATOR\ORADATA\HIS\REDO01.LOG
.... (PID:4824): Clearing online redo logfile 2 D:\APP\ADMINISTRATOR\ORADATA\HIS\REDO02.LOG
.... (PID:4824): Clearing online redo logfile 3 D:\APP\ADMINISTRATOR\ORADATA\HIS\REDO03.LOG
Clearing online log 1 of thread 1 sequence number 2098
Clearing online log 2 of thread 1 sequence number 2096
Clearing online log 3 of thread 1 sequence number 2097
2025-01-18T01:00:19.381697+08:00
.... (PID:4824): Clearing online redo logfile 1 complete
.... (PID:4824): Clearing online redo logfile 2 complete
.... (PID:4824): Clearing online redo logfile 3 complete
Resetting resetlogs activation ID 3599991024 (0xd69380f0)
Online log D:\APP\ADMINISTRATOR\ORADATA\HIS\REDO01.LOG: Thread 1 Group 1 was previously cleared
Online log D:\APP\ADMINISTRATOR\ORADATA\HIS\REDO02.LOG: Thread 1 Group 2 was previously cleared
Online log D:\APP\ADMINISTRATOR\ORADATA\HIS\REDO03.LOG: Thread 1 Group 3 was previously cleared
2025-01-18T01:00:19.550821+08:00
Setting recovery target incarnation to 2
2025-01-18T01:00:20.418458+08:00
Ping without log force is disabled:
  instance mounted in exclusive mode.
Initializing SCN for created control file
Database SCN compatibility initialized to 3
2025-01-18T01:00:25.466167+08:00
Endian type of dictionary set to little
2025-01-18T01:00:25.476174+08:00
Assigning activation ID 3711463735 (0xdd387137)
2025-01-18T01:00:25.491185+08:00
TT00 (PID:3332): Gap Manager starting
2025-01-18T01:00:25.507197+08:00
Redo log for group 1, sequence 1 is not located on DAX storage
Thread 1 opened at log sequence 1
  Current log# 1 seq# 1 mem# 0: D:\APP\ADMINISTRATOR\ORADATA\HIS\REDO01.LOG
Successful open of redo thread 1
2025-01-18T01:00:26.162679+08:00
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
stopping change tracking
2025-01-18T01:00:26.183694+08:00
TT03 (PID:1896): Sleep 5 seconds and then try to clear SRLs in 2 time(s)
2025-01-18T01:00:27.465636+08:00
Undo initialization recovery: err:0 start: 3158125 end: 3158390 diff: 265 ms (0.3 seconds)
Undo initialization online undo segments: err:0 start: 3158390 end: 3158390 diff: 0 ms (0.0 seconds)
Undo initialization finished serial:0 start:3158125 end:3158406 diff:281 ms (0.3 seconds)
Dictionary check beginning
Tablespace 'TEMP' #3 found in data dictionary,
but not in the controlfile. Adding to controlfile.
Dictionary check complete
Verifying minimum file header compatibility for tablespace encryption..
Verifying file header compatibility for tablespace encryption completed for pdb 0
Database Characterset is AL32UTF8
2025-01-18T01:00:28.790609+08:00
No Resource Manager plan active
2025-01-18T01:00:30.086561+08:00
replication_dependency_tracking turned off (no async multimaster replication found)
2025-01-18T01:00:31.185369+08:00
TT03 (PID:1896): Sleep 10 seconds and then try to clear SRLs in 3 time(s)
2025-01-18T01:00:31.536626+08:00
LOGSTDBY: Validating controlfile with logical metadata
LOGSTDBY: Validation complete
Starting background process AQPC
2025-01-18T01:00:31.638701+08:00
AQPC started with pid=38, OS id=4340 
2025-01-18T01:00:32.717495+08:00
Starting background process CJQ0
2025-01-18T01:00:32.726501+08:00
CJQ0 started with pid=40, OS id=1236 
Completed: alter database open resetlogs

数据库没有明显报错,直接resetlogs成功,直接逻辑导出数据,导入新库,完成本次恢复工作

2025年首个故障恢复—ORA-600 kcbzib_kcrsds_1

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:2025年首个故障恢复—ORA-600 kcbzib_kcrsds_1

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

一个12.2.0.1的库由于某种原因引起的双机切换,导致数据库无法正常mount

2025-01-04T15:45:44.424193+08:00
alter database mount
2025-01-04T15:45:48.491054+08:00
Network throttle feature is disabled as mount time

2025-01-04T15:45:48.601366+08:00
LGWR (ospid: 34014): terminating the instance
2025-01-04T15:45:48.602480+08:00
System state dump requested by (instance=1, osid=34014 (LGWR)), summary=[abnormal instance termination].
System State dumped to trace file /u01/app/oracle/diag/rdbms/xifenfei/trace/xifenfei_diag_33978_20250104154548.trc
2025-01-04T15:45:48.790674+08:00
Dumping diagnostic data in directory=[cdmp_20250104154548], requested by (instance=1, osid=34014 (LGWR))
2025-01-04T15:45:49.915068+08:00
Instance terminated by LGWR, pid = 34014

这个错误相对比较明显,是由于ctl异常导致,通过重建ctl,然后mount库,利用Oracle数据库异常恢复检查脚本(Oracle Database Recovery Check)脚本进行检测发现所有数据文件头的checkpoint 信息被冻结在 2024-11-29 19:00:29 (scn 2112302221),分析alert日志数据库在此后20天中正常提供服务,业务运行都正常,客户反馈在这个冻结checkpoint信息的时间点,使用备份一体机发起过备份,之后就没有再备份了.
当时急着恢复数据库,没有对文件头进行dump不然应该可以发现类似begin backup的信息,类似这样(测试环境重现):

DATA FILE #1:
  name #7: /u01/app/oracle/oradata/xifenfei/system01.dbf
creation size=0 block size=8192 status=0xe head=7 tail=7 dup=1
 tablespace 0, index=1 krfil=1 prev_file=0
 unrecoverable scn: 0x0000.00000000 01/01/1988 00:00:00
 Checkpoint cnt:625 scn: 0x0105.0106deef 01/04/2025 22:02:50
 Stop scn: 0xffff.ffffffff 12/14/2024 08:15:07
 Creation Checkpointed at scn:  0x0000.00000007 08/24/2013 11:37:33
 thread:0 rba:(0x0.0.0)
 enabled  threads:  00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000
 Offline scn: 0x0000.000e2005 prev_range: 0
 Online Checkpointed at scn:  0x0000.000e2006 03/20/2024 20:53:56
 thread:1 rba:(0x1.2.0)
 enabled  threads:  01000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000
 Hot Backup end marker scn: 0x0000.00000000
 aux_file is NOT DEFINED
 Plugged readony: NO
 Plugin scnscn: 0x0000.00000000
 Plugin resetlogs scn/timescn: 0x0000.00000000 01/01/1988 00:00:00
 Foreign creation scn/timescn: 0x0000.00000000 01/01/1988 00:00:00
 Foreign checkpoint scn/timescn: 0x0000.00000000 01/01/1988 00:00:00
 Online move state: 0
 V10 STYLE FILE HEADER:
        Compatibility Vsn = 186647552=0xb200400
        Db ID=1780931490=0x6a26dba2, Db Name='XIFENFEI'
        Activation ID=0=0x0
        Control Seq=32953021=0x1f6d2bd, File size=98560=0x18100
        File Number=1, Blksiz=8192, File Type=3 DATA
Tablespace #0 - SYSTEM  rel_fn:1
Creation   at   scn: 0x0000.00000007 08/24/2013 11:37:33
Backup taken at scn: 0x0105.0106deef 01/04/2025 22:02:50 thread:1    <====注意
 reset logs count:0x45636764 scn: 0x0000.000e2006
 prev reset logs count:0x3121c97a scn: 0x0000.00000001
 recovered at 12/14/2024 08:36:35
 status:0x2001 root dba:0x00400208 chkpt cnt: 625 ctl cnt:624
begin-hot-backup file size: 98560                        <====注意
Checkpointed at scn:  0x0105.0106deef 01/04/2025 22:02:50
 thread:1 rba:(0x205.fdd9.10)
 enabled  threads:  01000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000
Backup Checkpointed at scn:  0x0105.0106df14 01/04/2025 22:03:20   <====注意
 thread:1 rba:(0x209.2.10)
 enabled  threads:  01000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000
External cache id: 0x0 0x0 0x0 0x0
Absolute fuzzy scn: 0x0000.00000000
Recovery fuzzy scn: 0x0000.00000000 01/01/1988 00:00:00
Terminal Recovery Stamp  01/01/1988 00:00:00
Platform Information:    Creation Platform ID: 13
Current Platform ID: 13 Last Platform ID: 13

基于上述情况,尝试强制打开库,报ORA-00600: internal error code, arguments: [kcbzib_kcrsds_1]错误
ora-600 kzbzib_kcrsds_1


对于这个情况,以前有过大量恢复案例,修改数据库scn即可
kcbzib_kcrsds_1报错汇总
12C数据库报ORA-600 kcbzib_kcrsds_1故障处理
存储故障,强制拉库报ORA-600 kcbzib_kcrsds_1处理
https://www.xifenfei.com/2023/12/patch-scn-ora-600-kcbzib_kcrsds_1.html
此类故障处理太多,不一一列举,解决这个错误之后,数据库open成功,然后安排逻辑迁移即可

断电引起的ORA-08102: 未找到索引关键字, 对象号 39故障处理

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:断电引起的ORA-08102: 未找到索引关键字, 对象号 39故障处理

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

最近有客户在虚拟化平台运行oracle,由于机房掉电,导致oracle数据库无法正常启动,通过第三方恢复,oracle被强制拉起,但是无法进行ddl操作,比如创建表报ORA-08102: 未找到索引关键字, 对象号 39, 文件 1, 块 122448 (2) 错误
ORA-08102


通过obj#确认具体对象为i_obj#(也就是obj$对象上的一个index)
I_OBJ4

由于这类对象属于数据库的底层核心对象,无法直接rebulid他们,根据以往经验,可以通过bbed对其进行修复,或者参考类似文章进行重建:
分享I_OBJ4 ORA-8102故障恢复案例
使用bbed 修复I_OBJ4 index 报ORA-8102错误
通过bbed修改obj$中dataobj$重现I_OBJ4索引报ORA-08102错误
bootstrap$核心index(I_OBJ1,I_USER1,I_FILE#_BLOCK#,I_IND1,I_TS#,I_CDEF1等)异常恢复—ORA-00701错误解决
这个问题解决之后,该客户还有另外一个问题需要解决(不然数据库运行一段时间之后就会crash)

Wed Dec 18 09:13:03 2024
Errors in file E:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\trace\orcl_smon_2536.trc  (incident=1105672):
ORA-00600: 内部错误代码, 参数: [13013], [5001], [268], [8459081], [1], [8459081], [3], [], [], [], [], []
Incident details in: E:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\incident\incdir_1105672\orcl_smon_2536_i1105672.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Non-fatal internal error happenned while SMON was doing logging scn->time mapping.
SMON encountered 1 out of maximum 100 non-fatal internal errors.

这个问题本质就是SMON_SCN_TIME表的异常导致(一般13013是由于表和index不一致导致),对于这类问题处理,参考:
关于SMON_SCN_TIME若干问题说明
处理完上述两个明显故障之后,然后使用expdp不落地方式把客户数据迁移到新库,完成本次恢复任务

ORA-00227: corrupt block detected in control file

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:ORA-00227: corrupt block detected in control file

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

由于服务器断电,导致oracle数据库无法正常启动,recover报ORA-00353 ORA-00312错

SQL> recover datafile 1;
ORA-00283: recovery session canceled due to errors
ORA-00368: checksum error in redo log block
ORA-00353: log corruption near block 8 change 237896529 time 12/03/2024 22:03:32
ORA-00312: online log 1 thread 1: 'E:\ORADATA\xifenfei\REDO01.LOG'

报错提示比较明显,是由于oracle恢复需要的redo损坏,导致无法进行正常恢复,这种情况下,只能尝试强制打开库

SQL> recover database until cancel;
ORA-00279: change 237857808 generated at 12/03/2024 07:06:31 needed for thread 1
ORA-00289: suggestion : C:\ORACLE\PRODUCT\10.2.0.3\DB_1\RDBMS\ARC20892_0929553713.001
ORA-00280: change 237857808 for thread 1 is in sequence #20892


Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
cancel
ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1: 'E:\ORADATA\xifenfei\SYSTEM01.DBF'


ORA-01112: media recovery not started


SQL> alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-01092: ORACLE instance terminated. Disconnection forced

对于这种ORA-01092错误,需要查看alert日志确认具体报错原因

Tue Dec 17 22:03:59 2024
alter database open resetlogs
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
RESETLOGS after incomplete recovery UNTIL CHANGE 237857808
Resetting resetlogs activation ID 2015910641 (0x78285af1)
Tue Dec 17 22:04:00 2024
Setting recovery target incarnation to 2
Tue Dec 17 22:04:00 2024
Advancing SCN to 16106127360 according to _minimum_giga_scn
Tue Dec 17 22:04:00 2024
Assigning activation ID 2274407914 (0x8790b5ea)
Thread 1 opened at log sequence 1
  Current log# 1 seq# 1 mem# 0: E:\ORADATA\xifenfei\REDO01.LOG
Successful open of redo thread 1
Tue Dec 17 22:04:01 2024
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Tue Dec 17 22:04:01 2024
SMON: enabling cache recovery
Tue Dec 17 22:04:01 2024
Errors in file c:\oracle\product\10.2.0\admin\xifenfei\udump\xifenfei_ora_816.trc:
ORA-00704: bootstrap process failure
ORA-39700: database must be opened with UPGRADE option

Tue Dec 17 22:04:01 2024
Error 704 happened during db open, shutting down database
USER: terminating instance due to error 704
Tue Dec 17 22:04:01 2024
Errors in file c:\oracle\product\10.2.0\admin\xifenfei\bdump\xifenfei_pmon_2472.trc:
ORA-00704: bootstrap process failure

Tue Dec 17 22:04:01 2024
Errors in file c:\oracle\product\10.2.0\admin\xifenfei\bdump\xifenfei_reco_2576.trc:
ORA-00704: bootstrap process failure

Tue Dec 17 22:04:01 2024
Errors in file c:\oracle\product\10.2.0\admin\xifenfei\bdump\xifenfei_smon_2584.trc:
ORA-00704: bootstrap process failure

Tue Dec 17 22:04:02 2024
Errors in file c:\oracle\product\10.2.0\admin\xifenfei\bdump\xifenfei_ckpt_2216.trc:
ORA-00704: bootstrap process failure

Tue Dec 17 22:04:02 2024
Errors in file c:\oracle\product\10.2.0\admin\xifenfei\bdump\xifenfei_lgwr_1556.trc:
ORA-00704: bootstrap process failure

Tue Dec 17 22:04:02 2024
Errors in file c:\oracle\product\10.2.0\admin\xifenfei\bdump\xifenfei_dbw0_1528.trc:
ORA-00704: bootstrap process failure

Tue Dec 17 22:04:02 2024
Errors in file c:\oracle\product\10.2.0\admin\xifenfei\bdump\xifenfei_psp0_2896.trc:
ORA-00704: bootstrap process failure

Tue Dec 17 22:04:02 2024
Errors in file c:\oracle\product\10.2.0\admin\xifenfei\bdump\xifenfei_mman_904.trc:
ORA-00704: bootstrap process failure

Tue Dec 17 22:04:02 2024
Errors in file c:\oracle\product\10.2.0\admin\xifenfei\bdump\xifenfei_dbw1_1732.trc:
ORA-00704: bootstrap process failure

Instance terminated by USER, pid = 816
ORA-1092 signalled during: alter database open resetlogs...

由于对oracle粗心对于oracle版本判断失误,导致打开数据库失败,使用正确版本打开数据库发现ctl有报错,导致打开依旧失败(这种错误一般比较少见,大部分ctl异常都是在oracle mount状态无法成功)

Tue Dec 17 22:10:48 2024
Recovery of Online Redo Log: Thread 1 Group 1 Seq 1 Reading mem 0
  Mem# 0: E:\ORADATA\xifenfei\REDO01.LOG
Tue Dec 17 22:10:48 2024
Completed redo application
Tue Dec 17 22:10:48 2024
Completed crash recovery at
 Thread 1: logseq 1, block 3, scn 16106147363
 0 data blocks read, 0 data blocks written, 1 redo blocks read
Tue Dec 17 22:10:48 2024
Read from controlfile member 'E:\ORADATA\xifenfei\CONTROL01.CTL' has found a corrupted block (blk# 432, seq# 132194)
Hex dump of (file 0, block 432) in trace file c:\oracle\product\10.2.0\admin\xifenfei\bdump\xifenfei_lgwr_2728.trc
Corrupt block relative dba: 0x000001b0 (file 0, block 432)
Fractured block found during control file block read
Data in bad block:
 type: 0 format: 0 rdba: 0x00000000
 last change scn: 0x0000.00000000 seq: 0x0 flg: 0x00
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x03e71501
 check value in block header: 0x0
 block checksum disabled
Hex dump of (file 0, block 432) in trace file c:\oracle\product\10.2.0\admin\xifenfei\bdump\xifenfei_lgwr_2728.trc
Corrupt block relative dba: 0x000001b0 (file 0, block 432)
Fractured block found during control file block read
Data in bad block:
 type: 0 format: 0 rdba: 0x00000000
 last change scn: 0x0000.00000000 seq: 0x0 flg: 0x00
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x03e71501
 check value in block header: 0x0
 block checksum disabled
Tue Dec 17 22:10:48 2024
Errors in file c:\oracle\product\10.2.0\admin\xifenfei\bdump\xifenfei_lgwr_2728.trc:
ORA-00202: control file: 'E:\ORADATA\xifenfei\CONTROL01.CTL'

Tue Dec 17 22:10:48 2024
Errors in file c:\oracle\product\10.2.0\admin\xifenfei\bdump\xifenfei_lgwr_2728.trc:
ORA-00227: corrupt block detected in control file: (block 432, # blocks 1)
ORA-00202: control file: 'E:\ORADATA\xifenfei\CONTROL01.CTL'

LGWR: terminating instance due to error 227
Tue Dec 17 22:10:48 2024
Errors in file c:\oracle\product\10.2.0\admin\xifenfei\bdump\xifenfei_pmon_2056.trc:
ORA-00227: corrupt block detected in control file: (block , # blocks )

Tue Dec 17 22:10:49 2024
Errors in file c:\oracle\product\10.2.0\admin\xifenfei\bdump\xifenfei_dbw0_1636.trc:
ORA-00227: corrupt block detected in control file: (block , # blocks )

Tue Dec 17 22:10:49 2024
Errors in file c:\oracle\product\10.2.0\admin\xifenfei\bdump\xifenfei_reco_2468.trc:
ORA-00227: corrupt block detected in control file: (block , # blocks )

Tue Dec 17 22:10:49 2024
Errors in file c:\oracle\product\10.2.0\admin\xifenfei\bdump\xifenfei_smon_2996.trc:
ORA-00227: corrupt block detected in control file: (block , # blocks )

Tue Dec 17 22:10:49 2024
Errors in file c:\oracle\product\10.2.0\admin\xifenfei\bdump\xifenfei_ckpt_2292.trc:
ORA-00227: corrupt block detected in control file: (block , # blocks )

Tue Dec 17 22:10:49 2024
Errors in file c:\oracle\product\10.2.0\admin\xifenfei\bdump\xifenfei_dbw1_508.trc:
ORA-00227: corrupt block detected in control file: (block , # blocks )

Tue Dec 17 22:10:49 2024
Errors in file c:\oracle\product\10.2.0\admin\xifenfei\bdump\xifenfei_mman_1728.trc:
ORA-00227: corrupt block detected in control file: (block , # blocks )

Tue Dec 17 22:10:49 2024
Errors in file c:\oracle\product\10.2.0\admin\xifenfei\bdump\xifenfei_psp0_1724.trc:
ORA-00227: corrupt block detected in control file: (block , # blocks )

Instance terminated by LGWR, pid = 2728

重建ctl,打开数据库成功,导出数据,完成本次恢复任务

Oracle Recovery Tools工具一键解决ORA-00376 ORA-01110故障(文件offline)

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:Oracle Recovery Tools工具一键解决ORA-00376 ORA-01110故障(文件offline)

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

客户在win上面迁移数据文件,由于原库非归档,结果导致有两个文件scn不一致,无法打开库,结果他们选择offline文件,然后打开数据库

Wed Dec 04 14:06:04 2024
alter database open
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_ora_6056.trc:
ORA-01113: 文件 10 需要介质恢复
ORA-01110: 数据文件 10: 'C:\PROGRAM FILES\ORACLE\XFF1.DBF'
ORA-1113 signalled during: alter database open...
Wed Dec 04 14:08:18 2024
alter database datafile 'c:\program files\oracle\XFF1.dbf' offline drop
Completed: alter database datafile 'c:\program files\oracle\XFF1.dbf' offline drop
Wed Dec 04 14:08:31 2024
alter database open
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_ora_6056.trc:
ORA-01113: 文件 26 需要介质恢复
ORA-01110: 数据文件 26: 'C:\PROGRAM FILES\ORACLE\XFF2.DBF'
ORA-1113 signalled during: alter database open...
Wed Dec 04 14:08:31 2024
Checker run found 1 new persistent data failures
Wed Dec 04 14:08:51 2024
alter database datafile 'c:\program files\oracle\XFF2.dbf' offline drop
Completed: alter database datafile 'c:\program files\oracle\XFF2.dbf' offline drop
alter database open
Wed Dec 04 14:08:57 2024
Thread 1 opened at log sequence 136210
  Current log# 1 seq# 136210 mem# 0: D:\APP\ADMINISTRATOR\ORADATA\ORCL\REDO01.LOG
Successful open of redo thread 1
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Wed Dec 04 14:08:57 2024
SMON: enabling cache recovery
Successfully onlined Undo Tablespace 2.
Verifying file header compatibility for 11g tablespace encryption..
Verifying 11g file header compatibility for tablespace encryption completed
SMON: enabling tx recovery
Database Characterset is AL32UTF8
No Resource Manager plan active
replication_dependency_tracking turned off (no async multimaster replication found)
Starting background process QMNC
Wed Dec 04 14:08:59 2024
QMNC started with pid=20, OS id=4264 
Completed: alter database open

后面自行尝试recover 数据文件没有成功

Wed Dec 04 14:42:50 2024
ALTER DATABASE RECOVER  datafile 26  
Media Recovery Start
Serial Media Recovery started
ORA-279 signalled during: ALTER DATABASE RECOVER  datafile 26  ...
ALTER DATABASE RECOVER    CONTINUE DEFAULT  
Media Recovery Log D:\APP\ADMINISTRATOR\FLASH_RECOVERY_AREA\ORCL\ARCHIVELOG\2024_12_04\O1_MF_1_135983_%U_.ARC
Errors with log D:\APP\ADMINISTRATOR\FLASH_RECOVERY_AREA\ORCL\ARCHIVELOG\2024_12_04\O1_MF_1_135983_%U_.ARC
ORA-308 signalled during: ALTER DATABASE RECOVER    CONTINUE DEFAULT  ...
ALTER DATABASE RECOVER CANCEL 
Media Recovery Canceled
Completed: ALTER DATABASE RECOVER CANCEL 

由于这两个文件处于offline状态导致客户很多操作报ORA-00376 ORA-01110之类错

ORA-00376: file 10 cannot be read at this time
ORA-01110: data file 10: 'C:\PROGRAM FILES\ORACLE\XFF1.DBF'

对于这类故障使用Oracle Recovery Tools工具,一键恢复
225133


然后直接recover 数据文件成功
QQ20241207-185503

对于这类缺少归档数据文件offline的故障Oracle Recovery Tools可以快速傻瓜式恢复
软件下载:OraRecovery下载
使用说明:使用说明

dd破坏asm磁盘头恢复

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:dd破坏asm磁盘头恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

有朋友对asm disk的磁盘头dd了2048byte的数据
dd-2048
asm-candidate
QQ20241202-204931


通过分析,gi软件版本,确认是11.2.0.4

Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options.
ORACLE_HOME = /u01/app/11.2.0/grid
System name:	Linux
Node name:	rac1
Release:	4.1.12-37.4.1.el6uek.x86_64
Version:	#2 SMP Tue May 17 07:23:38 PDT 2016
Machine:	x86_64

从10.2.0.5之后版本,在第二个au的倒数第二个block上面,有asm disk header备份(每个block大小为4k),分析au大小(通过分析正常的asm disk快速找到au 大小【使用dd备份的正常的磁盘头查看】)

H:\TEMP\tmp\asmbak>kfed read sdcp.dd |grep ausize
kfdhdb.ausize:                 16777216 ; 0x0bc: 0x01000000

找到被破坏的asm disk的备份磁盘头信息

H:\TEMP\tmp\asmbak>kfed read sdc.dd blkn=4094 aun=1 aus=16777216|more
kfbh.endian:                          1 ; 0x000: 0x01
kfbh.hard:                          130 ; 0x001: 0x82
kfbh.type:                            1 ; 0x002: KFBTYP_DISKHEAD
kfbh.datfmt:                          1 ; 0x003: 0x01
kfbh.block.blk:                    4094 ; 0x004: blk=4094
kfbh.block.obj:              2147483648 ; 0x008: disk=0
kfbh.check:                   229348702 ; 0x00c: 0x0dab955e
kfbh.fcn.base:                 11727032 ; 0x010: 0x00b2f0b8
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
kfdhdb.driver.provstr:         ORCLDISK ; 0x000: length=8
kfdhdb.driver.reserved[0]:            0 ; 0x008: 0x00000000
kfdhdb.driver.reserved[1]:            0 ; 0x00c: 0x00000000
kfdhdb.driver.reserved[2]:            0 ; 0x010: 0x00000000
kfdhdb.driver.reserved[3]:            0 ; 0x014: 0x00000000
kfdhdb.driver.reserved[4]:            0 ; 0x018: 0x00000000
kfdhdb.driver.reserved[5]:            0 ; 0x01c: 0x00000000
kfdhdb.compat:                186646528 ; 0x020: 0x0b200000
kfdhdb.dsknum:                        0 ; 0x024: 0x0000
kfdhdb.grptyp:                        1 ; 0x026: KFDGTP_EXTERNAL
kfdhdb.hdrsts:                        3 ; 0x027: KFDHDR_MEMBER
kfdhdb.dskname:               DATA_0000 ; 0x028: length=9
kfdhdb.grpname:                    DATA ; 0x048: length=4
kfdhdb.fgname:                DATA_0000 ; 0x068: length=9
kfdhdb.capname:                         ; 0x088: length=0
kfdhdb.crestmp.hi:             33123276 ; 0x0a8: HOUR=0xc DAYS=0x1e MNTH=0xa YEAR=0x7e5
kfdhdb.crestmp.lo:           2259134464 ; 0x0ac: USEC=0x0 MSEC=0x1ea SECS=0x2a MINS=0x21
kfdhdb.mntstmp.hi:             33162836 ; 0x0b0: HOUR=0x14 DAYS=0x12 MNTH=0x1 YEAR=0x7e8
kfdhdb.mntstmp.lo:           3600987136 ; 0x0b4: USEC=0x0 MSEC=0xad SECS=0x2a MINS=0x35
kfdhdb.secsize:                     512 ; 0x0b8: 0x0200
kfdhdb.blksize:                    4096 ; 0x0ba: 0x1000
kfdhdb.ausize:                 16777216 ; 0x0bc: 0x01000000
kfdhdb.mfact:                    454272 ; 0x0c0: 0x0006ee80
kfdhdb.dsksize:                   65536 ; 0x0c4: 0x00010000
kfdhdb.pmcnt:                         2 ; 0x0c8: 0x00000002
kfdhdb.fstlocn:                       1 ; 0x0cc: 0x00000001
kfdhdb.altlocn:                       2 ; 0x0d0: 0x00000002
kfdhdb.f1b1locn:                      0 ; 0x0d4: 0x00000000
kfdhdb.redomirrors[0]:                0 ; 0x0d8: 0x0000
kfdhdb.redomirrors[1]:                0 ; 0x0da: 0x0000
kfdhdb.redomirrors[2]:                0 ; 0x0dc: 0x0000
…………

确认被损坏的磁盘只有磁盘头信息损坏(即确认第二个block是否是好的)

H:\TEMP\tmp\asmbak>kfed read sdc.dd blkn=0
kfbh.endian:                          0 ; 0x000: 0x00
kfbh.hard:                            0 ; 0x001: 0x00
kfbh.type:                            0 ; 0x002: KFBTYP_INVALID
kfbh.datfmt:                          0 ; 0x003: 0x00
kfbh.block.blk:                       0 ; 0x004: blk=0
kfbh.block.obj:                       0 ; 0x008: file=0
kfbh.check:                           0 ; 0x00c: 0x00000000
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
0065D8400 00000000 00000000 00000000 00000000  [................]
  Repeat 255 times
KFED-00322: Invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][0]


H:\TEMP\tmp\asmbak>kfed read sdc.dd blkn=1|more
kfbh.endian:                          1 ; 0x000: 0x01
kfbh.hard:                          130 ; 0x001: 0x82
kfbh.type:                            2 ; 0x002: KFBTYP_FREESPC
kfbh.datfmt:                          2 ; 0x003: 0x02
kfbh.block.blk:                       1 ; 0x004: blk=1
kfbh.block.obj:              2147483648 ; 0x008: disk=0
kfbh.check:                  2781697777 ; 0x00c: 0xa5cd56f1
kfbh.fcn.base:                 39359331 ; 0x010: 0x02589363
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
kfdfsb.aunum:                         0 ; 0x000: 0x00000000
kfdfsb.max:                        1014 ; 0x004: 0x03f6
kfdfsb.cnt:                         147 ; 0x006: 0x0093
kfdfsb.bound:                         0 ; 0x008: 0x0000
kfdfsb.flag:                          1 ; 0x00a: B=1
kfdfsb.ub1spare:                      0 ; 0x00b: 0x00
kfdfsb.spare[0]:                      0 ; 0x00c: 0x00000000
kfdfsb.spare[1]:                      0 ; 0x010: 0x00000000
kfdfsb.spare[2]:                      0 ; 0x014: 0x00000000
kfdfse[0].fse:                        0 ; 0x018: FREE=0x0 FRAG=0x0
…………

基于上述分析,直接使用备份的asm disk header信息进行merge或者repair修复之后,asm 磁盘头状态恢复正常
QQ20241202-205116
QQ20241202-205235
QQ20241202-205147


这个客户运气比较好,库非常大,只是破坏了2k的数据,如果超过4k可能就是比较麻烦的事故了,再次提醒对asm磁盘的dd操作一定要小心谨慎.如果不慎破坏asm磁盘过多,参考以前类似文档:
asm磁盘dd破坏恢复