tab$被恶意删除sys用户之外记录

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:tab$被恶意删除sys用户之外记录

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

数据库open成功,但是alert日志报大量ORA-00600错误

Sun Apr 14 14:30:46 2019
SMCO started with pid=53, OS id=6761
Completed: ALTER DATABASE OPEN /* db agent *//* {1:65047:2} */
Sun Apr 14 14:30:49 2019
Starting background process CJQ0
Sun Apr 14 14:30:49 2019
CJQ0 started with pid=54, OS id=6776
Setting Resource Manager plan SCHEDULER[0x32DF]:DEFAULT_MAINTENANCE_PLAN via scheduler window
Setting Resource Manager plan DEFAULT_MAINTENANCE_PLAN via parameter
Sun Apr 14 14:30:52 2019
Starting background process VKRM
Sun Apr 14 14:30:52 2019
VKRM started with pid=37, OS id=6809
Sun Apr 14 14:30:54 2019
Errors in file /oracle/oracle/oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_j000_6811.trc  (incident=288633):
ORA-00600: internal error code, arguments: [kkpo_rcinfo_defstg:delseg], [84638], [], [], [], [], [], [], [], [], [], []
Incident details in: /oracle/oracle/oracle/diag/rdbms/xifenfei/xifenfei1/incident/incdir_288633/xifenfei1_j000_6811_i288633.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file /oracle/oracle/oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_j000_6811.trc:
ORA-00600: internal error code, arguments: [kkpo_rcinfo_defstg:delseg], [84638], [], [], [], [], [], [], [], [], [], []
ORA-06512: at "APEX_030200.WWV_FLOW_MAIL", line 695
ORA-06512: at line 1
Sun Apr 14 14:30:57 2019
Errors in file /oracle/oracle/oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_j000_7491.trc  (incident=288658):
ORA-00600: 内部错误代码, 参数: [16659], [kqldtu], [INS], [0], [206196], [], [], [], [], [], [], []
Incident details in: /oracle/oracle/oracle/diag/rdbms/xifenfei/xifenfei1/incident/incdir_288658/xifenfei1_j000_7491_i288658.trc
Sun Apr 14 14:34:10 2019
Dumping diagnostic data in directory=[cdmp_20190414143410], requested by (instance=1, osid=7491 (J000)), summary=[incident=288658].
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file /oracle/oracle/oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_j000_7491.trc:
ORA-00600: 内部错误代码, 参数: [16659], [kqldtu], [INS], [0], [206196], [], [], [], [], [], [], []
ORA-06512: 在 "WEBCSMS.P_YGERROR", line 3
Sun Apr 14 14:39:08 2019
Errors in file /oracle/oracle/oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_j000_8515.trc  (incident=288593):
ORA-00600: 内部错误代码, 参数: [kdfReserveSingle_1], [0], [65280], [], [], [], [], [], [], [], [], []
Incident details in: /oracle/oracle/oracle/diag/rdbms/xifenfei/xifenfei1/incident/incdir_288593/xifenfei1_j000_8515_i288593.trc
ORA-06512: 在 line 1
Sun Apr 14 14:52:14 2019
Errors in file /oracle/oracle/oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_j001_11062.trc  (incident=288705):
ORA-00600: internal error code, arguments: [16607], [0x3CFB04C90], [257], [9], [0x000000000], [], [], [], [], [], [], []
Incident details in: /oracle/oracle/oracle/diag/rdbms/xifenfei/xifenfei1/incident/incdir_288705/xifenfei1_j001_11062_i288705.trc
Errors in file /oracle/oracle/oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_ora_16712.trc  (incident=288597):
ORA-00600: 内部错误代码, 参数: [16607], [0x3C7CEA678], [1281], [9], [0x000000000], [], [], [], [], [], [], []
Incident details in: /oracle/oracle/oracle/diag/rdbms/xifenfei/xifenfei1/incident/incdir_288597/xifenfei1_ora_16712_i288597.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.

报错比较多,客户还反馈登录数据库之后,发现所有的表都丢失。第一反应可能数据字典损坏了,然后让客户查看备库,现在dg的备库也一样表都丢失了,进一步确认字典可能异常,让客户提供system文件进行本地分析.发现DBMS_SUPPORT_DBMONITOR触发器调用DBMS_SUPPORT_DBMONITORP存储过程,和警告:互联网中有oracle介质被注入恶意程序导致—ORA-600 16703中的名称非常类似,但是有点不一样,以前的恶意脚本中都是被注入并且触发之后,数据库无法正常启动,这次数据库能够正常open成功.分析恶意脚本,确认原因
1
2
3
确实这次的恶意脚本是在2016年8月份被创建在库中,在600天之后重启被触发,而且是删除非sys的tab$中记录.知道了恶意脚本的源头,那恢复就比较容易,直接通过批量bbed程序对tab$反删除可以实现比较完美恢复.原则上这样的故障可以实现数据库完美恢复,原库继续使用.

硬件故障数据库异常恢复

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:硬件故障数据库异常恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

硬件故障数据库crash
有客户由于硬件故障导致数据库异常ORA-00345 ORA-00312 ORA-27070 OSD-04016

Tue Feb 05 16:58:26 2019
Thread 1 advanced to log sequence 17139 (LGWR switch)
  Current log# 12 seq# 17139 mem# 0: S:\ORADATA\ORCL\REDO12A.LOG
  Current log# 12 seq# 17139 mem# 1: S:\ORADATA\ORCL\REDO12B.LOG
Tue Feb 05 19:47:24 2019
Errors in file c:\oracle\diag\rdbms\orcl\orcl\trace\orcl_lgwr_2420.trc:
ORA-00345: redo log write error block 152097 count 8
ORA-00312: online log 12 thread 1: 'S:\ORADATA\ORCL\REDO12A.LOG'
ORA-27070: async read/write failed
OSD-04016: 异步 I/O 请求排队时出错。
O/S-Error: (OS 1) 函数不正确。
ORA-00345: redo log write error block 152097 count 8
ORA-00312: online log 12 thread 1: 'S:\ORADATA\ORCL\REDO12B.LOG'
ORA-27070: async read/write failed
OSD-04016: 异步 I/O 请求排队时出错。
O/S-Error: (OS 1) 函数不正确。
ORA-00345: redo log write error block 152105 count 1
ORA-00312: online log 12 thread 1: 'S:\ORADATA\ORCL\REDO12A.LOG'
ORA-27070: async read/write failed
OSD-04016: 异步 I/O 请求排队时出错。
O/S-Error: (OS 1) 函数不正确。

直接启动数据库报错
修复好硬件之后,直接启动数据库报ORA-00600 kcratr_scan_lastbwr错误

Fri Feb 08 20:58:15 2019
alter database mount exclusive
Successful mount of redo thread 1, with mount id 1527506791
Database mounted in Exclusive Mode
Lost write protection disabled
Completed: alter database mount exclusive
alter database open
Beginning crash recovery of 1 threads
Started redo scan
Errors in file c:\oracle\diag\rdbms\orcl\orcl\trace\orcl_ora_3672.trc  (incident=41353):
ORA-00600: ??????, ??: [kcratr_scan_lastbwr], [], [], [], [], [], [], [], [], [], [], []
Incident details in: c:\oracle\diag\rdbms\orcl\orcl\incident\incdir_41353\orcl_ora_3672_i41353.trc
Aborting crash recovery due to error 600
Errors in file c:\oracle\diag\rdbms\orcl\orcl\trace\orcl_ora_3672.trc:
ORA-00600: ??????, ??: [kcratr_scan_lastbwr], [], [], [], [], [], [], [], [], [], [], []
Errors in file c:\oracle\diag\rdbms\orcl\orcl\trace\orcl_ora_3672.trc:
ORA-00600: ??????, ??: [kcratr_scan_lastbwr], [], [], [], [], [], [], [], [], [], [], []
ORA-600 signalled during: alter database open...
Fri Feb 08 20:58:24 2019
Trace dumping is performing id=[cdmp_20190208205824]
Fri Feb 08 20:59:04 2019
alter database open
Beginning crash recovery of 1 threads
Started redo scan
Errors in file c:\oracle\diag\rdbms\orcl\orcl\trace\orcl_ora_1696.trc  (incident=41354):
ORA-00600: 内部错误代码, 参数: [kcratr_scan_lastbwr], [], [], [], [], [], [], [], [], [], [], []
Incident details in: c:\oracle\diag\rdbms\orcl\orcl\incident\incdir_41354\orcl_ora_1696_i41354.trc
Aborting crash recovery due to error 600
Errors in file c:\oracle\diag\rdbms\orcl\orcl\trace\orcl_ora_1696.trc:
ORA-00600: 内部错误代码, 参数: [kcratr_scan_lastbwr], [], [], [], [], [], [], [], [], [], [], []
Errors in file c:\oracle\diag\rdbms\orcl\orcl\trace\orcl_ora_1696.trc:
ORA-00600: 内部错误代码, 参数: [kcratr_scan_lastbwr], [], [], [], [], [], [], [], [], [], [], []
ORA-600 signalled during: alter database open ...

recover database报错
执行recover database报错ORA-00600 6101,ORA-00600 kdourp_inorder2,ORA-00600 ktbsdp1,ORA-00600 3020

Fri Feb 08 21:09:20 2019
ALTER DATABASE RECOVER  database
Media Recovery Start
 started logmerger process
Parallel Media Recovery started with 4 slaves
Fri Feb 08 21:09:21 2019
Recovery of Online Redo Log: Thread 1 Group 12 Seq 17139 Reading mem 0
  Mem# 0: S:\ORADATA\ORCL\REDO12A.LOG
  Mem# 1: S:\ORADATA\ORCL\REDO12B.LOG
Fri Feb 08 21:09:21 2019
Errors in file c:\oracle\diag\rdbms\orcl\orcl\trace\orcl_pr02_3780.trc  (incident=49379):
ORA-00600: internal error code, arguments: [6101], [17], [21], [0], [], [], [], [], [], [], [], []
Incident details in: c:\oracle\diag\rdbms\orcl\orcl\incident\incdir_49379\orcl_pr02_3780_i49379.trc
Fri Feb 08 21:09:21 2019
Errors in file c:\oracle\diag\rdbms\orcl\orcl\trace\orcl_pr01_2040.trc  (incident=49371):
ORA-00600: internal error code, arguments: [kdourp_inorder2], [34], [0], [0], [44], [], [], [], [], [], [], []
Incident details in: c:\oracle\diag\rdbms\orcl\orcl\incident\incdir_49371\orcl_pr01_2040_i49371.trc
Fri Feb 08 21:09:21 2019
Errors in file c:\oracle\diag\rdbms\orcl\orcl\trace\orcl_pr03_1068.trc  (incident=49387):
ORA-00600: internal error code, arguments: [ktbsdp1], [], [], [], [], [], [], [], [], [], [], []
Incident details in: c:\oracle\diag\rdbms\orcl\orcl\incident\incdir_49387\orcl_pr03_1068_i49387.trc
Fri Feb 08 21:09:24 2019
Trace dumping is performing id=[cdmp_20190208210924]
Slave exiting with ORA-10562 exception
Errors in file c:\oracle\diag\rdbms\orcl\orcl\trace\orcl_pr03_1068.trc:
ORA-10562: Error occurred while applying redo to data block (file# 4, block# 1716972)
ORA-10564: tablespace USERS
ORA-01110: data file 4: 'S:\ORADATA\ORCL\USERS01.DBF'
ORA-10561: block type 'TRANSACTION MANAGED DATA BLOCK', data object# 204127
ORA-00600: internal error code, arguments: [ktbsdp1], [], [], [], [], [], [], [], [], [], [], []
Slave exiting with ORA-10562 exception
Errors in file c:\oracle\diag\rdbms\orcl\orcl\trace\orcl_pr02_3780.trc:
ORA-10562: Error occurred while applying redo to data block (file# 4, block# 1738552)
ORA-10564: tablespace USERS
ORA-01110: data file 4: 'S:\ORADATA\ORCL\USERS01.DBF'
ORA-10561: block type 'TRANSACTION MANAGED INDEX BLOCK', data object# 211606
ORA-00600: internal error code, arguments: [6101], [17], [21], [0], [], [], [], [], [], [], [], []
Slave exiting with ORA-10562 exception
Errors in file c:\oracle\diag\rdbms\orcl\orcl\trace\orcl_pr01_2040.trc:
ORA-10562: Error occurred while applying redo to data block (file# 4, block# 1725898)
ORA-10564: tablespace USERS
ORA-01110: data file 4: 'S:\ORADATA\ORCL\USERS01.DBF'
ORA-10561: block type 'TRANSACTION MANAGED DATA BLOCK', data object# 73907
ORA-00600: internal error code, arguments: [kdourp_inorder2], [34], [0], [0], [44], [], [], [], [], [], [], []
Recovery Slave PR03 previously exited with exception 10562
Fri Feb 08 21:09:28 2019
Errors in file c:\oracle\diag\rdbms\orcl\orcl\trace\orcl_pr04_2608.trc  (incident=49395):
ORA-00600: internal error code, arguments: [3020], [4], [1739291], [18516507], [], [], [], [], [], [], [], []
ORA-10567: Redo is inconsistent with data block (file# 4, block# 1739291, file offset is 1363369984 bytes)
ORA-10564: tablespace USERS
ORA-01110: data file 4: 'S:\ORADATA\ORCL\USERS01.DBF'
ORA-10561: block type 'TRANSACTION MANAGED DATA BLOCK', data object# 211552
Incident details in: c:\oracle\diag\rdbms\orcl\orcl\incident\incdir_49395\orcl_pr04_2608_i49395.trc
Slave exiting with ORA-600 exception
Errors in file c:\oracle\diag\rdbms\orcl\orcl\trace\orcl_pr04_2608.trc:
ORA-00600: internal error code, arguments: [3020], [4], [1739291], [18516507], [], [], [], [], [], [], [], []
ORA-10567: Redo is inconsistent with data block (file# 4, block# 1739291, file offset is 1363369984 bytes)
ORA-10564: tablespace USERS
ORA-01110: data file 4: 'S:\ORADATA\ORCL\USERS01.DBF'
ORA-10561: block type 'TRANSACTION MANAGED DATA BLOCK', data object# 211552
Media Recovery failed with error 448
Errors in file c:\oracle\diag\rdbms\orcl\orcl\trace\orcl_pr00_1548.trc:
ORA-00283: recovery session canceled due to errors
ORA-00448: normal completion of background process
Slave exiting with ORA-283 exception
Errors in file c:\oracle\diag\rdbms\orcl\orcl\trace\orcl_pr00_1548.trc:
ORA-00283: recovery session canceled due to errors
ORA-00448: normal completion of background process
ORA-10562 signalled during: ALTER DATABASE RECOVER  database  ...

出现上述问题主要是由于硬件突然故障,数据写丢失导致相关问题.

处理思路

RMAN> recover datafile 1;
启动 recover 于 09-2月 -19
使用通道 ORA_DISK_1
正在开始介质的恢复
介质恢复完成, 用时: 00:00:01
完成 recover 于 09-2月 -19
RMAN> recover datafile 2;
启动 recover 于 09-2月 -19
使用通道 ORA_DISK_1
正在开始介质的恢复
介质恢复完成, 用时: 00:00:01
完成 recover 于 09-2月 -19
RMAN> recover datafile 3;
启动 recover 于 09-2月 -19
使用通道 ORA_DISK_1
正在开始介质的恢复
介质恢复完成, 用时: 00:00:02
完成 recover 于 09-2月 -19
RMAN> recover datafile 4;
启动 recover 于 09-2月 -19
使用通道 ORA_DISK_1
正在开始介质的恢复
无法恢复介质
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: recover 命令 (在 02/09/2019 21:48:19 上) 失败
ORA-00283: recovery session canceled due to errors
RMAN-11003: 在分析/执行 SQL 语句期间失败: alter database recover if needed
 datafile 4
ORA-00283: 恢复会话因错误而取消
ORA-10562: Error occurred while applying redo to data block (file# 4, block# 172
5913)
ORA-10564: tablespace USERS
ORA-01110: 数据文件 4: 'S:\ORADATA\ORCL\USERS01.DBF'
ORA-10561: block type 'TRANSACTION MANAGED DATA BLOCK', data object# 73907
ORA-00600: 内部错误代码, 参数: [kdourp_inorder2], [34], [43], [44], [44], [], []
, [], [], [], [], []
SQL> recover datafile 4;
ORA-00283: 恢复会话因错误而取消
ORA-10562: Error occurred while applying redo to data block (file# 4, block#
1725913)
ORA-10564: tablespace USERS
ORA-01110: 数据文件 4: 'S:\ORADATA\ORCL\USERS01.DBF'
ORA-10561: block type 'TRANSACTION MANAGED DATA BLOCK', data object# 73907
ORA-00600: 内部错误代码, 参数: [kdourp_inorder2], [34], [43], [44], [44], [],
[], [], [], [], [], []
--通过bbed修改异常文件,屏蔽文件恢复,直接open库
SQL> alter database open;
数据库已更改。

数据库open之后,逻辑方式导出数据,重建新库,导入数据.

ORA-600 16513故障恢复

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:ORA-600 16513故障恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

接到一个朋友恢复请求,由于硬件故障,通过底层技术恢复数据文件之后,数据库启动报错,但是dbv检查没有发现坏块
ORA-600 16513报错

[oracle@ora11g ~]$ ss
SQL*Plus: Release 11.2.0.4.0 Production on Fri Jan 25 15:06:15 2019
Copyright (c) 1982, 2013, Oracle.  All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
SQL> startup
ORACLE instance started.
Total System Global Area 3056513024 bytes
Fixed Size                  2257152 bytes
Variable Size             704646912 bytes
Database Buffers         2332033024 bytes
Redo Buffers               17575936 bytes
Database mounted.
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00600: internal error code, arguments: [16513], [1403], [20], [], [], [],
[], [], [], [], [], []
Process ID: 20487
Session ID: 191 Serial number: 3

数据库alert日志报错

Fri Jan 25 14:15:18 2019
ALTER DATABASE OPEN
Thread 1 opened at log sequence 11
  Current log# 2 seq# 11 mem# 0: /u01/app/oracle/oradata/orcl11g/redo02.log
Successful open of redo thread 1
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
SMON: enabling cache recovery
Errors in file /u01/app/oracle/diag/rdbms/orcl11g/orcl11g/trace/orcl11g_ora_12712.trc  (incident=4953):
ORA-00600: internal error code, arguments: [16513], [1403], [20], [], [], [], [], [], [], [], [], []
Incident details in: /u01/app/oracle/diag/rdbms/orcl11g/orcl11g/incident/incdir_4953/orcl11g_ora_12712_i4953.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file /u01/app/oracle/diag/rdbms/orcl11g/orcl11g/trace/orcl11g_ora_12712.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00600: internal error code, arguments: [16513], [1403], [20], [], [], [], [], [], [], [], [], []
Errors in file /u01/app/oracle/diag/rdbms/orcl11g/orcl11g/trace/orcl11g_ora_12712.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00600: internal error code, arguments: [16513], [1403], [20], [], [], [], [], [], [], [], [], []
Error 704 happened during db open, shutting down database
USER (ospid: 12712): terminating the instance due to error 704
Instance terminated by USER, pid = 12712
ORA-1092 signalled during: ALTER DATABASE OPEN...
opiodr aborting process unknown ospid (12712) as a result of ORA-1092

trace文件报错

---orcl11g_ora_12712.trc文件报错
ORA-00600: internal error code, arguments: [16513], [1403], [20], [], [], [], [], [], [], [], [], []
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00600: internal error code, arguments: [16513], [1403], [20], [], [], [], [], [], [], [], [], []
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00600: internal error code, arguments: [16513], [1403], [20], [], [], [], [], [], [], [], [], []
*** 2019-01-25 14:15:20.875
USER (ospid: 12712): terminating the instance due to error 704
---orcl11g_ora_12712_i4953.trc文件报错
Dump continued from file: /u01/app/oracle/diag/rdbms/orcl11g/orcl11g/trace/orcl11g_ora_12712.trc
ORA-00600: internal error code, arguments: [16513], [1403], [20], [], [], [], [], [], [], [], [], []
========= Dump for incident 4953 (ORA 600 [16513]) ========
*** 2019-01-25 14:15:19.247
dbkedDefDump(): Starting incident default dumps (flags=0x2, level=3, mask=0x0)
----- Current SQL Statement for this session (sql_id=1h50ks4ncswfn) -----
ALTER DATABASE OPEN
----- Call Stack Trace -----
calling              call     entry                argument values in hex
location             type     point                (? means dubious value)
-------------------- -------- -------------------- ----------------------------
skdstdst()+41        call     kgdsdst()            000000000 ? 000000000 ?
                                                   7FFD7A0B37B0 ? 7FFD7A0B3888 ?
                                                   7FFD7A0B8330 ? 000000002 ?
ksedst1()+103        call     skdstdst()           000000000 ? 000000000 ?
                                                   7FFD7A0B37B0 ? 7FFD7A0B3888 ?
                                                   7FFD7A0B8330 ? 000000002 ?
ksedst()+39          call     ksedst1()            000000000 ? 000000001 ?
                                                   7FFD7A0B37B0 ? 7FFD7A0B3888 ?
                                                   7FFD7A0B8330 ? 000000002 ?
dbkedDefDump()+2746  call     ksedst()             000000000 ? 000000001 ?
                                                   7FFD7A0B37B0 ? 7FFD7A0B3888 ?
                                                   7FFD7A0B8330 ? 000000002 ?
ksedmp()+41          call     dbkedDefDump()       000000003 ? 000000002 ?
                                                   7FFD7A0B37B0 ? 7FFD7A0B3888 ?
                                                   7FFD7A0B8330 ? 000000002 ?
ksfdmp()+69          call     ksedmp()             000000003 ? 000000002 ?
                                                   7FFD7A0B37B0 ? 7FFD7A0B3888 ?
                                                   7FFD7A0B8330 ? 000000002 ?
dbgexPhaseII()+1764  call     ksfdmp()             000000003 ? 000000002 ?
                                                   7FFD7A0B37B0 ? 7FFD7A0B3888 ?
                                                   7FFD7A0B8330 ? 000000002 ?
dbgexProcessError()  call     dbgexPhaseII()       7F06FBCD2730 ? 7F06FBCD5520 ?
+2680                                              7FFD7A0BCE08 ? 7FFD7A0B3888 ?
                                                   7FFD7A0B8330 ? 000000002 ?
dbgeExecuteForError  call     dbgexProcessError()  7F06FBCD2730 ? 7F06FBCD5520 ?
()+88                                              000000001 ? 000000000 ?
                                                   7FFD7A0B8330 ? 000000002 ?
dbgePostErrorKGE()+  call     dbgeExecuteForError  7F06FBCD2730 ? 7F06FBCD5520 ?
2136                          ()                   000000001 ? 000000001 ?
                                                   000000000 ? 000000002 ?
dbkePostKGE_kgsf()+  call     dbgePostErrorKGE()   00C114E60 ? 7F06FB7B0040 ?
71                                                 000000258 ? 000000001 ?
                                                   000000000 ? 000000002 ?
kgeade()+351         call     dbkePostKGE_kgsf()   00C114E60 ? 7F06FB7B0040 ?
                                                   000000258 ? 000000001 ?
                                                   000000000 ? 000000002 ?
kgeriv_int()+125     call     kgeade()             00C114E60 ? 00C115020 ?
                                                   7F06FB7B0040 ? 000000258 ?
                                                   000000000 ? 000000002 ?
kgeriv()+17          call     kgeriv_int()         00C114E60 ? 00C115020 ?
                                                   7F06FB7B0040 ? 000000258 ?
                                                   000000000 ? 000000002 ?
kgesiv()+115         call     kgeriv()             00C114E60 ? 00C115020 ?
                                                   7F06FB7B0040 ? 000000258 ?
                                                   000000000 ? 000000002 ?
ksesic2()+199        call     kgesiv()             00C114E60 ? 7F06FB7B0040 ?
                                                   000004081 ? 000000002 ?
                                                   7FFD7A0BDAC0 ? 000000002 ?
kqdpts()+522         call     ksesic2()            00C114E60 ? 000000000 ?
                                                   00000057B ? 000000000 ?
                                                   00000003B ? 000008000 ?
kqrlfc()+685         call     kqdpts()             1158C5BC0 ? 000000000 ?
                                                   00000057B ? 000000000 ?
                                                   00000003B ? 000008000 ?
kqlbplc()+180        call     kqrlfc()             1158C5BC0 ? 7FFD7A0BDEC0 ?
                                                   00000057B ? 000000000 ?
                                                   00000003B ? 000008000 ?
kqlblfc()+280        call     kqlbplc()            000000000 ? 7FFD7A0BDEC0 ?
                                                   00000057B ? 000000000 ?
                                                   00000003B ? 000008000 ?
adbdrv()+57408       call     kqlblfc()            000000000 ? 7FFD7A0C4BD4 ?
                                                   00000057B ? 000000000 ?
                                                   00000003B ? 000008000 ?
opiexe()+18724       call     adbdrv()             000000000 ? 7FFD7A0C4BD4 ?
                                                   00000057B ? 000000000 ?
                                                   00000003B ? 000008000 ?
opiosq0()+4303       call     opiexe()             000000004 ? 000000000 ?
                                                   00000057B ? 000000000 ?
                                                   00000003B ? 000008000 ?
kpooprx()+274        call     opiosq0()            000000003 ? 00000000E ?
                                                   7FFD7A0C65B0 ? 0000000A4 ?
                                                   00000003B ? 000008000 ?
kpoal8()+842         call     kpooprx()            7FFD7A0C9D94 ? 7FFD7A0C7DB8 ?
                                                   000000013 ? 000000001 ?
                                                   000000000 ? 000008000 ?
opiodr()+917         call     kpoal8()             00000005E ? 7FFD7A0C7DB8 ?
                                                   000000013 ? 000000001 ?
                                                   000000000 ? 000008000 ?
ttcpip()+2183        call     opiodr()             00000005E ? 00000001C ?
                                                   7FFD7A0C9D90 ? 000000001 ?
                                                   000000000 ? 000008000 ?
opitsk()+1710        call     ttcpip()             00C132AB0 ? 0099D6610 ?
                                                   7FFD7A0C9D90 ? 000000000 ?
                                                   7FFD7A0C97E8 ? 7FFD7A0C9D8C ?
opiino()+969         call     opitsk()             00C132AB8 ? 000000000 ?
                                                   7FFD7A0C9D90 ? 000000000 ?
                                                   7FFD7A0C97E8 ? 7FFD7A0C9D8C ?
opiodr()+917         call     opiino()             00000003C ? 000000004 ?
                                                   7FFD7A0CB588 ? 000000000 ?
                                                   7FFD7A0C97E8 ? 7FFD7A0C9D8C ?
opidrv()+570         call     opiodr()             00000003C ? 000000004 ?
                                                   7FFD7A0CB588 ? 000000000 ?
                                                   7FFD7A0C97E8 ? 7FFD7A0C9D8C ?
sou2o()+103          call     opidrv()             00000003C ? 000000004 ?
                                                   7FFD7A0CB588 ? 000000000 ?
                                                   7FFD7A0C97E8 ? 7FFD7A0C9D8C ?
opimai_real()+133    call     sou2o()              7FFD7A0CB560 ? 00000003C ?
                                                   000000004 ? 7FFD7A0CB588 ?
                                                   7FFD7A0C97E8 ? 7FFD7A0C9D8C ?
ssthrdmain()+265     call     opimai_real()        000000002 ? 7FFD7A0CB750 ?
                                                   000000004 ? 7FFD7A0CB588 ?
                                                   7FFD7A0C97E8 ? 7FFD7A0C9D8C ?
main()+201           call     ssthrdmain()         000000002 ? 7FFD7A0CB750 ?
                                                   000000001 ? 000000000 ?
                                                   7FFD7A0C97E8 ? 7FFD7A0C9D8C ?
__libc_start_main()  call     main()               000000002 ? 7FFD7A0CB8F8 ?
+253                                               000000001 ? 000000000 ?
                                                   7FFD7A0C97E8 ? 7FFD7A0C9D8C ?
_start()+41          call     __libc_start_main()  000A2A5B4 ? 000000002 ?
                                                   7FFD7A0CB8E8 ? 000000000 ?
                                                   7FFD7A0C97E8 ? 7FFD7A0C9D8C ?
--------------------- Binary Stack Dump ---------------------

根据错误提示含ORA-00704错误,可以初步确定很可能是在数据库核心基表有问题导致该问题.trace数据库启动过程,发现在obj$中异常,通过跟踪定位到启动报错所在位置,然后通过bbed(参考:bbed 文章汇总)进行修改,再次尝试启动数据库
ORA-704 ORA-1555

SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 1
ORA-01555: snapshot too old: rollback segment number 10 with name
"_SYSSMU10_1197734989$" too small
Process ID: 17898
Session ID: 191 Serial number: 5

该错误比较熟悉,通过bbed继续修改相关信息(参考:在数据库open过程中常遇到ORA-01555汇总)启动数据库成功

SQL> startup mount;
ORACLE instance started.
Total System Global Area 3056513024 bytes
Fixed Size                  2257152 bytes
Variable Size             704646912 bytes
Database Buffers         2332033024 bytes
Redo Buffers               17575936 bytes
Database mounted.
SQL> alter database open;
Database altered.

尝试ddl操作报错误
ORA-600 kkdlcob-objn-exists

SQL>  create table t1 as select * from dual;
 create table t1 as select * from dual
                                  *
ERROR at line 1:
ORA-00600: internal error code, arguments: [kkdlcob-objn-exists], [87520], [],
[], [], [], [], [], [], [], [], []

该错误是由于字典obj$的dataobj#太小导致,修复该问题之后,有出现以下问题
ORA-8102

SQL> create table t1 as select * from dual;
create table t1 as select * from dual
                                 *
ERROR at line 1:
ORA-00604: error occurred at recursive SQL level 1
ORA-08102: index key not found, obj# 39, file 1, block 54781 (2)

这个错误比较明显由于obj$上的index异常导致,通过文章:bootstrap$核心index(I_OBJ1,I_USER1,I_FILE#_BLOCK#,I_IND1,I_TS#,I_CDEF1等)异常恢复—ORA-00701错误解决修复之后一切恢复正常

SQL> CREATE TABLE T1 AS SELECT * FROM DUAL;
Table created.
SQL> DROP TABLE T1 PURGE;
Table dropped.

至此数据库恢复基本完成,建议客户逻辑方式导出数据,导入到新库

tab$恢复错误汇总

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:tab$恢复错误汇总

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

在以前多以前发现的tab$被恶意脚本删除的问题(ORA-600 16703故障解析—tab$表被清空,警告:互联网中有oracle介质被注入恶意程序导致—ORA-600 16703),虽然多次强调注意Oracle安装介质安全,但是很不幸,还是大量客户中招.我们这一年多对于tab$的故障进行了大量case处理,拯救了大量客户的核心数据,也积累了一些常见的可能遭遇的错误.主要恢复思路是使用bbed处理异常block,让数据库open起来.
ORA-00704 ORA-39700
有核心基表处理异常导致

SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00704: bootstrap process failure
ORA-39700: database must be opened with UPGRADE option
Process ID: 1603
Session ID: 1 Serial number: 5
Sun Jan 06 21:30:14 2019
SMON: enabling cache recovery
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_1603.trc:
ORA-00704: bootstrap process failure
ORA-39700: database must be opened with UPGRADE option
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_1603.trc:
ORA-00704: bootstrap process failure
ORA-39700: database must be opened with UPGRADE option
Error 704 happened during db open, shutting down database
USER (ospid: 1603): terminating the instance due to error 704
Instance terminated by USER, pid = 1603
ORA-1092 signalled during: alter database open...
opiodr aborting process unknown ospid (1603) as a result of ORA-1092
Sun Jan 06 21:30:14 2019
ORA-1092 : opitsk aborting process

ora-704 ora-604 ora-01555
由于scn异常导致

SQL> alter database open upgrade;
alter database open upgrade
*
ERROR at line 1:
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 1
ORA-01555: snapshot too old: rollback segment number 2 with name
"_SYSSMU2_2996391332$" too small
Process ID: 26520
Session ID: 1 Serial number: 5
Sun Jan 06 19:49:12 2019
SMON: enabling cache recovery
ORA-01555 caused by SQL statement below (SQL ID: bqbdby3c400p7, SCN: 0x0022.1117ef75):
select rowcnt,blkcnt,empcnt,avgspc,chncnt,avgrln,nvl(degree,1), nvl(instances,1) from tab$ where obj# = :1
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_26520.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 1
ORA-01555: snapshot too old: rollback segment number 2 with name "_SYSSMU2_2996391332$" too small
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_26520.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 1
ORA-01555: snapshot too old: rollback segment number 2 with name "_SYSSMU2_2996391332$" too small
Error 704 happened during db open, shutting down database
USER (ospid: 26520): terminating the instance due to error 704
Instance terminated by USER, pid = 26520
ORA-1092 signalled during: alter database open upgrade...
opiodr aborting process unknown ospid (26520) as a result of ORA-1092

ORA-600 13304
有核心基表处理异常导致

SQL> startup mount;
ORACLE instance started.
Total System Global Area  521936896 bytes
Fixed Size                  2254824 bytes
Variable Size             352323608 bytes
Database Buffers          163577856 bytes
Redo Buffers                3780608 bytes
Database mounted.
SQL> alter database open upgrade;
alter database open upgrade
*
ERROR at line 1:
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [13304], [], [], [], [], [], [], [],
[], [], [], []
Process ID: 1724
Session ID: 1 Serial number: 5
Successful open of redo thread 1
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Sun Jan 06 21:31:04 2019
SMON: enabling cache recovery
[1724] Successfully onlined Undo Tablespace 2.
Undo initialization finished serial:0 start:2239884804 end:2239884864 diff:60 (0 seconds)
Verifying file header compatibility for 11g tablespace encryption..
Verifying 11g file header compatibility for tablespace encryption completed
SMON: enabling tx recovery
Database Characterset is ZHS16GBK
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_1724.trc  (incident=61755):
ORA-00600: internal error code, arguments: [13304], [], [], [], [], [], [], [], [], [], [], []
Incident details in: /u01/app/oracle/diag/rdbms/orcl/orcl/incident/incdir_61755/orcl_ora_1724_i61755.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_1724.trc:
ORA-00600: internal error code, arguments: [13304], [], [], [], [], [], [], [], [], [], [], []
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_1724.trc:
ORA-00600: internal error code, arguments: [13304], [], [], [], [], [], [], [], [], [], [], []
Error 600 happened during db open, shutting down database
USER (ospid: 1724): terminating the instance due to error 600
Instance terminated by USER, pid = 1724
ORA-1092 signalled during: alter database open upgrade...
opiodr aborting process unknown ospid (1724) as a result of ORA-1092
Sun Jan 06 21:31:06 2019
ORA-1092 : opitsk aborting process

ORA-00704 ORA-600 kdBlkCheckError
恢复的block有逻辑坏块

SQL> startup
ORACLE instance started.
Total System Global Area 3056513024 bytes
Fixed Size                  2257152 bytes
Variable Size             704646912 bytes
Database Buffers         2332033024 bytes
Redo Buffers               17575936 bytes
Database mounted.
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 1
ORA-00607: Internal error occurred while making a change to a data block
ORA-00600: internal error code, arguments: [kdBlkCheckError], [1], [31497],
[6121], [], [], [], [], [], [], [], []
Process ID: 76932
Session ID: 191 Serial number: 3
Successful open of redo thread 1
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
SMON: enabling cache recovery
Errors in file /u01/app/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_ora_76932.trc  (incident=6153):
ORA-00600: internal error code, arguments: [kdBlkCheckError], [1], [31497], [6121], [], [], [], [], [], [], [], []
Incident details in: /u01/app/oracle/diag/rdbms/xifenfei/xifenfei/incident/incdir_6153/xifenfei_ora_76932_i6153.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file /u01/app/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_ora_76932.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 1
ORA-00607: Internal error occurred while making a change to a data block
ORA-00600: internal error code, arguments: [kdBlkCheckError], [1], [31497], [6121], [], [], [], [], [], [], [], []
Errors in file /u01/app/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_ora_76932.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 1
ORA-00607: Internal error occurred while making a change to a data block
ORA-00600: internal error code, arguments: [kdBlkCheckError], [1], [31497], [6121], [], [], [], [], [], [], [], []
Error 704 happened during db open, shutting down database
USER (ospid: 76932): terminating the instance due to error 704
Instance terminated by USER, pid = 76932
ORA-1092 signalled during: ALTER DATABASE OPEN...
opiodr aborting process unknown ospid (76932) as a result of ORA-1092
Sat Feb 22 11:04:19 2014
ORA-1092 : opitsk aborting process

10g数据库遭遇ORA-600 16703

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:10g数据库遭遇ORA-600 16703

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

ORA-600 16703
有客户反馈10g数据库启动报ORA-00704 ORA-00600,具体如下

Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - 64bi
PL/SQL Release 10.2.0.3.0 - Production
CORE	10.2.0.3.0	Production
TNS for IBM/AIX RISC System/6000: Version 10.2.0.3.0 - Productio
NLSRTL Version 10.2.0.3.0 - Production
Fri Jan  4 10:11:05 2019
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Fri Jan  4 10:11:05 2019
SMON: enabling cache recovery
Fri Jan  4 10:11:05 2019
Errors in file /oracle/admin/xifenfei/udump/xifenfei1_ora_1826952.trc:
ORA-00600: internal error code, arguments: [16703], [1403], [28], [], [], [], [], []
Fri Jan  4 10:11:06 2019
Errors in file /oracle/admin/xifenfei/udump/xifenfei1_ora_1826952.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00600: internal error code, arguments: [16703], [1403], [28], [], [], [], [], []
Fri Jan  4 10:11:06 2019
Error 704 happened during db open, shutting down database
USER: terminating instance due to error 704
Instance terminated by USER, pid = 1826952
ORA-1092 signalled during: alter database open...

看到这个错误第一想到以前的ORA-00600: internal error code, arguments: [16703], [1403], [20], [], [], [], [], [], [], [], [], [],但是有几个地方不同:1)根据以往恢复经验这个错误都出现在11.2.0.4版本数据库中;2)以往的ORA-600 16703错误最后值是20,而这次是28

分析数据库故障过程
节点一因为其他错误重启,重启之后报大量错误,然后abort掉.

--节点1
Fri Jan  4 06:11:00 2019
Database Characterset is ZHS16GBK
replication_dependency_tracking turned off (no async multimaster replication found)
Starting background process QMNC
QMNC started with pid=29, OS id=2883786
Fri Jan  4 06:11:07 2019
SMON: Parallel transaction recovery tried
Fri Jan  4 06:11:07 2019
Errors in file /oracle/admin/xifenfei/udump/xifenfei1_ora_2769106.trc:
ORA-00600: internal error code, arguments: [16607], [0x70000020E4E7258], [257], [0], [], [], [], []
Fri Jan  4 06:11:09 2019
ORA-600 signalled during: ALTER DATABASE OPEN...
Fri Jan  4 06:11:09 2019
Trace dumping is performing id=[cdmp_20190104061109]
Fri Jan  4 06:11:30 2019
Errors in file /oracle/admin/xifenfei/udump/xifenfei1_ora_2240550.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-20001: the IP can not logon
ORA-06512: at line 36
Fri Jan  4 06:12:35 2019
Errors in file /oracle/admin/xifenfei/udump/xifenfei1_ora_2793508.trc:
ORA-00600: internal error code, arguments: [16659], [kqldtu], [D], [0], [65], [], [], []
Fri Jan  4 06:12:40 2019
Trace dumping is performing id=[cdmp_20190104061240]
Fri Jan  4 06:16:10 2019
Errors in file /oracle/admin/xifenfei/udump/xifenfei1_ora_1933540.trc:
ORA-00600: internal error code, arguments: [16607], [0x70000020D957260], [1793], [0], [], [], [], []
Fri Jan  4 06:16:10 2019
Errors in file /oracle/admin/xifenfei/udump/xifenfei1_ora_2793546.trc:
ORA-00600: internal error code, arguments: [16607], [0x70000020D957260], [1793], [0], [], [], [], []
Fri Jan  4 06:16:11 2019
Trace dumping is performing id=[cdmp_20190104061611]
Fri Jan  4 06:16:15 2019
Errors in file /oracle/admin/xifenfei/bdump/xifenfei1_smon_3055636.trc:
ORA-00600: internal error code, arguments: [insSetColumnInfo_1], [8], [8], [], [], [], [], []
Fri Jan  4 06:16:16 2019
Non-fatal internal error happenned while SMON was doing logging scn->time mapping.
SMON encountered 1 out of maximum 100 non-fatal internal errors.
Fri Jan  4 06:17:28 2019
Errors in file /oracle/admin/xifenfei/udump/xifenfei1_ora_2035774.trc:
ORA-00600: internal error code, arguments: [16607], [0x70000020D957260], [1793], [0], [], [], [], []
…………
Fri Jan  4 06:56:47 2019
Errors in file /oracle/admin/xifenfei/bdump/xifenfei1_smon_3055636.trc:
ORA-00600: internal error code, arguments: [insSetColumnInfo_1], [8], [8], [], [], [], [], []
Fri Jan  4 06:56:49 2019
Errors in file /oracle/admin/xifenfei/udump/xifenfei1_ora_3072030.trc:
ORA-07445: exception encountered: core dump [] [] [] [] [] []
ORA-06544: PL/SQL: internal error, arguments: [], [interpreter cannot interpret pcode], [], [], [], [], [], []
ORA-06544: PL/SQL: internal error, arguments: [], [interpreter cannot interpret pcode], [], [], [], [], [], []
Fri Jan  4 06:56:50 2019
Errors in file /oracle/admin/xifenfei/bdump/xifenfei1_pmon_3092524.trc:
ORA-00474: SMON process terminated with error
Fri Jan  4 06:56:50 2019
PMON: terminating instance due to error 474
Fri Jan  4 06:56:50 2019
Errors in file /oracle/admin/xifenfei/bdump/xifenfei1_lms0_3178590.trc:
ORA-00474: SMON process terminated with error
Fri Jan  4 06:56:50 2019
System state dump is made for local instance
System State dumped to trace file /oracle/admin/xifenfei/bdump/xifenfei1_diag_434192.trc
Fri Jan  4 06:56:51 2019
Shutting down instance (abort)

节点一重启触发故障之后,节点二也开始报错,然后数据库直接挂掉

--节点2
Fri Jan  4 06:21:19 2019
Trace dumping is performing id=[cdmp_20190104062118]
Fri Jan  4 06:21:22 2019
Errors in file /oracle/admin/xifenfei/bdump/xifenfei2_j000_1253440.trc:
ORA-00600: internal error code, arguments: [kghGetHpSz1], [0x7000001DF886220], [], [], [], [], [], []
Fri Jan  4 06:21:24 2019
Errors in file /oracle/admin/xifenfei/udump/xifenfei2_ora_1835262.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-20001: the IP can not logon
ORA-06512: at line 36
Fri Jan  4 06:21:56 2019
Errors in file /oracle/admin/xifenfei/bdump/xifenfei2_j002_1777888.trc:
ORA-07445: exception encountered: core dump [] [] [] [] [] []
Fri Jan  4 06:24:38 2019
Errors in file /oracle/admin/xifenfei/bdump/xifenfei2_j003_1589340.trc:
ORA-00600: internal error code, arguments: [16659], [kqldtu], [D], [0], [97952], [], [], []
Fri Jan  4 06:24:44 2019
Errors in file /oracle/admin/xifenfei/bdump/xifenfei2_j003_1589340.trc:
ORA-12012: error on auto execute of job 351
ORA-12008: error in materialized view refresh path
ORA-00600: internal error code, arguments: [16659], [kqldtu], [D], [0], [97952], [], [], []
Fri Jan  4 06:28:48 2019
Errors in file /oracle/admin/xifenfei/bdump/xifenfei2_j001_1417456.trc:
ORA-00600: internal error code, arguments: [kkzufst], [18446744073709551615], [], [], [], [], [], []
Fri Jan  4 06:28:49 2019
ORA-00600: internal error code, arguments: [kghstack_underflow_internal_2], [0x1104B2580], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [insSetColumnInfo_1], [8], [8], [], [], [], [], []
Fri Jan  4 07:30:25 2019
Errors in file /oracle/admin/xifenfei/bdump/xifenfei2_j002_2121800.trc:
ORA-00600: internal error code, arguments: [17147], [0x7000001D291D230], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [17147], [0x7000001D291D230], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [kggfaAllocFunc1], [], [], [], [], [], [], []
…………
Fri Jan  4 07:31:23 2019
Errors in file /oracle/admin/xifenfei/bdump/xifenfei2_pmon_634956.trc:
ORA-00600: internal error code, arguments: [17147], [0x7000001D291D230], [], [], [], [], [], []
Fri Jan  4 07:31:24 2019
Errors in file /oracle/admin/xifenfei/bdump/xifenfei2_pmon_634956.trc:
ORA-00600: internal error code, arguments: [17147], [0x7000001D291D230], [], [], [], [], [], []
Fri Jan  4 07:31:24 2019
PMON: terminating instance due to error 472
Fri Jan  4 07:31:24 2019
Errors in file /oracle/admin/xifenfei/bdump/xifenfei2_lms0_569432.trc:
ORA-00472: PMON  process terminated with error

两个节点再次启动报ORA-600 16703错误

Fri Jan  4 07:31:50 2019
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Fri Jan  4 07:31:50 2019
SMON: enabling cache recovery
Fri Jan  4 07:31:50 2019
Errors in file /oracle/admin/xifenfei/udump/xifenfei2_ora_3813516.trc:
ORA-00600: internal error code, arguments: [16703], [1403], [28], [], [], [], [], []
Fri Jan  4 07:31:52 2019
Errors in file /oracle/admin/xifenfei/udump/xifenfei2_ora_3813516.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00600: internal error code, arguments: [16703], [1403], [28], [], [], [], [], []
Fri Jan  4 07:31:52 2019
Error 704 happened during db open, shutting down database
USER: terminating instance due to error 704
Fri Jan  4 07:31:52 2019
Errors in file /oracle/admin/xifenfei/bdump/xifenfei2_lms0_3780764.trc:
ORA-00704: bootstrap process failure
Instance terminated by USER, pid = 3813516
ORA-1092 signalled during: ALTER DATABASE OPEN...

10046跟踪启动过程
ora-600-16703-10046


分析故障原因
通过分析,确认该数据库被注入了恶意脚本,当发生重启之后导致数据库核心基表被破坏,从而使得数据库无法正常启动

procedure     DBMS_DBMONITOR wrapped
a000000
369
abcd
abcd
abcd
abcd
abcd
abcd
abcd
abcd
abcd
abcd
abcd
abcd
abcd
abcd
abcd
7
15a 171
VKGnETw5GkgNlUvXadIYpQ2thwAwgzKur9zWZ47SR+pHN0bgoPvQ8ezdufYxFkzCmAQA8xT2
IQkFph+2IhlEBrPrXe14giDow/HzZI43FwLiMlynCYjnzQh3aXRSIVOalcwGAfUvgCip6Eng
OWA8Vq49YJ38WCPZjq2P5Dc428Wa1ZOPMb+E7GPs8ZOWM7RWsQdMzx/pqFncbX/tLwp0NY5E
Uu4E54MZ34yVtDQybwljVqp06KHqWN/ZwZJpvT+2gO4hRNX2UyE7laWCXzM2IR05BTGf2yoZ
+E7eIn6kciinFmhcUiBuszxE0pykt+moWZuuDuj9ebUXmj+0Mx7A+eQc6tp7wLHRCHYb6p0D
VtDc4f6

通过对损坏的字典进行恢复,实现数据0丢失
open-database


2019恢复第一单—ORA-704-ORA-702恢复

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:2019恢复第一单—ORA-704-ORA-702恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

19年元旦刚过就有客户反馈数据库无法正常启动报ORA-00704和ORA-00702错误
ora-00702


Successful open of redo thread 1
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
SMON: enabling cache recovery
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\trace\orcl_ora_3756.trc:
ORA-00704: 引导程序进程失败
ORA-00702: 引导程序版本 '' 与版本 '8.0.0.0.0' 不一致
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\trace\orcl_ora_3756.trc:
ORA-00704: 引导程序进程失败
ORA-00702: 引导程序版本 '' 与版本 '8.0.0.0.0' 不一致
Error 704 happened during db open, shutting down database
USER (ospid: 3756): terminating the instance due to error 704
Instance terminated by USER, pid = 3756

通过分析确认,确认由于数据库遭受到而已脚本注入,破坏数据库基表数据,导致数据库重启之后就无法正常启动,
1


这类故障可以通过对oracle基表进行恢复,实现数据库完美open,数据0丢失.如果无法自行解决,需要恢复支持请联系我们
Phone:17813235971    Q Q:107644445QQ咨询惜分飞    E-Mail:dba@xifenfei.com
2019年再次提醒:1)数据库做好备份容灾,2)请使用官方途径下载数据库介质和数据库访问工具,3)定期检查数据库是否被注入恶意脚本

Oracle Exadata坏盘导致磁盘组无法mount恢复

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:Oracle Exadata坏盘导致磁盘组无法mount恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

接到朋友求救有客户oracle exadata一体机 的 asm磁盘组无法mount,希望我们提供恢复支持服务
经过分析和了解,大致问题是:磁盘空间已经超容量使用(部分数据不能完成ASM镜像),最近又损坏一块盘,导致asm 磁盘组无法mount。我们分析后,通过重构exadata celldisk数据,将asm 磁盘组 mount成功后,实现五套数据库全部open成功(由于底层磁盘部分数据损坏,导致部分数据访问报错,需要在oracle层面进行处理)。

本次问题的具体分析和处理如下:
存放数据库文件的磁盘组不能mount

Wed Dec 12 21:29:04 2018
SQL> alter diskgroup DATA_XFF mount force
NOTE: cache registered group DATA_XFF number=1 incarn=0x5fe882cb
NOTE: cache began mount (first) of group DATA_XFF number=1 incarn=0x5fe882cb
NOTE: Assigning number (1,36) to disk (o/192.168.10.5/DATA_XFF_CD_11_XFFCEL03)
NOTE: Assigning number (1,34) to disk (o/192.168.10.5/DATA_XFF_CD_10_XFFCEL03)
NOTE: Assigning number (1,37) to disk (o/192.168.10.5/DATA_XFF_CD_04_XFFCEL03)
NOTE: Assigning number (1,38) to disk (o/192.168.10.5/DATA_XFF_CD_00_XFFCEL03)
NOTE: Assigning number (1,39) to disk (o/192.168.10.5/DATA_XFF_CD_03_XFFCEL03)
NOTE: Assigning number (1,40) to disk (o/192.168.10.5/DATA_XFF_CD_05_XFFCEL03)
NOTE: Assigning number (1,41) to disk (o/192.168.10.5/DATA_XFF_CD_08_XFFCEL03)
NOTE: Assigning number (1,42) to disk (o/192.168.10.5/DATA_XFF_CD_01_XFFCEL03)
NOTE: Assigning number (1,43) to disk (o/192.168.10.5/DATA_XFF_CD_09_XFFCEL03)
NOTE: Assigning number (1,44) to disk (o/192.168.10.5/DATA_XFF_CD_06_XFFCEL03)
NOTE: Assigning number (1,45) to disk (o/192.168.10.5/DATA_XFF_CD_07_XFFCEL03)
NOTE: Assigning number (1,46) to disk (o/192.168.10.5/DATA_XFF_CD_02_XFFCEL03)
NOTE: Assigning number (1,22) to disk (o/192.168.10.4/DATA_XFF_CD_10_XFFCEL02)
NOTE: Assigning number (1,18) to disk (o/192.168.10.4/DATA_XFF_CD_06_XFFCEL02)
NOTE: Assigning number (1,19) to disk (o/192.168.10.4/DATA_XFF_CD_07_XFFCEL02)
NOTE: Assigning number (1,15) to disk (o/192.168.10.4/DATA_XFF_CD_03_XFFCEL02)
NOTE: Assigning number (1,20) to disk (o/192.168.10.4/DATA_XFF_CD_08_XFFCEL02)
NOTE: Assigning number (1,17) to disk (o/192.168.10.4/DATA_XFF_CD_05_XFFCEL02)
NOTE: Assigning number (1,16) to disk (o/192.168.10.4/DATA_XFF_CD_04_XFFCEL02)
NOTE: Assigning number (1,23) to disk (o/192.168.10.4/DATA_XFF_CD_11_XFFCEL02)
NOTE: Assigning number (1,12) to disk (o/192.168.10.4/DATA_XFF_CD_00_XFFCEL02)
NOTE: Assigning number (1,21) to disk (o/192.168.10.4/DATA_XFF_CD_09_XFFCEL02)
NOTE: Assigning number (1,13) to disk (o/192.168.10.4/DATA_XFF_CD_01_XFFCEL02)
NOTE: Assigning number (1,14) to disk (o/192.168.10.4/DATA_XFF_CD_02_XFFCEL02)
NOTE: Assigning number (1,1) to disk (o/192.168.10.3/DATA_XFF_CD_05_XFFCEL01)
NOTE: Assigning number (1,2) to disk (o/192.168.10.3/DATA_XFF_CD_03_XFFCEL01)
NOTE: Assigning number (1,3) to disk (o/192.168.10.3/DATA_XFF_CD_06_XFFCEL01)
NOTE: Assigning number (1,4) to disk (o/192.168.10.3/DATA_XFF_CD_09_XFFCEL01)
NOTE: Assigning number (1,5) to disk (o/192.168.10.3/DATA_XFF_CD_04_XFFCEL01)
NOTE: Assigning number (1,6) to disk (o/192.168.10.3/DATA_XFF_CD_07_XFFCEL01)
NOTE: Assigning number (1,7) to disk (o/192.168.10.3/DATA_XFF_CD_11_XFFCEL01)
NOTE: Assigning number (1,8) to disk (o/192.168.10.3/DATA_XFF_CD_01_XFFCEL01)
NOTE: Assigning number (1,9) to disk (o/192.168.10.3/DATA_XFF_CD_00_XFFCEL01)
NOTE: Assigning number (1,10) to disk (o/192.168.10.3/DATA_XFF_CD_10_XFFCEL01)
NOTE: Assigning number (1,11) to disk (o/192.168.10.3/DATA_XFF_CD_08_XFFCEL01)
Wed Dec 12 21:29:10 2018
NOTE: GMON heartbeating for grp 1
GMON querying group 1 at 101 for pid 27, osid 62541
NOTE: Assigning number (1,0) to disk ()
GMON querying group 1 at 102 for pid 27, osid 62541
NOTE: process _user62541_+asm2 (62541) initiating offline of disk 0.3915937355 () with mask 0x7e[0x7f] in group 1
NOTE: initiating PST update: grp = 1, dsk = 0/0xe968764b, mask = 0x6a, op = clear
GMON updating disk modes for group 1 at 103 for pid 27, osid 62541
ERROR: Disk 0 cannot be offlined, since all the disks [0, 25] with mirrored data would be offline.
ERROR: too many offline disks in PST (grp 1)
WARNING: Offline of disk 0 () in group 1 and mode 0x7f failed on ASM inst 2
NOTE: cache dismounting (not clean) group 1/0x5FE882CB (DATA_XFF)
NOTE: dbwr not being msg'd to dismount
NOTE: lgwr not being msg'd to dismount
NOTE: cache dismounted group 1/0x5FE882CB (DATA_XFF)
NOTE: cache ending mount (fail) of group DATA_XFF number=1 incarn=0x5fe882cb
NOTE: cache deleting context for group DATA_XFF 1/0x5fe882cb
GMON dismounting group 1 at 104 for pid 27, osid 62541
ERROR: diskgroup DATA_XFF was not mounted
ORA-15032: not all alterations performed
ORA-15040: diskgroup is incomplete
ORA-15066: offlining disk "0" in group "DATA_XFF" may result in a data loss
ORA-15042: ASM disk "0" is missing from group number "1"
ERROR: alter diskgroup DATA_XFF mount force

检查底层损坏情况

CellCLI>   list physicaldisk
         20:0            KN3VZL          normal
         20:1            KNAWLL          normal
         20:2            KN4E4L          warning - predictive failure, poor performance
         20:3            KNAN5L          normal
         20:4            KMJKYL          normal
         20:5            KN5DGL          normal
         20:6            KMDLWL          normal
         20:7            KMDKPL          normal
         20:8            KMDA7L          normal
         20:9            KN1YJL          normal
         20:10           KMH1YL          normal
         20:11           KMVHAL          normal
CellCLI>   list griddisk
         DATA_XFF_CD_00_XFFCEL01       active
         DATA_XFF_CD_01_XFFCEL01       active
         DATA_XFF_CD_02_XFFCEL01       proactive failure
         DATA_XFF_CD_03_XFFCEL01       active
         DATA_XFF_CD_04_XFFCEL01       active
         DATA_XFF_CD_05_XFFCEL01       active
         DATA_XFF_CD_06_XFFCEL01       active
         DATA_XFF_CD_07_XFFCEL01       active
         DATA_XFF_CD_08_XFFCEL01       active
         DATA_XFF_CD_09_XFFCEL01       active
         DATA_XFF_CD_10_XFFCEL01       active
         DATA_XFF_CD_11_XFFCEL01       active

在db节点无法发现异常磁盘的asm disk

[grid@ycdwdb01 grid]$ kfod disk=all
--------------------------------------------------------------------------------
 Disk          Size Path                                     User     Group
============================================================
   1:     433152 Mb o/192.168.10.3/DATA_XFF_CD_00_XFFCEL01 <unknown> <unknown>
   2:     433152 Mb o/192.168.10.3/DATA_XFF_CD_01_XFFCEL01 <unknown> <unknown>
   3:     433152 Mb o/192.168.10.3/DATA_XFF_CD_03_XFFCEL01 <unknown> <unknown>
   4:     433152 Mb o/192.168.10.3/DATA_XFF_CD_04_XFFCEL01 <unknown> <unknown>
   5:     433152 Mb o/192.168.10.3/DATA_XFF_CD_05_XFFCEL01 <unknown> <unknown>
   6:     433152 Mb o/192.168.10.3/DATA_XFF_CD_06_XFFCEL01 <unknown> <unknown>
   7:     433152 Mb o/192.168.10.3/DATA_XFF_CD_07_XFFCEL01 <unknown> <unknown>
   8:     433152 Mb o/192.168.10.3/DATA_XFF_CD_08_XFFCEL01 <unknown> <unknown>
   9:     433152 Mb o/192.168.10.3/DATA_XFF_CD_09_XFFCEL01 <unknown> <unknown>
  10:     433152 Mb o/192.168.10.3/DATA_XFF_CD_10_XFFCEL01 <unknown> <unknown>
  11:     433152 Mb o/192.168.10.3/DATA_XFF_CD_11_XFFCEL01 <unknown> <unknown>

根据客户的反馈该磁盘组几乎全部被使用,asmcmd lsdg看到Usable_file_MB已经出现负值.证明该磁盘组本身的normal没有完全存储两份数据,在这样的情况下,继续坏盘会导致部分数据只有一份,因此也就出现了这里的磁盘组无法正常mount成功.

通过底层修复celldisk之后

CellCLI>  list griddisk
         DATA_XFF_CD_00_XFFCEL01       active
         DATA_XFF_CD_01_XFFCEL01       active
         DATA_XFF_CD_02_XFFCEL01       active
         DATA_XFF_CD_03_XFFCEL01       active
         DATA_XFF_CD_04_XFFCEL01       active
         DATA_XFF_CD_05_XFFCEL01       active
         DATA_XFF_CD_06_XFFCEL01       active
         DATA_XFF_CD_07_XFFCEL01       active
         DATA_XFF_CD_08_XFFCEL01       active
         DATA_XFF_CD_09_XFFCEL01       active
         DATA_XFF_CD_10_XFFCEL01       active
         DATA_XFF_CD_11_XFFCEL01       active
[grid@ycdwdb01 grid]$ kfod disk=all
--------------------------------------------------------------------------------
 Disk          Size Path                                     User     Group
============================================================
   1:     433152 Mb o/192.168.10.3/DATA_XFF_CD_00_XFFCEL01 <unknown> <unknown>
   2:     433152 Mb o/192.168.10.3/DATA_XFF_CD_01_XFFCEL01 <unknown> <unknown>
   3:     433152 Mb o/192.168.10.3/DATA_XFF_CD_02_XFFCEL01 <unknown> <unknown>
   4:     433152 Mb o/192.168.10.3/DATA_XFF_CD_03_XFFCEL01 <unknown> <unknown>
   5:     433152 Mb o/192.168.10.3/DATA_XFF_CD_04_XFFCEL01 <unknown> <unknown>
   6:     433152 Mb o/192.168.10.3/DATA_XFF_CD_05_XFFCEL01 <unknown> <unknown>
   7:     433152 Mb o/192.168.10.3/DATA_XFF_CD_06_XFFCEL01 <unknown> <unknown>
   8:     433152 Mb o/192.168.10.3/DATA_XFF_CD_07_XFFCEL01 <unknown> <unknown>
   9:     433152 Mb o/192.168.10.3/DATA_XFF_CD_08_XFFCEL01 <unknown> <unknown>
  10:     433152 Mb o/192.168.10.3/DATA_XFF_CD_09_XFFCEL01 <unknown> <unknown>
  11:     433152 Mb o/192.168.10.3/DATA_XFF_CD_10_XFFCEL01 <unknown> <unknown>
  12:     433152 Mb o/192.168.10.3/DATA_XFF_CD_11_XFFCEL01 <unknown> <unknown>

data磁盘组直接mount成功

Fri Dec 14 14:04:59 2018
SQL> alter diskgroup DATA_XFF mount
NOTE: cache registered group DATA_XFF number=1 incarn=0x78a886e7
NOTE: cache began mount (not first) of group DATA_XFF number=1 incarn=0x78a886e7
NOTE: Assigning number (1,36) to disk (o/192.168.10.5/DATA_XFF_CD_11_XFFCEL03)
NOTE: Assigning number (1,34) to disk (o/192.168.10.5/DATA_XFF_CD_10_XFFCEL03)
NOTE: Assigning number (1,37) to disk (o/192.168.10.5/DATA_XFF_CD_04_XFFCEL03)
NOTE: Assigning number (1,38) to disk (o/192.168.10.5/DATA_XFF_CD_00_XFFCEL03)
NOTE: Assigning number (1,39) to disk (o/192.168.10.5/DATA_XFF_CD_03_XFFCEL03)
NOTE: Assigning number (1,40) to disk (o/192.168.10.5/DATA_XFF_CD_05_XFFCEL03)
NOTE: Assigning number (1,41) to disk (o/192.168.10.5/DATA_XFF_CD_08_XFFCEL03)
NOTE: Assigning number (1,42) to disk (o/192.168.10.5/DATA_XFF_CD_01_XFFCEL03)
NOTE: Assigning number (1,43) to disk (o/192.168.10.5/DATA_XFF_CD_09_XFFCEL03)
NOTE: Assigning number (1,44) to disk (o/192.168.10.5/DATA_XFF_CD_06_XFFCEL03)
NOTE: Assigning number (1,45) to disk (o/192.168.10.5/DATA_XFF_CD_07_XFFCEL03)
NOTE: Assigning number (1,46) to disk (o/192.168.10.5/DATA_XFF_CD_02_XFFCEL03)
NOTE: Assigning number (1,22) to disk (o/192.168.10.4/DATA_XFF_CD_10_XFFCEL02)
NOTE: Assigning number (1,18) to disk (o/192.168.10.4/DATA_XFF_CD_06_XFFCEL02)
NOTE: Assigning number (1,19) to disk (o/192.168.10.4/DATA_XFF_CD_07_XFFCEL02)
NOTE: Assigning number (1,15) to disk (o/192.168.10.4/DATA_XFF_CD_03_XFFCEL02)
NOTE: Assigning number (1,20) to disk (o/192.168.10.4/DATA_XFF_CD_08_XFFCEL02)
NOTE: Assigning number (1,17) to disk (o/192.168.10.4/DATA_XFF_CD_05_XFFCEL02)
NOTE: Assigning number (1,16) to disk (o/192.168.10.4/DATA_XFF_CD_04_XFFCEL02)
NOTE: Assigning number (1,23) to disk (o/192.168.10.4/DATA_XFF_CD_11_XFFCEL02)
NOTE: Assigning number (1,12) to disk (o/192.168.10.4/DATA_XFF_CD_00_XFFCEL02)
NOTE: Assigning number (1,21) to disk (o/192.168.10.4/DATA_XFF_CD_09_XFFCEL02)
NOTE: Assigning number (1,13) to disk (o/192.168.10.4/DATA_XFF_CD_01_XFFCEL02)
NOTE: Assigning number (1,14) to disk (o/192.168.10.4/DATA_XFF_CD_02_XFFCEL02)
NOTE: Assigning number (1,1) to disk (o/192.168.10.3/DATA_XFF_CD_05_XFFCEL01)
NOTE: Assigning number (1,2) to disk (o/192.168.10.3/DATA_XFF_CD_03_XFFCEL01)
NOTE: Assigning number (1,3) to disk (o/192.168.10.3/DATA_XFF_CD_06_XFFCEL01)
NOTE: Assigning number (1,4) to disk (o/192.168.10.3/DATA_XFF_CD_09_XFFCEL01)
NOTE: Assigning number (1,5) to disk (o/192.168.10.3/DATA_XFF_CD_04_XFFCEL01)
NOTE: Assigning number (1,6) to disk (o/192.168.10.3/DATA_XFF_CD_07_XFFCEL01)
NOTE: Assigning number (1,7) to disk (o/192.168.10.3/DATA_XFF_CD_11_XFFCEL01)
NOTE: Assigning number (1,8) to disk (o/192.168.10.3/DATA_XFF_CD_01_XFFCEL01)
NOTE: Assigning number (1,9) to disk (o/192.168.10.3/DATA_XFF_CD_00_XFFCEL01)
NOTE: Assigning number (1,10) to disk (o/192.168.10.3/DATA_XFF_CD_10_XFFCEL01)
NOTE: Assigning number (1,11) to disk (o/192.168.10.3/DATA_XFF_CD_08_XFFCEL01)
NOTE: Assigning number (1,0) to disk (o/192.168.10.3/DATA_XFF_CD_02_XFFCEL01)
Fri Dec 14 14:04:59 2018
GMON querying group 1 at 78 for pid 28, osid 76016
NOTE: Assigning number (1,24) to disk ()
NOTE: Assigning number (1,25) to disk ()
NOTE: Assigning number (1,26) to disk ()
NOTE: Assigning number (1,27) to disk ()
NOTE: Assigning number (1,28) to disk ()
NOTE: Assigning number (1,29) to disk ()
NOTE: Assigning number (1,30) to disk ()
NOTE: Assigning number (1,31) to disk ()
NOTE: Assigning number (1,32) to disk ()
NOTE: Assigning number (1,33) to disk ()
NOTE: Assigning number (1,35) to disk ()
GMON querying group 1 at 79 for pid 28, osid 76016
NOTE: cache opening disk 0 of grp 1: DATA_XFF_CD_02_XFFCEL01 path:o/192.168.10.3/DATA_XFF_CD_02_XFFCEL01
NOTE: cache opening disk 1 of grp 1: DATA_XFF_CD_05_XFFCEL01 path:o/192.168.10.3/DATA_XFF_CD_05_XFFCEL01
NOTE: cache opening disk 2 of grp 1: DATA_XFF_CD_03_XFFCEL01 path:o/192.168.10.3/DATA_XFF_CD_03_XFFCEL01
NOTE: F1X0 found on disk 2 au 5 fcn 0.15948262
NOTE: cache opening disk 3 of grp 1: DATA_XFF_CD_06_XFFCEL01 path:o/192.168.10.3/DATA_XFF_CD_06_XFFCEL01
NOTE: cache opening disk 4 of grp 1: DATA_XFF_CD_09_XFFCEL01 path:o/192.168.10.3/DATA_XFF_CD_09_XFFCEL01
NOTE: cache opening disk 5 of grp 1: DATA_XFF_CD_04_XFFCEL01 path:o/192.168.10.3/DATA_XFF_CD_04_XFFCEL01
NOTE: cache opening disk 6 of grp 1: DATA_XFF_CD_07_XFFCEL01 path:o/192.168.10.3/DATA_XFF_CD_07_XFFCEL01
NOTE: cache opening disk 7 of grp 1: DATA_XFF_CD_11_XFFCEL01 path:o/192.168.10.3/DATA_XFF_CD_11_XFFCEL01
NOTE: cache opening disk 8 of grp 1: DATA_XFF_CD_01_XFFCEL01 path:o/192.168.10.3/DATA_XFF_CD_01_XFFCEL01
NOTE: cache opening disk 9 of grp 1: DATA_XFF_CD_00_XFFCEL01 path:o/192.168.10.3/DATA_XFF_CD_00_XFFCEL01
NOTE: cache opening disk 10 of grp 1: DATA_XFF_CD_10_XFFCEL01 path:o/192.168.10.3/DATA_XFF_CD_10_XFFCEL01
NOTE: cache opening disk 11 of grp 1: DATA_XFF_CD_08_XFFCEL01 path:o/192.168.10.3/DATA_XFF_CD_08_XFFCEL01
NOTE: cache opening disk 12 of grp 1: DATA_XFF_CD_00_XFFCEL02 path:o/192.168.10.4/DATA_XFF_CD_00_XFFCEL02
NOTE: cache opening disk 13 of grp 1: DATA_XFF_CD_01_XFFCEL02 path:o/192.168.10.4/DATA_XFF_CD_01_XFFCEL02
NOTE: cache opening disk 14 of grp 1: DATA_XFF_CD_02_XFFCEL02 path:o/192.168.10.4/DATA_XFF_CD_02_XFFCEL02
NOTE: cache opening disk 15 of grp 1: DATA_XFF_CD_03_XFFCEL02 path:o/192.168.10.4/DATA_XFF_CD_03_XFFCEL02
NOTE: cache opening disk 16 of grp 1: DATA_XFF_CD_04_XFFCEL02 path:o/192.168.10.4/DATA_XFF_CD_04_XFFCEL02
NOTE: cache opening disk 17 of grp 1: DATA_XFF_CD_05_XFFCEL02 path:o/192.168.10.4/DATA_XFF_CD_05_XFFCEL02
NOTE: cache opening disk 18 of grp 1: DATA_XFF_CD_06_XFFCEL02 path:o/192.168.10.4/DATA_XFF_CD_06_XFFCEL02
NOTE: cache opening disk 19 of grp 1: DATA_XFF_CD_07_XFFCEL02 path:o/192.168.10.4/DATA_XFF_CD_07_XFFCEL02
NOTE: cache opening disk 20 of grp 1: DATA_XFF_CD_08_XFFCEL02 path:o/192.168.10.4/DATA_XFF_CD_08_XFFCEL02
NOTE: cache opening disk 21 of grp 1: DATA_XFF_CD_09_XFFCEL02 path:o/192.168.10.4/DATA_XFF_CD_09_XFFCEL02
NOTE: F1X0 found on disk 21 au 2 fcn 0.15948262
NOTE: cache opening disk 22 of grp 1: DATA_XFF_CD_10_XFFCEL02 path:o/192.168.10.4/DATA_XFF_CD_10_XFFCEL02
NOTE: cache opening disk 23 of grp 1: DATA_XFF_CD_11_XFFCEL02 path:o/192.168.10.4/DATA_XFF_CD_11_XFFCEL02
NOTE: cache opening disk 36 of grp 1: DATA_XFF_CD_11_XFFCEL03 path:o/192.168.10.5/DATA_XFF_CD_11_XFFCEL03
NOTE: cache opening disk 37 of grp 1: DATA_XFF_CD_04_XFFCEL03 path:o/192.168.10.5/DATA_XFF_CD_04_XFFCEL03
NOTE: cache opening disk 38 of grp 1: DATA_XFF_CD_00_XFFCEL03 path:o/192.168.10.5/DATA_XFF_CD_00_XFFCEL03
NOTE: cache opening disk 39 of grp 1: DATA_XFF_CD_03_XFFCEL03 path:o/192.168.10.5/DATA_XFF_CD_03_XFFCEL03
NOTE: cache opening disk 40 of grp 1: DATA_XFF_CD_05_XFFCEL03 path:o/192.168.10.5/DATA_XFF_CD_05_XFFCEL03
NOTE: cache opening disk 41 of grp 1: DATA_XFF_CD_08_XFFCEL03 path:o/192.168.10.5/DATA_XFF_CD_08_XFFCEL03
NOTE: cache opening disk 42 of grp 1: DATA_XFF_CD_01_XFFCEL03 path:o/192.168.10.5/DATA_XFF_CD_01_XFFCEL03
NOTE: cache opening disk 43 of grp 1: DATA_XFF_CD_09_XFFCEL03 path:o/192.168.10.5/DATA_XFF_CD_09_XFFCEL03
NOTE: cache opening disk 44 of grp 1: DATA_XFF_CD_06_XFFCEL03 path:o/192.168.10.5/DATA_XFF_CD_06_XFFCEL03
NOTE: F1X0 found on disk 44 au 2 fcn 0.15948262
NOTE: cache opening disk 45 of grp 1: DATA_XFF_CD_07_XFFCEL03 path:o/192.168.10.5/DATA_XFF_CD_07_XFFCEL03
NOTE: cache opening disk 46 of grp 1: DATA_XFF_CD_02_XFFCEL03 path:o/192.168.10.5/DATA_XFF_CD_02_XFFCEL03
NOTE: cache mounting (not first) normal redundancy group 1/0x78A886E7 (DATA_XFF)
Fri Dec 14 14:04:59 2018
kjbdomatt send to inst 2
Fri Dec 14 14:04:59 2018
NOTE: attached to recovery domain 1
NOTE: redo buffer size is 512 blocks (2101760 bytes)
Fri Dec 14 14:04:59 2018
NOTE: LGWR attempting to mount thread 2 for diskgroup 1 (DATA_XFF)
NOTE: LGWR found thread 2 closed at ABA 98.4672
NOTE: LGWR mounted thread 2 for diskgroup 1 (DATA_XFF)
NOTE: LGWR opening thread 2 at fcn 0.18931129 ABA 99.4673
NOTE: cache mounting group 1/0x78A886E7 (DATA_XFF) succeeded
NOTE: cache ending mount (success) of group DATA_XFF number=1 incarn=0x78a886e7
GMON querying group 1 at 80 for pid 19, osid 9805
Fri Dec 14 14:04:59 2018
NOTE: Instance updated compatible.asm to 11.2.0.3.0 for grp 1
SUCCESS: diskgroup DATA_XFF was mounted
SUCCESS: alter diskgroup DATA_XFF mount

恢复后的asm磁盘状态

ASMCMD> lsdg
State    Type    Rebal  Sector  Block       AU  Total_MB  Free_MB  Req_mir_free_MB  Usable_file_MB  Offline_disks  Voting_files  Name
MOUNTED  NORMAL  Y         512   4096  4194304  15160320  4776184          5197824         -210820             12             N  DATA_XFF/
MOUNTED  NORMAL  N         512   4096  4194304    864896   863400           298240          282580              0             Y  DBFS_DG/
MOUNTED  NORMAL  N         512   4096  4194304   3787840  2157232          1298688          429272              0             N  RECO_XFF/

后续数据库open成功,有部分坏块通过技术手段进行二次处理,至此数据库恢复完成,成功抢救了客户Oracle Exadata中的绝大部分数据.如果有类似xd故障恢复,无法自行解决,需要恢复支持请联系我们
Phone:17813235971    Q Q:107644445QQ咨询惜分飞    E-Mail:dba@xifenfei.com

ORA-00704 ORA-00604 ORA-00942故障恢复

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:ORA-00704 ORA-00604 ORA-00942故障恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

接到客户请求,oracle数据库停机重启维护之后,无法正常启动,请求我们给予协助
数据库启动报ORA-00704 ORA-00604 ORA-00942错误

SQL> select * from v$version;
BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
PL/SQL Release 11.2.0.1.0 - Production
CORE    11.2.0.1.0      Production
TNS for 32-bit Windows: Version 11.2.0.1.0 - Production
NLSRTL Version 11.2.0.1.0 - Production
SQL> startup open
ORACLE 例程已经启动。
Total System Global Area 1288949760 bytes
Fixed Size                  1376520 bytes
Variable Size             377491192 bytes
Database Buffers          897581056 bytes
Redo Buffers               12500992 bytes
数据库装载完毕。
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 1
ORA-00942: table or view does not exist
进程 ID: 2756
会话 ID: 5 序列号: 9

alert日志报错

SMON: enabling cache recovery
Errors in file d:\app\administrator\diag\rdbms\xff\xff\trace\xff_ora_2756.trc:
ORA-00704: 引导程序进程失败
ORA-00604: 递归 SQL 级别 1 出现错误
ORA-00942: 表或视图不存在
Errors in file d:\app\administrator\diag\rdbms\xff\xff\trace\xff_ora_2756.trc:
ORA-00704: 引导程序进程失败
ORA-00604: 递归 SQL 级别 1 出现错误
ORA-00942: 表或视图不存在
Error 704 happened during db open, shutting down database
USER (ospid: 2756): terminating the instance due to error 704
Instance terminated by USER, pid = 2756
ORA-1092 signalled during: ALTER DATABASE OPEN...
opiodr aborting process unknown ospid (2756) as a result of ORA-1092
Fri Nov 30 12:51:26 2018
ORA-1092 : opitsk aborting process

根据这些年的恢复经验,恢复过相关错误的主要有:
ORA-01092 ORA-00704 ORA-00942
Oracle 11g丢失access$恢复方法
11.1.0.7版本也会出现access$表丢失导致数据库无法启动
总结主要两类:1. 由于某种bug导致access$表丢失的故障,另外一种是由于坏块导致数据库核心基表损坏引起,对于这个库进行分析

dbv 检查坏块

D:\APP\ADMINISTRATOR\ORADATA\ORCL> dbv file=system01.dbf
DBVERIFY: Release 10.2.0.3.0 - Production on 星期五 11月 30 15:11:24 2018
Copyright (c) 1982, 2005, Oracle.  All rights reserved.
DBVERIFY - 开始验证: FILE = system01.dbf
DBVERIFY - 验证完成
检查的页总数: 93440
处理的页总数 (数据): 61979
失败的页总数 (数据): 0
处理的页总数 (索引): 13560
失败的页总数 (索引): 0
处理的页总数 (其它): 3067
处理的总页数 (段)  : 0
失败的总页数 (段)  : 0
空的页总数: 14834
标记为损坏的总页数: 0
流入的页总数: 0
最高块 SCN            : 659587683 (0.659587683)

通过dbv检查数据库,确定system没有坏块
10046跟踪数据库启动

SQL> startup mount;
ORACLE 例程已经启动。
Total System Global Area 1288949760 bytes
Fixed Size                  1376520 bytes
Variable Size             377491192 bytes
Database Buffers          897581056 bytes
Redo Buffers               12500992 bytes
数据库装载完毕。
SQL> oradebug setmypid
已处理的语句
SQL> alter session set db_file_multiblocK_read_count=1;
会话已更改。
SQL> ALTER SESSION SET EVENTS '704 trace name errorstack level 3';
会话已更改。
SQL> oradebug EVENT 10046 TRACE NAME CONTEXT FOREVER, LEVEL 12
已处理的语句
SQL> oradebug TRACEFILE_NAME
d:\app\administrator\diag\rdbms\xff\xff\trace\xff_ora_1132.trc
--trace信息
PARSE ERROR #3:len=208 dep=1 uid=0 oct=9 lid=0 tim=5838844893 err=942
CREATE UNIQUE INDEX I_OBJ1 ON OBJ$(OBJ#,OWNER#,TYPE#) PCTFREE 10 INITRANS 2 MAXTRANS 255
STORAGE (  INITIAL 64K NEXT 1024K MINEXTENTS 1 MAXEXTENTS 2147483645
 PCTINCREASE 0 OBJNO 36 EXTENTS (FILE 1 BLOCK 336))
*** 2018-11-30 13:18:07.593
dbkedDefDump(): Starting a non-incident diagnostic dump (flags=0x0, level=3, mask=0x0)
----- Error Stack Dump -----
ORA-00704: 引导程序进程失败
ORA-00604: 递归 SQL 级别 1 出现错误
ORA-00942: 表或视图不存在

通过这一步基本上可以判断是由于obj$表丢失导致数据库创建I_OBJ1 index不成功,从而使得数据库无法正常启动。通过一些技巧修复出来obj$表,尝试启动数据库

SQL> alter database open;
alter database open
*
第 1 行出现错误:
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00704: bootstrap process failure
ORA-00600: internal error code, arguments: [4000], [6], [], [], [], [], [], [],
[], [], [], []
进程 ID: 836
会话 ID: 355 序列号: 6027

alert日志报错

Fri Nov 30 15:33:47 2018
SMON: enabling cache recovery
Errors in file d:\app\administrator\diag\rdbms\xff\xff\trace\xff_ora_836.trc  (incident=648694):
ORA-00600: 内部错误代码, 参数: [4000], [6], [], [], [], [], [], [], [], [], [], []
Incident details in: d:\app\administrator\diag\rdbms\xff\xff\incident\incdir_648694\xff_ora_836_i648694.trc
Errors in file d:\app\administrator\diag\rdbms\xff\xff\trace\xff_ora_836.trc:
ORA-00704: 引导程序进程失败
ORA-00600: 内部错误代码, 参数: [4000], [6], [], [], [], [], [], [], [], [], [], []
Errors in file d:\app\administrator\diag\rdbms\xff\xff\trace\xff_ora_836.trc:
ORA-00704: 引导程序进程失败
ORA-00600: 内部错误代码, 参数: [4000], [6], [], [], [], [], [], [], [], [], [], []
Error 704 happened during db open, shutting down database
USER (ospid: 836): terminating the instance due to error 704
Fri Nov 30 15:33:57 2018
Instance terminated by USER, pid = 836
ORA-1092 signalled during: alter database open...
opiodr aborting process unknown ospid (836) as a result of ORA-1092

发现数据库启动报ORA-00704和ORA-00600: internal error code, arguments: [4000], [6], [], [], [], [], [], [],错误,根据以往经验,该问题是由于回滚段异常导致
分析trace信息

Dump continued from file: d:\app\administrator\diag\rdbms\xff\xff\trace\xff_ora_2124.trc
ORA-00600: 内部错误代码, 参数: [4000], [6], [], [], [], [], [], [], [], [], [], []
========= Dump for incident 651102 (ORA 600 [4000]) ========
*** 2018-11-30 15:46:32.125
dbkedDefDump(): Starting incident default dumps (flags=0x2, level=3, mask=0x0)
----- Current SQL Statement for txff session (sql_id=6apq2rjyxmxpj) -----
select line#, sql_text from bootstrap$ where obj# != :1
----- Call Stack Trace -----
calling              call     entry                argument values in hex
location             type     point                (? means dubious value)
-------------------- -------- -------------------- ----------------------------
_skdstdst()+121      CALLrel  _kgdsdst()           E6B1614 2
_ksedst1()+93        CALLrel  _skdstdst()          E6B1614 0 1 436646 435BE2
                                                   436646
_ksedst()+49         CALLrel  _ksedst1()           0 1
_dbkedDefDump()+367  CALLrel  _ksedst()            0
2
_ksedmp()+44         CALLrel  _dbkedDefDump()      3 2
_ksfdmp()+56         CALLrel  _ksedmp()            3EB
_dbgexPhaseII()+164  CALLreg  00000000             C378A28 3EB
_dbgexProcessError(  CALLrel  _dbgexPhaseII()      E2B0454 E2B6E50 E6B6078
)+2061
_dbgeExecuteForErro  CALLrel  _dbgexProcessError(  E2B0454 E2B6E50 1 0 E2B0454
r()+43                        )                    E2B6E50
__VInfreq__dbgePost  CALLrel  _dbgeExecuteForErro  E2B0454 E2B6E50 0 1 0
ErrorKGE()+260                r()
_dbkePostKGE_kgsf()  CALLrel  _dbgePostErrorKGE()  C378A28 E2BD274 258
+56
_kgeade()+299        CALLreg  00000000             C378A28 E2BD274 258
_kgeriv_int()+79     CALLrel  _kgeade()            C378A28 C378B50 E2BD274 258 0
                                                   FA0 0 0 0 1 E6B68A8
_kgeriv()+22         CALLrel  _kgeriv_int()        C378A28 E2BD274 FA0 0 1
                                                   E6B68A8
_kgeasi()+107        CALLrel  _kgeriv()            C378A28 E2BD274 FA0 1 E6B68A8
__VInfreq__ktuGetUs  CALLrel  _kgeasi()            C378A28 E2BD274 FA0 2 1 0 6 0
egDba()+123
_ktrgcm()+5147       CALLrel  _ktuGetUsegDba()     6 E6B6E78 0 0 E6B6F48 0
_ktrget2()+596       CALLrel  _ktrgcm()            F9FCEAC
_kdst_fetch()+816    CALLrel  _ktrget2()           F9FCEAC F9FCE24 303 0
_kdstf11001010000km  CALLrel  _kdst_fetch()        1 F9FCEA8 E6B71C8
()+2806
_kdsttgr()+5944      CALLrel  _kdstf11001010000km  F9FCEA8 0 557B044C F9FCDF8
                              ()                   2F0C3A6 E6B7894
_qertbFetch()+767    CALLrel  _kdsttgr()           F9FCEA8 0 557B044C F9FCDF8
                                                   557B0498 2F0C3A6 E6B7894 1
_opifch2()+2729      CALLptr  00000000             E6E7228 0 0 2 26180001
_opifch()+53         CALLrel  _opifch2()           89 5 E6B7A04
_opiodr()+1248       CALLreg  00000000             5 2 E6B81FC
_rpidrus()+186       CALLrel  _opiodr()            5 2 E6B81FC 2
_rpidru()+90         CALLrel  _rpidrus()           E6B7D58
_rpiswu2()+557       CALLrel  _rpidru()            E6B813C
_rpidrv()+1242       CALLrel  _rpiswu2()           6D9A295C 0 6D9A29A8 2 E6B8184
                                                   0 6D9A2A28 0 0 544632 5448D6
                                                   E6B813C 8
_rpifch()+43         CALLrel  _rpidrv()            2 5 E6B81FC 8
_kqlbebs()+1213      CALLrel  _rpifch()            2 2 2 F013232 FA0 1 0 E6B8634
                                                   0 0 0 0 0
_kqlblfc()+175       CALLrel  _kqlbebs()           0 E6BBBD4
_adbdrv()+16992      CALLrel  _kqlblfc()           0 E6BBBD4
_opiexe()+13594      CALLrel  _adbdrv()            4A 6E6CBC48 6DDEDB6C E6BBD68
                                                   6D60697 6E6CBC48
_opiosq0()+6248      CALLrel  _opiexe()            4 0 E6BC734
_kpooprx()+277       CALLrel  _opiosq0()           3 E E6BC9A0 A4 0
_kpoal8()+632        CALLrel  _kpooprx()           E6BF0A4 E6BD420 1B 1 0 A4
_opiodr()+1248       CALLreg  00000000             5E 1C E6BF0A0
_ttcpip()+1051       CALLreg  00000000             5E 1C E6BF0A0 0
_opitsk()+1404       CALL???  00000000             C3832A8 5E E6BF0A0 0 E6BED30
                                                   E6BF1CC 53E52E 0 E6BF1F8
_opiino()+980        CALLrel  _opitsk()            0 0
_opiodr()+1248       CALLreg  00000000             3C 4 E6BFBF4
_opidrv()+1201       CALLrel  _opiodr()            3C 4 E6BFBF4 0
_sou2o()+55          CALLrel  _opidrv()            3C 4 E6BFBF4
_opimai_real()+124   CALLrel  _sou2o()             E6BFC04 3C 4 E6BFBF4
_opimai()+125        CALLrel  _opimai_real()       2 E6BFC2C
_OracleThreadStart@  CALLrel  _opimai()            2 E6BFF6C 7C9BA7F4 E6BFC34 0
4()+830                                            E6BFD04
7C82484C             CALLreg  00000000             E5BFF9C 0 0 E5BFF9C 0 E6BFFC4
00000000             CALL???  00000000
--------------------- Binary Stack Dump ---------------------
Block header dump:  0x0040020b
 Object id on Block? Y
 seg/obj: 0x3b  csc: 0x00.27508136  itc: 1  flg: O  typ: 1 - DATA
     fsl: 0  fnx: 0x0 ver: 0x01
 Itl           Xid                  Uba         Flag  Lck        Scn/Fsc
0x01   0x0006.015.0005e9e9  0x00c0052c.916f.10  --U-    1  fsc 0x0000.27508263

报错比较明显该block比较异常,通过bbed修改相关block信息,相关处理参考:
重现ORA-600 4000异常
通过bbed解决ORA-00600[4000]案例
记录一次ORA-600 4000数据库故障恢复
ORACLE 8.1.7 数据库ORA-600 4000故障恢复

处理之后数据库正常open

Fri Nov 30 15:57:34 2018
SMON: enabling cache recovery
Successfully onlined Undo Tablespace 2.
Verifying file header compatibility for 11g tablespace encryption..
Verifying 11g file header compatibility for tablespace encryption completed
SMON: enabling tx recovery
Database Characterset is ZHS16GBK
No Resource Manager plan active
replication_dependency_tracking turned off (no async multimaster replication found)
Fri Nov 30 15:57:38 2018
Starting background process QMNC
Fri Nov 30 15:57:38 2018
QMNC started with pid=23, OS id=1152
Completed: alter database open

文件系统损坏导致数据文件异常恢复

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:文件系统损坏导致数据文件异常恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

今天接到一个客户的服务请求,由于服务器被强制重启,数据库无法启动
ORA-1200报错
ora-1200


这是一个常见的ORA-1200错误,但是文件大小相差的有离谱实际大小729600个block,但是现在只有149760个block.不像是一般主机重启导致的数据库异常.通过再次咨询客户实际是什么情况,客户那边如实描述:三个磁盘的raid 5由于两个盘掉线,后来使用最后一个好的盘和最后掉线的盘,强制拉起raid,系统启动通过长时间的自检之后,就是出现这样的情况.通过进一步检查发现,发现大多数数据文件异常.
文件系统中数据文件截图
dbf-fs


通过查询数据库确定文件大小情况
df_header_check


对比文件系统中的文件和数据库查询结果,可以发现绿色框中的文件全部大于文件系统中文件,蓝色框中的文件在文件系统中丢失.对于这样的情况,由于被强制online的磁盘中有坏道,导致文件系统损坏,从而出现大量文件大小异常甚至文件丢失;或者是由于选择错了上线的磁盘出现该问题(通过分析存在的文件,判断确定上线的盘没有问题),那就是文件系统故障导致.
底层重组恢复
解决该问题的方法是通过底层block恢复,重组oracle数据文件,并且检查恢复文件坏块情况.参考:Oracle 数据文件大小为0kb或者文件丢失恢复
scan-disk


分析文件坏块原因
block-had


由于文件所在block被覆盖或者磁盘坏道导致这部分block直接被空块填充.
这个客户比较幸运,直接open数据库成功,坏坏块进行分析判断业务表空间数据文件的坏块全部为index,业务数据0丢失.

restore database误操作恢复

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:restore database误操作恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

接到朋友恢复请求,由于客户的误操作(生产库遇到问题,无法正常open,在没有确认备份和归档日志完全正常的情况下,直接restore database操作),导致原库数据文件被备份中的数据文件还原,而且中间归档丢失,无法一直应用归档把数据文件追到最新.通过Oracle Database Recovery Check脚本check数据库结果如下
datafile-header


这里很明显红色框中数据文件被还原,发现一个可喜的地方,被还原的文件相对都比较小,也就是说相对覆盖的部分可能不会太多.通过alert日志确认数据被rman还原覆盖
restore-datafile


然后继续应用日志恢复
archive-lost


悲剧发生了明显thread 2的sequence 297归档日志发现丢失,进一步发现这个rac(2节点)一共丢失了300个左右的归档日志.到这一步常规方法来说,基本上无解,放弃数据库恢复,因为基于现在的情况客户2014-07-27 07:18:12创建的,目前只是恢复到了2014-10-20 08:26:43日期的数据,而且通过数据库现状判断,数据库主要是写入file 4,也就是users表空间的第一个数据文件中,而该文件已经被还原,强制拉库没有意义.考虑通过bbed修改文件头,跳过相关丢失的归档,继续应用以后的归档,因为跳过的归档太多,基本上不可行(跳过这些丢失的归档之后,后面的归档继续应用会报错).基于这样的情况,抱着一丝丝的希望,rman备份是业务上线之后备份(这样被还原的system里面就有完整的业务数据字典信息).通过对system文件进行分析,确认我们比较幸运,业务数据创建于2014-8月份,而且根据客户的反馈之后基本上没有变更.参考:asm disk header 彻底损坏恢复方法扫描效果比较好,file 4 的觉大部分block均存在
scan-asm


然后通过这些block重组成数据文件然后结合以前的system文件,强制open库,基本上实现客户数据的绝大部分恢复.
export-data


类似案例还有:
asm磁盘组误删除文件恢复
又一例asm格式化文件系统恢复
Oracle 数据文件大小为0kb或者文件丢失恢复
oracle asm disk格式化恢复—格式化为ntfs文件系统
oracle asm disk格式化恢复—格式化为ext4文件系统
alter database create datafile 导致数据文件丢失恢复
分享oracleasm createdisk重新创建asm disk后数据0丢失恢复案例