Oracle Backup and Recovery, page 26

ORA-600 k2vcbk_2 failure recovery

A friend contacted me because their database would not start: startup failed with ORA-600 [k2vcbk_2]. The database is 11.2.0.2 RAC and the operating system is AIX 6.1.

SQL> recover database;
Media recovery complete.
SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [k2vcbk_2], [], [], [], [], [], [],
[], [], [], [], []
Process ID: 7930020
Session ID: 49 Serial number: 14761

Alert log of database node 1

Mon Sep 21 15:45:41 2015
Thread 1 advanced to log sequence 54076 (LGWR switch)
  Current log# 13 seq# 54076 mem# 0: +DG01/xifenfei/onlinelog/group_13.332.779459035
  Current log# 13 seq# 54076 mem# 1: +DG01/xifenfei/onlinelog/group_13.344.779582621
Mon Sep 21 15:45:44 2015
Archived Log entry 74655 added for thread 1 sequence 54075 ID 0x5a0bc0e1 dest 1:
Mon Sep 21 15:56:18 2015
Errors in file /oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_ora_18088342.trc  (incident=184348):
ORA-00600: 内部错误代码, 参数: [kturPOTS_0], [], [], [], [], [], [], [], [], [], [], []
Incident details in: /oracle/diag/rdbms/xifenfei/xifenfei1/incident/incdir_184348/xifenfei1_ora_18088342_i184348.trc
Mon Sep 21 15:56:34 2015
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Error 600 trapped in 2PC on transaction 7.16.120119. Cleaning up.
Error stack returned to user:
ORA-00600: 内部错误代码, 参数: [kturPOTS_0], [], [], [], [], [], [], [], [], [], [], []
Errors in file /oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_ora_18088342.trc  (incident=184349):
ORA-00603: ORACLE 服务器会话因致命错误而终止
ORA-00600: 内部错误代码, 参数: [kturPOTS_0], [], [], [], [], [], [], [], [], [], [], []
Mon Sep 21 15:56:34 2015
Dumping diagnostic data in directory=[cdmp_20150921155634], requested by (instance=1, osid=18088342), summary=[incident=184348].
Incident details in: /oracle/diag/rdbms/xifenfei/xifenfei1/incident/incdir_184349/xifenfei1_ora_18088342_i184349.trc
Mon Sep 21 15:56:35 2015
Sweep [inc][184349]: completed
Sweep [inc][184348]: completed
Sweep [inc2][184348]: completed
opiodr aborting process unknown ospid (18088342) as a result of ORA-603
Mon Sep 21 15:57:12 2015
Errors in file /oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_smon_7536810.trc  (incident=184274):
ORA-00600: internal error code, arguments: [k2vcbk_2], [], [], [], [], [], [], [], [], [], [], []
Incident details in: /oracle/diag/rdbms/xifenfei/xifenfei1/incident/incdir_184274/xifenfei1_smon_7536810_i184274.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Mon Sep 21 15:57:16 2015
Dumping diagnostic data in directory=[cdmp_20150921155716], requested by (instance=1, osid=7536810 (SMON)), summary=[incident=184274].
Fatal internal error happened while SMON was doing active transaction recovery.
Errors in file /oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_smon_7536810.trc:
ORA-00600: internal error code, arguments: [k2vcbk_2], [], [], [], [], [], [], [], [], [], [], []
SMON (ospid: 7536810): terminating the instance due to error 474
Mon Sep 21 15:57:18 2015
ORA-1092 : opitsk aborting process

Alert log of database node 2

Mon Sep 21 15:21:50 2015
Archived Log entry 74653 added for thread 2 sequence 23559 ID 0x5a0bc0e1 dest 1:
Mon Sep 21 15:44:28 2015
Thread 2 advanced to log sequence 23561 (LGWR switch)
  Current log# 12 seq# 23561 mem# 0: +DG01/xifenfei/onlinelog/group_12.338.779457003
  Current log# 12 seq# 23561 mem# 1: +DG01/xifenfei/onlinelog/group_12.265.779582493
Mon Sep 21 15:44:31 2015
Archived Log entry 74654 added for thread 2 sequence 23560 ID 0x5a0bc0e1 dest 1:
Mon Sep 21 15:45:31 2015
DISTRIB TRAN xifenfei.1ebab0a5.20.3.1533822
  is local tran 20.3.1533822 (hex=14.03.17677e)
  insert pending committed tran, scn=14590688068086 (hex=d45.28c781f6)
Mon Sep 21 15:45:31 2015
DISTRIB TRAN xifenfei.1ebab0a5.20.3.1533822
  is local tran 20.3.1533822 (hex=14.03.17677e))
  delete pending committed tran, scn=14590688068086 (hex=d45.28c781f6)
Mon Sep 21 15:56:35 2015
Dumping diagnostic data in directory=[cdmp_20150921155634], requested by (instance=1, osid=18088342), summary=[incident=184348].
Mon Sep 21 15:57:10 2015
Error 3135 trapped in 2PC on transaction 20.11.1534704. Cleaning up.
Error stack returned to user:
ORA-03135: 连接失去联系
opidcl aborting process unknown ospid (9175532) as a result of ORA-604
Mon Sep 21 15:57:17 2015
Dumping diagnostic data in directory=[cdmp_20150921155716], requested by (instance=1, osid=7536810 (SMON)), summary=[incident=184274].
Mon Sep 21 15:57:23 2015
Reconfiguration started (old inc 18, new inc 20)
List of instances:
 2 (myinst: 2)
 Global Resource Directory frozen
 * dead instance detected - domain 0 invalid = TRUE
 Communication channels reestablished
 Master broadcasted resource hash value bitmaps
 Non-local Process blocks cleaned out
Mon Sep 21 15:57:23 2015
 LMS 2: 3 GCS shadows cancelled, 1 closed, 0 Xw survived
Mon Sep 21 15:57:23 2015
 LMS 0: 2 GCS shadows cancelled, 0 closed, 0 Xw survived
Mon Sep 21 15:57:23 2015
 LMS 1: 3 GCS shadows cancelled, 1 closed, 0 Xw survived
 Set master node info
 Submitted all remote-enqueue requests
 Dwn-cvts replayed, VALBLKs dubious
 All grantable enqueues granted
 Post SMON to start 1st pass IR
Mon Sep 21 15:57:23 2015
minact-scn: Inst 2 is now the master inc#:20 mmon proc-id:6816208 status:0x7
minact-scn status: grec-scn:0x0000.00000000 gmin-scn:0x0d45.28c2bb5c gcalc-scn:0x0d45.28c3bd2e
minact-scn: master found reconf/inst-rec before recscn scan old-inc#:20 new-inc#:20
Mon Sep 21 15:57:23 2015
Instance recovery: looking for dead threads
 Submitted all GCS remote-cache requests
 Post SMON to start 1st pass IR
 Fix write in gcs resources
Reconfiguration complete
Beginning instance recovery of 1 threads
 parallel recovery started with 31 processes
Started redo scan
Completed redo scan
 read 12626 KB redo, 1724 data blocks need recovery
Started redo application at
 Thread 1: logseq 54076, block 184416
Recovery of Online Redo Log: Thread 1 Group 13 Seq 54076 Reading mem 0
  Mem# 0: +DG01/xifenfei/onlinelog/group_13.332.779459035
  Mem# 1: +DG01/xifenfei/onlinelog/group_13.344.779582621
Completed redo application of 9.78MB
Completed instance recovery at
 Thread 1: logseq 54076, block 209669, scn 14590688357285
 1633 data blocks read, 1794 data blocks written, 12626 redo k-bytes read
Thread 1 advanced to log sequence 54077 (thread recovery)
Mon Sep 21 15:57:33 2015
Error 3113 trapped in 2PC on transaction 21.18.1965522. Cleaning up.
Redo thread 1 internally disabled at seq 54077 (SMON)
Error stack returned to user:
ORA-02050: 事务处理 21.18.1965522 已回退, 某些远程数据库可能有问题
ORA-03113: 通信通道的文件结尾
ORA-02063: 紧接着 line (起自 ZSK)
Mon Sep 21 15:57:34 2015
Archived Log entry 74656 added for thread 1 sequence 54076 ID 0x5a0bc0e1 dest 1:
Mon Sep 21 15:57:34 2015
ARC0: Archiving disabled thread 1 sequence 54077
Archived Log entry 74657 added for thread 1 sequence 54077 ID 0x5a0bc0e1 dest 1:
Mon Sep 21 15:57:35 2015
Thread 2 advanced to log sequence 23562 (LGWR switch)
  Current log# 8 seq# 23562 mem# 0: +DG01/xifenfei/onlinelog/group_8.334.779456945
  Current log# 8 seq# 23562 mem# 1: +DG01/xifenfei/onlinelog/group_8.267.779582453
Mon Sep 21 15:57:36 2015
Errors in file /oracle/diag/rdbms/xifenfei/xifenfei2/trace/xifenfei2_smon_6750672.trc  (incident=200218):
ORA-00600: internal error code, arguments: [k2vcbk_2], [], [], [], [], [], [], [], [], [], [], []
Incident details in: /oracle/diag/rdbms/xifenfei/xifenfei2/incident/incdir_200218/xifenfei2_smon_6750672_i200218.trc
Archived Log entry 74658 added for thread 2 sequence 23561 ID 0x5a0bc0e1 dest 1:
Mon Sep 21 15:57:38 2015
minact-scn: master continuing after IR
Mon Sep 21 15:57:41 2015
Dumping diagnostic data in directory=[cdmp_20150921155741], requested by (instance=2, osid=6750672 (SMON)), summary=[incident=200218].
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Fatal internal error happened while SMON was doing instance transaction recovery.
Errors in file /oracle/diag/rdbms/xifenfei/xifenfei2/trace/xifenfei2_smon_6750672.trc:
ORA-00600: internal error code, arguments: [k2vcbk_2], [], [], [], [], [], [], [], [], [], [], []
SMON (ospid: 6750672): terminating the instance due to error 474
Mon Sep 21 15:57:41 2015
ORA-1092 : opitsk aborting process
Mon Sep 21 15:57:42 2015
ORA-1092 : opitsk aborting process
Mon Sep 21 15:57:42 2015
License high water mark = 291
Instance terminated by SMON, pid = 6750672
USER (ospid: 18874814): terminating the instance

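The alert logs show roughly what happened. A distributed transaction on node 2 failed. In 11.2.0.2 a distributed transaction can span RAC nodes, so PMON on node 2 tried to clean up the failed transaction, but a bug prevented the cleanup and node 1 crashed. Node 2 then performed recovery for node 1, and SMON's rollback failed on the same distributed-transaction bug, so that instance crashed as well. Because the rollback cannot complete, the database cannot start normally. A search on MOS points to Bug 10222544 ORA-600 [k2vpci_2] from multi-branch distributed transaction.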
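When reading long alert logs like the two above, it can help to first list which ORA-600 arguments occur and how often. The following is a minimal sketch, not part of the original case; the function name `ora600_summary` and the placeholder path are assumptions made for illustration.

```shell
#!/bin/sh
# List the first argument of every ORA-00600 line in an alert log read from
# stdin, most frequent first. It relies only on the "ORA-00600:" prefix and
# the first [...] group, so the wording of the message text does not matter.
ora600_summary() {
    grep -a -o 'ORA-00600:[^[]*\[[^]]*\]' |
        sed 's/.*\[\(.*\)\]$/\1/' |
        sort | uniq -c | sort -rn |
        awk '{ print $1, $2 }'
}

# Hypothetical usage; the path is a placeholder, not taken from the case above:
# ora600_summary < /path/to/alert_xifenfei1.log
```

Each output line is a count followed by the argument name, which makes it easy to see that kturPOTS_0 appears first on node 1 and k2vcbk_2 is what SMON hits on both nodes.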

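For this kind of problem, because the distributed transaction cannot be cleaned up, the fix is to identify the transaction and commit it manually, or to mask the transaction outright.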
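To identify the transaction, the local transaction ID from the alert log usually has to be matched against hex values in dumps. The node 2 log prints both forms on one line: `20.3.1533822 (hex=14.03.17677e)`. The sketch below reproduces that conversion; the function name `xid_to_hex` is hypothetical, and the zero-padding widths are an assumption inferred from that single log line.

```shell
#!/bin/sh
# Convert a local transaction ID "usn.slot.sqn" (decimal) into the hex form
# that the alert log prints next to it.
# Assumption: two-digit padding for usn and slot, inferred from one log line.
xid_to_hex() {
    # Split the three dot-separated decimal fields into $1 $2 $3.
    old_ifs=$IFS
    IFS=.
    set -- $1
    IFS=$old_ifs
    printf '%02x.%02x.%x\n' "$1" "$2" "$3"
}

xid_to_hex 20.3.1533822   # the transaction reported on node 2
```

For the transaction in the node 2 log this prints `14.03.17677e`, matching the hex value Oracle wrote.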

system01.dbf taken offline: recovering from ORA-01245 and ORA-01110

Contact: phone/WeChat (+86 17813235971), QQ (107644445)

A friend contacted me: RECOVER on their database failed with ORA-01245 and ORA-01110 and could not continue, so they asked for help.

SQL> recover database using backup controlfile until cancel;
…………
第 1 行出现错误:
ORA-01245: RESETLOGS 完成时脱机文件 1 将丢失
ORA-01110: 数据文件 1: 'E:\APP\ADMINISTRATOR\ORADATA\HXV10\SYSTEM01.DBF'

Checking the database with Oracle Database Recovery Check showed that datafile 1 was in the OFFLINE state.

Wed Aug 26 23:11:00 2015
alter database datafile 1 offline drop
Completed: alter database datafile 1 offline drop

This alert log entry largely explains the ORA-01245 error: a datafile belonging to the SYSTEM tablespace had been taken offline.

Redo information

Mon Aug 24 22:38:35 2015
alter database clear unarchived logfile group 2
Clearing online log 2 of thread 1 sequence number 5705
Completed: alter database clear unarchived logfile group 2
Wed Aug 26 23:13:23 2015
alter database clear logfile group 3
Clearing online log 3 of thread 1 sequence number 5706
Completed: alter database clear logfile group 3

All redo logs except the current one have been cleared.

Attempting recovery

SQL> alter database datafile 1 online;
数据库已更改。
SQL> recover database;
ORA-00283: 恢复会话因错误而取消
ORA-01610: 使用 BACKUP CONTROLFILE 选项的恢复必须已完成
SQL> recover database using backup controlfile;
ORA-00279: 更改 63960710 (在 08/23/2015 17:01:25 生成) 对于线程 1 是必需的
ORA-00289: 建议:
E:\APP\ADMINISTRATOR\FLASH_RECOVERY_AREA\HXV10\ARCHIVELOG\2015_08_27\O1_MF_1_570
5_%U_.ARC
ORA-00280: 更改 63960710 (用于线程 1) 在序列 #5705 中
指定日志: {<RET>=suggested | filename | AUTO | CANCEL}
E:\APP\ADMINISTRATOR\ORADATA\HXV10\REDO03.LOG
ORA-00310: 归档日志包含序列 5706; 要求序列 5705
ORA-00334: 归档日志: 'E:\APP\ADMINISTRATOR\ORADATA\HXV10\REDO03.LOG'
SQL> recover database using backup controlfile;
ORA-00279: 更改 63960710 (在 08/23/2015 17:01:25 生成) 对于线程 1 是必需的
ORA-00289: 建议:
E:\APP\ADMINISTRATOR\FLASH_RECOVERY_AREA\HXV10\ARCHIVELOG\2015_08_27\O1_MF_1_570
5_%U_.ARC
ORA-00280: 更改 63960710 (用于线程 1) 在序列 #5705 中
指定日志: {<RET>=suggested | filename | AUTO | CANCEL}
E:\APP\ADMINISTRATOR\ORADATA\HXV10\REDO02.LOG
ORA-00339: 归档日志未包含任何重做
ORA-00334: 归档日志: 'E:\APP\ADMINISTRATOR\ORADATA\HXV10\REDO02.LOG'
SQL> recover database using backup controlfile;
ORA-00279: 更改 63960710 (在 08/23/2015 17:01:25 生成) 对于线程 1 是必需的
ORA-00289: 建议:
E:\APP\ADMINISTRATOR\FLASH_RECOVERY_AREA\HXV10\ARCHIVELOG\2015_08_27\O1_MF_1_570
5_%U_.ARC
ORA-00280: 更改 63960710 (用于线程 1) 在序列 #5705 中
指定日志: {<RET>=suggested | filename | AUTO | CANCEL}
E:\APP\ADMINISTRATOR\ORADATA\HXV10\REDO01.LOG
ORA-00310: 归档日志包含序列 5707; 要求序列 5705
ORA-00334: 归档日志: 'E:\APP\ADMINISTRATOR\ORADATA\HXV10\REDO01.LOG'

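Recovery needs the redo of sequence 5705, but that redo has already been cleared, so conventional methods can no longer recover the database. The only option left is to use hidden parameters to bypass the roll-forward (the consistency check).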

Trying to open the database again

ORACLE 例程已经启动。
Total System Global Area  778387456 bytes
Fixed Size                  1374808 bytes
Variable Size             486540712 bytes
Database Buffers          285212672 bytes
Redo Buffers                5259264 bytes
数据库装载完毕。
SQL> recover database using backup controlfile;
ORA-00279: 更改 63960710 (在 08/23/2015 17:01:25 生成) 对于线程 1 是必需的
ORA-00289: 建议:
E:\APP\ADMINISTRATOR\FLASH_RECOVERY_AREA\HXV10\ARCHIVELOG\2015_08_27\O1_MF_1_570
5_%U_.ARC
ORA-00280: 更改 63960710 (用于线程 1) 在序列 #5705 中
指定日志: {<RET>=suggested | filename | AUTO | CANCEL}
cancel
介质恢复已取消。
SQL> alter database open resetlogs;
数据库已更改。

During database recovery, do not take datafiles of the SYSTEM tablespace offline. Doing so leads to ORA-01245 and ORA-01110 during recovery, and the file also ends up in the SYSOFF state.

AIX platform: ORA-01115 ORA-01110 ORA-27067 failure recovery

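I received a recovery request from a friend. The platform is AIX 5.3 with Oracle 10.2.0.1; database startup fails with ORA-01115, ORA-01110 and ORA-27067, and the database cannot be opened. Analysis showed the cause to be a 10.2.0.1 bug on AIX. A workaround avoided the bug and resolved the problem with zero data loss.
Alert log showing the errors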

Mon Aug 10 13:25:22 2015
ALTER DATABASE   MOUNT
Mon Aug 10 13:25:29 2015
Setting recovery target incarnation to 1
Mon Aug 10 13:25:29 2015
Successful mount of redo thread 1, with mount id 432339141
Mon Aug 10 13:25:29 2015
Database mounted in Exclusive Mode
Completed: ALTER DATABASE   MOUNT
Mon Aug 10 13:25:36 2015
alter database open
Mon Aug 10 13:25:36 2015
Beginning crash recovery of 1 threads
 parallel recovery started with 15 processes
Mon Aug 10 13:25:37 2015
Started redo scan
Mon Aug 10 13:25:52 2015
Completed redo scan
 7889582 redo blocks read, 75305 data blocks need recovery
Mon Aug 10 13:25:53 2015
Errors in file /dc/admin/datacent/bdump/datacent_p002_144124.trc:
ORA-01115: IO error reading block from file 2 (block # 40704)
ORA-01110: data file 2: '/dc/oradata/datacent/undotbs01.dbf'
ORA-27067: size of I/O buffer is invalid
Additional information: 2
Additional information: 1572864
Mon Aug 10 13:25:53 2015
Aborting crash recovery due to slave death, attempting serial crash recovery
Mon Aug 10 13:25:53 2015
Beginning crash recovery of 1 threads
Mon Aug 10 13:25:53 2015
Started redo scan
Mon Aug 10 13:26:09 2015
Completed redo scan
 7889582 redo blocks read, 75305 data blocks need recovery
Mon Aug 10 13:26:12 2015
Aborting crash recovery due to error 1115
Mon Aug 10 13:26:12 2015
Errors in file /dc/admin/datacent/udump/datacent_ora_123384.trc:
ORA-01115: IO error reading block from file 2 (block # 39077)
ORA-01110: data file 2: '/dc/oradata/datacent/undotbs01.dbf'
ORA-27067: size of I/O buffer is invalid
Additional information: 2
Additional information: 1310720
ORA-1115 signalled during: alter database open...

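The first two errors, ORA-01115 and ORA-01110, are very familiar: they typically appear when the database hits a corrupt block or an I/O error at startup. ORA-27067 is much less common. According to MOS, many occurrences of it are caused by bugs during RMAN backups.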

Checking the undo datafile with dbv

DBVERIFY: Release 10.2.0.1.0 - Production on Mon Aug 10 23:18:15 2015
Copyright (c) 1982, 2003, Oracle and/or its affiliates.  All rights reserved.
DBVERIFY - Verification starting : FILE = /dc/oradata/datacent/undotbs01.dbf
DBVERIFY - Verification complete
Total Pages Examined         : 329600
Total Pages Processed (Data) : 0
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 0
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 327504
Total Pages Processed (Seg)  : 17
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 2096
Total Pages Marked Corrupt   : 0
Total Pages Influx           : 0
Total Pages Encrypted        : 0
Highest block SCN            : 1887888 (0.1887888)

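The dbv output shows that the undo file itself has no logical or physical corruption, so the failure is most likely caused by ORA-27067: size of I/O buffer is invalid. Based on the explanation in the official document "ORA-01115 ORA-27067 DURING PARALLEL INSTANCE RECOVERY AFTER INSTANCE CRASH", we can be fairly sure of the cause: on 10.2.0.1 with a JFS2 file system on AIX, the database was aborted suddenly (or possibly lost power) while running a large number of transactions, so it had to perform instance recovery at startup, and an internal bug made that instance recovery fail. After our workaround, the database started cleanly with zero data loss.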

Database startup log

Mon Aug 10 16:34:14 2015
alter database open
Mon Aug 10 16:34:14 2015
Beginning crash recovery of 1 threads
 parallel recovery started with 15 processes
Mon Aug 10 16:34:14 2015
Started redo scan
Mon Aug 10 16:34:27 2015
Completed redo scan
 7889582 redo blocks read, 0 data blocks need recovery
Mon Aug 10 16:34:27 2015
Started redo application at
 Thread 1: logseq 664704, block 1286922
Mon Aug 10 16:34:27 2015
Recovery of Online Redo Log: Thread 1 Group 4 Seq 664704 Reading mem 0
  Mem# 0 errs 0: /dev/rredo04
Mon Aug 10 16:34:32 2015
Recovery of Online Redo Log: Thread 1 Group 5 Seq 664705 Reading mem 0
  Mem# 0 errs 0: /dev/rredo05
Mon Aug 10 16:34:38 2015
Recovery of Online Redo Log: Thread 1 Group 6 Seq 664706 Reading mem 0
  Mem# 0 errs 0: /dev/rredo06
Mon Aug 10 16:34:40 2015
Completed redo application
Mon Aug 10 16:34:40 2015
Completed crash recovery at
 Thread 1: logseq 664706, block 1017805, scn 8554793334
 0 data blocks read, 0 data blocks written, 7889582 redo blocks read
Mon Aug 10 16:34:40 2015
Thread 1 advanced to log sequence 664707
Thread 1 opened at log sequence 664707
  Current log# 1 seq# 664707 mem# 0: /dev/rredo01
Successful open of redo thread 1
Mon Aug 10 16:34:40 2015
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Mon Aug 10 16:34:40 2015
SMON: enabling cache recovery
Mon Aug 10 16:34:40 2015
Successfully onlined Undo Tablespace 1.
Mon Aug 10 16:34:40 2015
SMON: enabling tx recovery
Mon Aug 10 16:34:41 2015
Database Characterset is ZHS32GB18030
replication_dependency_tracking turned off (no async multimaster replication found)
WARNING: AQ_TM_PROCESSES is set to 0. System operation might be adversely affected.
Mon Aug 10 16:34:41 2015
SMON: Parallel transaction recovery tried
Mon Aug 10 16:34:42 2015
db_recovery_file_dest_size of 2048 MB is 0.00% used. This is a
user-specified limit on the amount of space that will be used by this
database for recovery-related files, and does not reflect the amount of
space available in the underlying filesystem or ASM diskgroup.
Mon Aug 10 16:34:42 2015
Completed: alter database open

ORA-600 kccpb_sanity_check_2 failure recovery

Today someone contacted me on Taobao Wangwang asking for Oracle database recovery support.

After logging in remotely, I found that mounting the database failed with ORA-600 [kccpb_sanity_check_2].

C:\Documents and Settings\Administrator>sqlplus / as sysdba
SQL*Plus: Release 10.2.0.3.0 - Production on Wed Jul 29 16:23:18 2015
Copyright (c) 1982, 2006, Oracle.  All Rights Reserved.
Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - Production
With the Partitioning, OLAP and Data Mining options
SQL> alter database mount;
alter database mount
*
ERROR at line 1:
ORA-00600: internal error code, arguments: [kccpb_sanity_check_2], [14169],
[14160], [0x0], [], [], [], []

Attempting to recreate the control file

SQL> shutdown immediate;
ORA-01507: database not mounted
ORACLE instance shut down.
SQL> startup pfile='D:\database\m104\pfile\init.ora' nomount
ORACLE instance started.
Total System Global Area  444596224 bytes
Fixed Size                  1291072 bytes
Variable Size             155192512 bytes
Database Buffers          281018368 bytes
Redo Buffers                7094272 bytes
SQL> SHOW PARAMETER CONT;
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
control_file_record_keep_time        integer     7
control_files                        string      D:\DATABASE\M104\CTRL\CONTROL0
                                                 2.CTL
global_context_pool_size             string
SQL> ALTER DATABASE MOUNT;
ALTER DATABASE MOUNT
*
ERROR at line 1:
ORA-00600: internal error code, arguments: [kccpb_sanity_check_2], [14169],
[14160], [0x0], [], [], [], []
SQL>
SQL> CREATE CONTROLFILE REUSE DATABASE "m104_db" NORESETLOGS  FORCE LOGGING NOAR
CHIVELOG
  2      MAXLOGFILES 16
  3      MAXLOGMEMBERS 3
  4      MAXDATAFILES 100
  5      MAXINSTANCES 8
  6      MAXLOGHISTORY 2921
  7  LOGFILE
  8    GROUP 1 'D:\database\m104\log\redo01.log'  SIZE 51200K,
  9    GROUP 2 'D:\database\m104\log\redo02.log'  SIZE 51200K,
 10    GROUP 3 'D:\database\m104\log\redo03.log'  SIZE 51200K
 11  DATAFILE
 12    'd:\database\m104\data\system01.dbf',
 13    'd:\database\m104\data\sysaux01.dbf',
 14    'd:\database\m104\data\USERS01.DBF',
 15    'd:\database\m104\data\UNDOTBS01.DBF',
 16    'd:\database\m104\data\INDX01.DBF'
 17  CHARACTER SET WE8ISO8859P1
 18  ;
CREATE CONTROLFILE REUSE DATABASE "m104_db" NORESETLOGS  FORCE LOGGING NOARCHIVE
LOG
*
ERROR at line 1:
ORA-01503: CREATE CONTROLFILE failed
ORA-00600: internal error code, arguments: [kccsga_update_ckpt_4], [1], [8],
[], [], [], [], []
SQL>
SQL> CREATE CONTROLFILE REUSE DATABASE "m104_db" RESETLOGS  FORCE LOGGING NOARCH
IVELOG
  2      MAXLOGFILES 16
  3      MAXLOGMEMBERS 3
  4      MAXDATAFILES 100
  5      MAXINSTANCES 8
  6      MAXLOGHISTORY 2921
  7  LOGFILE
  8    GROUP 1 'D:\database\m104\log\redo01.log'  SIZE 51200K,
  9    GROUP 2 'D:\database\m104\log\redo02.log'  SIZE 51200K,
 10    GROUP 3 'D:\database\m104\log\redo03.log'  SIZE 51200K
 11  DATAFILE
 12    'd:\database\m104\data\system01.dbf',
 13    'd:\database\m104\data\sysaux01.dbf',
 14    'd:\database\m104\data\USERS01.DBF',
 15    'd:\database\m104\data\UNDOTBS01.DBF',
 16    'd:\database\m104\data\INDX01.DBF'
 17  CHARACTER SET WE8ISO8859P1
 18  ;
CREATE CONTROLFILE REUSE DATABASE "m104_db" RESETLOGS  FORCE LOGGING NOARCHIVELO
G
*
ERROR at line 1:
ORA-01503: CREATE CONTROLFILE failed
ORA-00600: internal error code, arguments: [kccsga_update_ckpt_4], [1], [8],
[], [], [], [], []

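With either NORESETLOGS or RESETLOGS, recreating the control file fails with ORA-600 [kccsga_update_ckpt_4]. This is odd. With no explanation at hand, I tried recreating it under a new control file name.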

Changing the control file path

SQL> SHUTDOWN ABORT
ORACLE instance shut down.
SQL> startup pfile='D:\database\m104\pfile\init.ora' nomount
ORACLE instance started.
Total System Global Area  444596224 bytes
Fixed Size                  1291072 bytes
Variable Size             155192512 bytes
Database Buffers          281018368 bytes
Redo Buffers                7094272 bytes
SQL> SHOW PARAMETER CONT;
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
control_file_record_keep_time        integer     7
control_files                        string      D:\DATABASE\M104\CTRL\CONTROL0
                                                 4.CTL
global_context_pool_size             string
SQL> CREATE CONTROLFILE REUSE DATABASE "m104_db" RESETLOGS  FORCE LOGGING NOARCH
IVELOG
  2      MAXLOGFILES 16
  3      MAXLOGMEMBERS 3
  4      MAXDATAFILES 100
  5      MAXINSTANCES 8
  6      MAXLOGHISTORY 2921
  7  LOGFILE
  8    GROUP 1 'D:\database\m104\log\redo01.log'  SIZE 51200K,
  9    GROUP 2 'D:\database\m104\log\redo02.log'  SIZE 51200K,
 10    GROUP 3 'D:\database\m104\log\redo03.log'  SIZE 51200K
 11  DATAFILE
 12    'd:\database\m104\data\system01.dbf',
 13    'd:\database\m104\data\sysaux01.dbf',
 14    'd:\database\m104\data\USERS01.DBF',
 15    'd:\database\m104\data\UNDOTBS01.DBF',
 16    'd:\database\m104\data\INDX01.DBF'
 17  CHARACTER SET WE8ISO8859P1
 18  ;
Control file created.

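With the new control file location, recreating the control file finally succeeded.
Recovery was then attempted by specifying the redo logs, and the database opened normally.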

SQL> RECOVER DATABASE USING BACKUP CONTROLFILE UNTIL CANCEL;
ORA-00279: change 3643108240801 generated at 07/26/2015 20:15:22 needed for
thread 1
ORA-00289: suggestion :
D:\ORACLE\PRODUCT\10.2.0\DB_1\RDBMS\ARC00567_0866390669.001
ORA-00280: change 3643108240801 for thread 1 is in sequence #567
Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
D:\database\m104\log\redo01.log
ORA-00310: archived log contains sequence 566; sequence 567 required
ORA-00334: archived log: 'D:\DATABASE\M104\LOG\REDO01.LOG'
ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1: 'D:\DATABASE\M104\DATA\SYSTEM01.DBF'
SQL> RECOVER DATABASE USING BACKUP CONTROLFILE UNTIL CANCEL;
ORA-00279: change 3643108240801 generated at 07/26/2015 20:15:22 needed for
thread 1
ORA-00289: suggestion :
D:\ORACLE\PRODUCT\10.2.0\DB_1\RDBMS\ARC00567_0866390669.001
ORA-00280: change 3643108240801 for thread 1 is in sequence #567
Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
D:\database\m104\log\redo02.log
Log applied.
Media recovery complete.
SQL> ALTER DATABASE OPEN RESETLOGS;
Database altered.

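Database recovery is complete. The recovery as a whole was fairly simple, but note that ORA-600 [kccsga_update_ckpt_4] was avoided here by changing the control file path; the root cause still needs to be investigated.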

Background note: ORA-600 [kccpb_sanity_check_2] [a] [b] {c}

VERSIONS:
  Versions 10.2 to 11.2
DESCRIPTION:
  This internal error is raised when the sequence number (seq#) of the
  current block of the controlfile is greater than the seq# in the controlfile header.
  The header value should always be equal to, or greater than the value
  held in the control file block(s).
  This extra check was introduced in Oracle 10gR2 to detect lost writes
  or stale reads to the header.
ARGUMENTS:
  Arg [a] seq# in control block header.
  Arg [b] seq# in the control file header.
  Arg {c}
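Applying this description to the error in this case, `[kccpb_sanity_check_2], [14169], [14160]`, gives a block seq# of 14169 against a header seq# of 14160. The small sketch below spells out that comparison; the function name `check_kccpb_args` is hypothetical and the script only restates the rule quoted above.

```shell
#!/bin/sh
# Compare the two sequence numbers reported by ORA-600 [kccpb_sanity_check_2].
#   $1 = Arg [a]: seq# in the control block header
#   $2 = Arg [b]: seq# in the control file header
# Per the description above, the header seq# should never be lower.
check_kccpb_args() {
    if [ "$1" -gt "$2" ]; then
        echo "inconsistent: block seq# $1 > header seq# $2"
    else
        echo "consistent"
    fi
}

check_kccpb_args 14169 14160   # values from the error in this case
```

The block is ahead of the header by 9, which fits the lost-write or stale-read scenario that the check was introduced to detect.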

Preserving the current state with a backup before Oracle failure recovery: ASM environment

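The previous post (Preserving the current state with a backup before Oracle failure recovery: FileSystem environment) described the backup method for databases on a file system. For ASM databases the approach changes, because the headers of datafiles stored in ASM cannot be copied directly with dd. Suppose the ASM disk group is mounted but the database cannot be opened, and backing up all datafiles with RMAN or the ASM cp command would take too long or there is not enough space. In that situation, consider backing up the control files and the SYSTEM tablespace datafiles with RMAN or cp, backing up the redo logs with cp, and backing up the file headers with dd. Together these give a usable pre-recovery backup for a database on ASM.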

Control file backup
On 11.2 and later, use the asmcmd cp command

select 'asmcmd cp '||name||' &&backup_dir/' from v$datafile where ts#=0
union all
select 'asmcmd cp '||name||' &&backup_dir/controlfile_'||rownum||'.ctl' from v$controlfile
union all
select 'asmcmd cp '||member||' &&backup_dir/'||thread#||'_'||a.group#||'_'||sequence#||'_'||substr(member,
instr(member,'/',-1)+1)  FROM v$log a, v$logfile b WHERE a.group# = B.GROUP#;

On other versions, use RMAN

-- back up the control file with RMAN (adjust the /tmp directory as needed)
copy current controlfile to '/tmp/ctl.ctl';
-- back up the SYSTEM tablespace with RMAN
select 'copy datafile '||file#||' to ''&backup_dir/system_'||file#||'.dbf'';'
from v$datafile where ts#=0;
-- redo cannot be backed up directly

Backing up file headers

[grid@xifenfei ~]$ ss
SQL*Plus: Release 11.2.0.4.0 Production on Fri May 1 04:15:18 2015
Copyright (c) 1982, 2013, Oracle.  All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Automatic Storage Management option
SQL> set lines 150
SQL> select 'dd if='||c.PATH_KFDSK||' of=&&backup_path/'||a.GROUP_KFFXP||'_'||a.disk_kffxp||'_'||
  2  b.NUMBER_KFFIL||'.asm count=1 bs='|| d.AUSIZE_KFGRP||' skip='||a.au_kffxp backup_dd_cmd
  3   FROM x$kffxp a, X$KFFIL  b,X$KFDSK c,X$KFGRP d  WHERE
  4  a.GROUP_KFFXP=b.GROUP_KFFIL
  5  and a.NUMBER_KFFXP=b.NUMBER_KFFIL
  6  and b.FTYPE_KFFIL in(2,12)
  7  and b.NUMBER_KFFIL>255
  8  and a.xnum_kffxp=0
  9  and a.GROUP_KFFXP=c.GRPNUM_KFDSK
 10  and a.disk_kffxp=c.NUMBER_KFDSK
 11  and a.GROUP_KFFXP=d.NUMBER_KFGRP;
Enter value for backup_path: /tmp
old   1: select 'dd if='||c.PATH_KFDSK||' of=&&backup_path/'||a.GROUP_KFFXP||'_'||a.disk_kffxp||'_'||
new   1: select 'dd if='||c.PATH_KFDSK||' of=/tmp/'||a.GROUP_KFFXP||'_'||a.disk_kffxp||'_'||
BACKUP_DD_CMD
------------------------------------------------------------------------------------------------------------------
dd if=/dev/asm-disk1 of=/tmp/1_0_256.asm count=1 bs=1048576 skip=29
dd if=/dev/asm-disk2 of=/tmp/1_1_257.asm count=1 bs=1048576 skip=404
dd if=/dev/asm-disk2 of=/tmp/1_1_258.asm count=1 bs=1048576 skip=641
dd if=/dev/asm-disk1 of=/tmp/1_0_259.asm count=1 bs=1048576 skip=648
dd if=/dev/asm-disk3 of=/tmp/2_0_256.asm count=1 bs=1048576 skip=51

Restoring file headers

SQL> set lines 150
SQL> select 'dd of='||c.PATH_KFDSK||' if=&&backup_path/'||a.GROUP_KFFXP||'_'||a.disk_kffxp||
  2  '_'||b.NUMBER_KFFIL||'.asm count=1 conv=notrunc bs='|| d.AUSIZE_KFGRP||' seek='||a.au_kffxp restore_dd_cmd
  3   FROM x$kffxp a, X$KFFIL  b,X$KFDSK c,X$KFGRP d  WHERE
  4  a.GROUP_KFFXP=b.GROUP_KFFIL
  5  and a.NUMBER_KFFXP=b.NUMBER_KFFIL
  6  and b.FTYPE_KFFIL in(2,12)
  7  and b.NUMBER_KFFIL>255
  8  and a.xnum_kffxp=0
  9  and a.GROUP_KFFXP=c.GRPNUM_KFDSK
 10  and a.disk_kffxp=c.NUMBER_KFDSK
 11  and a.GROUP_KFFXP=d.NUMBER_KFGRP;
old   1: select 'dd of='||c.PATH_KFDSK||' if=&&backup_path/'||a.GROUP_KFFXP||'_'||a.disk_kffxp||
new   1: select 'dd of='||c.PATH_KFDSK||' if=/tmp/'||a.GROUP_KFFXP||'_'||a.disk_kffxp||
RESTORE_DD_CMD
-----------------------------------------------------------------------------------------------------------------
dd of=/dev/asm-disk1 if=/tmp/1_0_256.asm count=1 conv=notrunc bs=1048576 seek=29
dd of=/dev/asm-disk2 if=/tmp/1_1_257.asm count=1 conv=notrunc bs=1048576 seek=404
dd of=/dev/asm-disk2 if=/tmp/1_1_258.asm count=1 conv=notrunc bs=1048576 seek=641
dd of=/dev/asm-disk1 if=/tmp/1_0_259.asm count=1 conv=notrunc bs=1048576 seek=648
dd of=/dev/asm-disk3 if=/tmp/2_0_256.asm count=1 conv=notrunc bs=1048576 seek=51
SQL>
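The generated backup and restore commands are mirror images: the backup reads one allocation unit with `skip=`, and the restore writes it back at the same offset with `seek=` and `conv=notrunc`. Before touching a real ASM disk, the round trip can be rehearsed on a scratch file. The sketch below does that; the small AU size, the AU number, and all file names are made-up values for illustration only.

```shell
#!/bin/sh
# Rehearse the dd skip=/seek= round trip on a scratch file, not an ASM disk.
au_size=4096   # illustrative AU size; the listings above use 1048576
au_no=3        # illustrative AU number
work=$(mktemp -d)
disk="$work/fake-disk"
saved="$work/saved.au"

# Build an 8-AU "disk" of random data and remember its checksum.
dd if=/dev/urandom of="$disk" bs="$au_size" count=8 2>/dev/null
before=$(cksum < "$disk")

# Backup: copy exactly one AU, starting at AU number au_no.
dd if="$disk" of="$saved" bs="$au_size" count=1 skip="$au_no" 2>/dev/null

# Simulate damage to that AU; conv=notrunc keeps the rest of the file intact.
dd if=/dev/zero of="$disk" bs="$au_size" count=1 seek="$au_no" conv=notrunc 2>/dev/null
damaged=$(cksum < "$disk")

# Restore: write the saved AU back at the same offset.
dd if="$saved" of="$disk" bs="$au_size" count=1 seek="$au_no" conv=notrunc 2>/dev/null
after=$(cksum < "$disk")

[ "$before" = "$after" ] && echo "restored image matches original"
rm -rf "$work"
```

Leaving out `conv=notrunc` on a regular file would truncate it after the written block, which is why the generated restore commands include it.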

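File header backup and restore test: the test below shows that backing up file headers this way works
Shut down the database, then back up the file header with dd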

[oracle@xifenfei ~]$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.4.0 Production on Fri May 1 04:21:49 2015
Copyright (c) 1982, 2013, Oracle.  All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Automatic Storage Management, OLAP, Data Mining
and Real Application Testing options
SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.

dul shows that the current dbname is XIFENFEI

[oracle@xifenfei dul]$ ./dul
Data UnLoader: 10.2.0.6.5 - Internal Only - on Fri May  1 04:37:43 2015
with 64-bit io functions
Copyright (c) 1994 2015 Bernard van Duijnen All rights reserved.
 Strictly Oracle Internal Use Only
Disk group DATA, dul group_cid 0
Discovered disk /dev/asm-disk1 as diskgroup DATA, disk number 0 size 3922 Mb File1 starts at 2, dul_disk_cid 0
Discovered disk /dev/asm-disk2 as diskgroup DATA, disk number 1 size 3922 Mb without File1 meta data, dul_disk_cid 1
Disk group XIFENFEI, dul group_cid 1
Discovered disk /dev/asm-disk3 as diskgroup XIFENFEI, disk number 0 size 4439 Mb File1 starts at 2, dul_disk_cid 2
DUL: Warning: Dictionary cache DC_ASM_EXTENTS is empty
Probing for attributes in File9, the attribute directory, for disk group DATA
attribute name "_extent_sizes", value "1 4 16"
attribute name "_extent_counts", value "20000 20000 2147483647"
Oracle data file size 775954432 bytes, block size 8192
Found db_id = 1495013434
Found db_name = XIFENFEI   <-----db name
DUL: Error: Filedir block not allocated, file does not exist
DUL: Error: Could not load asm meta data for group XIFENFEI file 9
Probing for filenames in File6, the alias directory, for disk group XIFENFEI
+XIFENFEI/XIFENFEI/DATAFILE/XIFENFEI.256.878397315
Probing for database datafiles in File1, the file directory,  for disk group XIFENFEI
File 256 datafile size 104865792, block size 8192
Disk group XIFENFEI has one file of type datafile

Backing up the file 1 header with dd

[oracle@xifenfei tmp]$ dd if=/dev/asm-disk1 of=/tmp/1_0_256.asm count=1 bs=1048576 skip=29
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.0168209 seconds, 62.3 MB/s

Trying to change dbname from XIFENFEI to ORCL

SQL> select dump('XIFENFEI',16) from dual;
DUMP('XIFENFEI',16)
-------------------------------------
Typ=96 Len=8: 58,49,46,45,4e,46,45,49
SQL> SELECT DUMP('ORCL',16) FROM DUAL;
DUMP('ORCL',16)
-------------------------
Typ=96 Len=4: 4f,52,43,4c
SQL>
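The same byte values can be obtained without a running instance, which is handy when the database is down. This is a small sketch using standard tools; the function name `hex_bytes` is an assumption for illustration.

```shell
#!/bin/sh
# Print the bytes of a string as contiguous lowercase hex, the same values
# that DUMP(..., 16) shows separated by commas.
hex_bytes() {
    printf '%s' "$1" | od -An -tx1 | tr -d ' \n'
    echo
}

hex_bytes XIFENFEI   # 584946454e464549
hex_bytes ORCL       # 4f52434c
```

The first value is the byte run visible at offset 32 in the bbed dump. ORCL is four bytes shorter than the eight-byte name it replaces; how the remaining bytes are filled is not shown in the session, so check that before editing a real block.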

Using bbed to change XIFENFEI to ORCL

[oracle@xifenfei tmp]$ bbed filename='/tmp/1_0_256.asm' mode=edit
Password:
BBED: Release 2.0.0.0.0 - Limited Production on Fri May 1 04:24:06 2015
Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
************* !!! For Oracle Internal Use only !!! ***************
BBED> set blocksize 8192
        BLOCKSIZE       8192
BBED> set block 1
        BLOCK#          1
BBED> map
 File: /tmp/1_0_256.asm (0)
 Block: 1                                     Dba:0x00000000
------------------------------------------------------------
 Data File Header
 struct kcvfh, 860 bytes                    @0
 ub4 tailchk                                @8188
BBED> p kcvfhhdr
struct kcvfhhdr, 76 bytes                   @20
   ub4 kccfhswv                             @20       0x00000000
   ub4 kccfhcvn                             @24       0x0b200400
   ub4 kccfhdbi                             @28       0x591c183a
   text kccfhdbn[0]                         @32      X
   text kccfhdbn[1]                         @33      I
   text kccfhdbn[2]                         @34      F
   text kccfhdbn[3]                         @35      E
   text kccfhdbn[4]                         @36      N
   text kccfhdbn[5]                         @37      F
   text kccfhdbn[6]                         @38      E
   text kccfhdbn[7]                         @39      I
BBED> d seek 32
 File: /tmp/1_0_256.asm (0)
 Block: 1                seeks:   32 to   63           Dba:0x00000000
------------------------------------------------------------------------
 58494645 4e464549 12040000 00720100 00200000 01000300 00000000 00000000
 <32 bytes per line>

Write the modified block back into ASM with dd

[oracle@xifenfei dul]$ dd of=/dev/asm-disk1 if=/tmp/1_0_256.asm count=1 conv=notrunc bs=1048576 seek=29
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.00253244 seconds, 414 MB/s
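
The dd pair used here (skip=29 to read, seek=29 with conv=notrunc to write back) simply reads and overwrites one 1 MiB unit at a fixed offset without changing the size of the target. A minimal Python sketch of the same idea, run against a small scratch file rather than an ASM disk; the function names and the 16-byte unit in the demo are assumptions for illustration only:

```python
import os
import tempfile

AU_SIZE = 1048576  # 1 MiB, the bs= value used with dd above


def read_unit(path, index, size=AU_SIZE):
    """Read one unit at offset index*size (like dd skip=index bs=size count=1)."""
    with open(path, "rb") as f:
        f.seek(index * size)
        return f.read(size)


def write_unit(path, index, data, size=AU_SIZE):
    """Overwrite one unit in place (like dd seek=index bs=size conv=notrunc)."""
    with open(path, "r+b") as f:  # r+b opens for update and never truncates
        f.seek(index * size)
        f.write(data)


def demo():
    """Round-trip on a 64-byte scratch file, using 16-byte units instead of 1 MiB."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "wb") as f:
            f.write(bytes(64))
        size_before = os.path.getsize(path)
        unit = bytearray(read_unit(path, 2, size=16))
        unit[0:4] = b"ORCL"  # edit the copy, as bbed does on /tmp/1_0_256.asm
        write_unit(path, 2, bytes(unit), size=16)
        size_after = os.path.getsize(path)
        return size_before, size_after, read_unit(path, 2, size=16)[:4]
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(demo())
```

The point of the sketch is the write path: opening for update and seeking is what conv=notrunc guarantees with dd, so everything outside the rewritten unit stays untouched.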

DUL confirms that the dbname was changed to ORCL

[oracle@xifenfei dul]$ ./dul
Data UnLoader: 10.2.0.6.5 - Internal Only - on Fri May  1 04:41:33 2015
with 64-bit io functions
Copyright (c) 1994 2015 Bernard van Duijnen All rights reserved.
 Strictly Oracle Internal Use Only
Disk group DATA, dul group_cid 0
Discovered disk /dev/asm-disk1 as diskgroup DATA, disk number 0 size 3922 Mb File1 starts at 2, dul_disk_cid 0
Discovered disk /dev/asm-disk2 as diskgroup DATA, disk number 1 size 3922 Mb without File1 meta data, dul_disk_cid 1
Disk group XIFENFEI, dul group_cid 1
Discovered disk /dev/asm-disk3 as diskgroup XIFENFEI, disk number 0 size 4439 Mb File1 starts at 2, dul_disk_cid 2
DUL: Warning: Dictionary cache DC_ASM_EXTENTS is empty
Probing for attributes in File9, the attribute directory, for disk group DATA
attribute name "_extent_sizes", value "1 4 16"
attribute name "_extent_counts", value "20000 20000 2147483647"
Oracle data file size 775954432 bytes, block size 8192
Found db_id = 1495013434
Found db_name = ORCL   <---- dbname after the change
DUL: Error: Filedir block not allocated, file does not exist
DUL: Error: Could not load asm meta data for group XIFENFEI file 9
Probing for filenames in File6, the alias directory, for disk group XIFENFEI
+XIFENFEI/XIFENFEI/DATAFILE/XIFENFEI.256.878397315
Probing for database datafiles in File1, the file directory,  for disk group XIFENFEI
File 256 datafile size 104865792, block size 8192
Disk group XIFENFEI has one file of type datafile
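
The values in this walkthrough can be cross-checked outside the tools. The sketch below is only an illustration (the helper names are made up): it reproduces the DUMP(...,16) output for both names, decodes the eight bytes bbed shows at offset 32, and converts the kccfhdbi value 0x591c183a to the decimal db_id that DUL reports.

```python
def name_to_hex(name):
    """Hex bytes of an ASCII name, formatted like Oracle's DUMP(..., 16)."""
    return name.encode("ascii").hex(",")


def hex_words_to_text(words):
    """Turn a bbed-style dump fragment (space-separated hex words) into ASCII text."""
    return bytes.fromhex(words.replace(" ", "")).decode("ascii")


if __name__ == "__main__":
    print(name_to_hex("XIFENFEI"))                 # 58,49,46,45,4e,46,45,49
    print(name_to_hex("ORCL"))                     # 4f,52,43,4c
    print(hex_words_to_text("58494645 4e464549"))  # XIFENFEI
    print(int("591c183a", 16))                     # 1495013434
```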

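Backing up the ASM disk headers when ASM cannot be mounted
Backing up an ASM disk mainly means backing up the first 100 MB of each disk, which can be done directly with dd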

set lines 150
set pages 1000
select 'dd if='||path||' of=&asmbackup_dir/'||group_number||'_'||disk_number||'.asm bs=1048576 count=100' from v$asm_disk;

set lines 150
set pages 1000
select 'dd of='||path||' if=&asmbackup_dir/'||group_number||'_'||disk_number||'.asm bs=1048576 count=100 conv=notrunc' from v$asm_disk;

With asmlib, remember to replace the ORCL: prefix with the corresponding path under /dev/oracleasm/disks/.
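
For asmlib disks the generated commands carry paths such as ORCL:NAME, which dd cannot open. A small sketch of the rewrite, assuming the disks are exposed under /dev/oracleasm/disks/; the function name and the disk name DATA1 in the usage line are hypothetical:

```python
import re


def asmlib_to_device_path(command, disks_dir="/dev/oracleasm/disks/"):
    """Replace an 'ORCL:' prefix in the if=/of= argument of a dd command."""
    return re.sub(
        r"\b(if|of)=ORCL:",
        lambda m: m.group(1) + "=" + disks_dir,
        command,
    )


if __name__ == "__main__":
    print(asmlib_to_device_path(
        "dd if=ORCL:DATA1 of=/backup/1_0.asm bs=1048576 count=100"))
```

Only the if= and of= arguments are touched, so a backup directory that happens to contain the text ORCL: is left alone.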

Recommendations for backing up and preserving the scene before an Oracle abnormal recovery: FileSystem environment

Contact: mobile/WeChat (+86 17813235971) QQ (107644445)

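Whether at conferences or when friends and readers ask me privately about Oracle database recovery, I always stress the same point: if you are not fully confident, back up the current state first so that you do not damage it a second time. You may fail to recover the database, but you must never damage it further and make a later recovery attempt harder. This post offers some guidelines and simple scripts for the pre-recovery backup; I hope they help.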

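Which files need to be backed up
Anyone familiar with database recovery knows that an abnormal recovery mainly modifies data in the SYSTEM tablespace, the other datafiles, the redo logs, and the control files (blocks inside the other datafiles can of course also change because of redo and undo). If backup time and backup space allow, it is best to back up all of these files.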

Files for a complete backup

set lines 150
set pages 10000
select name from v$datafile
union all
select name from v$controlfile
union all
select member from v$logfile;

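In some cases, for example when a full backup would take too long or there is not enough backup space, how should we back up so as to limit the damage an abnormal recovery can do to the original environment? Back up the core pieces: the SYSTEM tablespace, the datafile headers, the redo files, and the control files. Because this is not a plain file copy, generate the restore statements at the same time as the backup statements. Never generate backup statements without the matching restore statements, or putting the failure scene back later becomes much harder.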

Pre-recovery backup on Linux/Unix when a full backup is not possible

set lines 150
set pages 10000
select 'dd if='||name||' of=&&back_dir/'||ts#||'_'||file#||'.dbf bs=1048576 count=10'
from v$datafile where ts#<>0
union all
select 'dd if='||name||' of=&&back_dir/'||ts#||'_'||file#||'.dbf' from v$datafile where ts#=0
union all
select 'dd if='||name||' of=&&back_dir/control0'||rownum||'.ctl' from v$controlfile
union all
select 'dd if='||member||' of=&&back_dir/'||thread#||'_'||a.group#||'_'||sequence#||'_'||substr(member,
instr(member,'/',-1)+1)  FROM v$log a, v$logfile b WHERE a.group# = B.GROUP#;

Restoring from that backup on Linux/Unix when a full backup was not possible

set lines 150
set pages 1000
select 'dd of='||name||' if=&&back_dir/'||ts#||'_'||file#||'.dbf bs=1048576 count=10 conv=notrunc'
from v$datafile where ts#<>0
union all
select 'dd of='||name||' if=&&back_dir/'||ts#||'_'||file#||'.dbf' from v$datafile where ts#=0
union all
select 'dd of='||name||' if=&&back_dir/control0'||rownum||'.ctl' from v$controlfile
union all
select 'dd of='||member||' if=&&back_dir/'||thread#||'_'||a.group#||'_'||sequence#||'_'||substr(member,
instr(member,'/',-1)+1)    FROM v$log a, v$logfile b WHERE a.group# = B.GROUP#;
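
The rule behind the two scripts above is that every backup command needs a matching restore command with if= and of= swapped, and that a partial copy must be written back with conv=notrunc. A minimal sketch of that pairing for datafiles, mirroring the SQL (SYSTEM copied whole, other files only the first 10 MB); the function name and the paths in the usage lines are hypothetical:

```python
def dd_pair(name, ts, file_no, back_dir):
    """Return (backup, restore) dd commands for one datafile.

    SYSTEM (ts# = 0) is copied whole; other files keep only the first 10 MiB.
    """
    copy = "{}/{}_{}.dbf".format(back_dir, ts, file_no)
    if ts == 0:
        return ("dd if={} of={}".format(name, copy),
                "dd of={} if={}".format(name, copy))
    head = "bs=1048576 count=10"
    return ("dd if={} of={} {}".format(name, copy, head),
            # conv=notrunc keeps the rest of the datafile intact on restore
            "dd of={} if={} {} conv=notrunc".format(name, copy, head))


if __name__ == "__main__":
    for cmd in dd_pair("/u01/oradata/users01.dbf", 4, 4, "/backup"):
        print(cmd)
```

Building both strings in one place makes it hard to end up with a restore line that still starts with dd if= on the datafile.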

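Windows uses a different path separator (\ instead of /), so these are the backup statements for Windows when a full backup is not possible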

set lines 150
set pages 10000
select 'dd if='||name||' of=&&back_dir\'||ts#||'_'||file#||'.dbf bs=1048576 count=10'
from v$datafile where ts#<>0
union all
select 'dd if='||name||' of=&&back_dir\'||ts#||'_'||file#||'.dbf' from v$datafile where ts#=0
union all
select 'dd if='||name||' of=&&back_dir\control0'||rownum||'.ctl' from v$controlfile
union all
select 'dd if='||member||' of=&&back_dir\'||thread#||'_'||a.group#||'_'||sequence#||'_'||substr(member,
instr(member,'\',-1)+1)   FROM v$log a, v$logfile b WHERE a.group# = B.GROUP#;

Restore statements for Windows when a full backup was not possible

set lines 150
set pages 1000
select 'dd of='||name||' if=&&back_dir\'||ts#||'_'||file#||'.dbf bs=1048576 count=10 conv=notrunc'
from v$datafile where ts#<>0
union all
select 'dd of='||name||' if=&&back_dir\'||ts#||'_'||file#||'.dbf' from v$datafile where ts#=0
union all
select 'dd of='||name||' if=&&back_dir\control0'||rownum||'.ctl' from v$controlfile
union all
select 'dd of='||member||' if=&&back_dir\'||thread#||'_'||a.group#||'_'||sequence#||'_'||substr(member,
instr(member,'\',-1)+1)    FROM v$log a, v$logfile b WHERE a.group# = B.GROUP#;

A dd program for Windows is available here: dd command tool for Windows

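Note: do not follow this article when backing up before recovering a damaged ASM environment. See the follow-up article instead: Recommendations for backing up and preserving the scene before an Oracle abnormal recovery: ASM environment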

Using bbed to fix an ORA-8102 error on the I_OBJ4 index

Creating a table in the database fails with ORA-8102

SQL> startup
ORACLE instance started.
Total System Global Area 1570009088 bytes
Fixed Size                  2253584 bytes
Variable Size             469765360 bytes
Database Buffers         1090519040 bytes
Redo Buffers                7471104 bytes
Database mounted.
Database opened.
SQL> create table t1 as select * from dba_users;
create table t1 as select * from dba_users
                                 *
ERROR at line 1:
ORA-00604: error occurred at recursive SQL level 1
ORA-08102: index key not found, obj# 87404, file 1, block 97266 (2)

Analyzing the ORA-8102 error

SQL> col OBJECT_NAME for a30
SQL> select object_name,object_type from dba_objects where object_id=87404;
OBJECT_NAME                    OBJECT_TYPE
------------------------------ -------------------
I_OBJ4                         INDEX
SQL> select /*+ index(t i_obj4) */ DATAOBJ#,type#,owner# from obj$  t
  2  minus
  3  select /*+ full(t1) */ DATAOBJ#,type#,owner# from obj$  t1;
  DATAOBJ#      TYPE#     OWNER#
---------- ---------- ----------
     87420          0          0
SQL> select /*+ full(t1) */ DATAOBJ#,type#,owner# from obj$  t1
  2  minus
  3  select /*+ index(t i_obj4) */ DATAOBJ#,type#,owner# from obj$  t
  4  ;
  DATAOBJ#      TYPE#     OWNER#
---------- ---------- ----------
     87422          0          0
SQL> alter system dump datafile 1 block 97266;
System altered.
SQL>  select value from v$diag_info where name='Default Trace File';
VALUE
--------------------------------------------------------------------------------
/u01/app/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_ora_27037.trc
SQL> ALTER SESSION SET EVENTS '8102 trace name errorstack level 3';
Session altered.
SQL> create table t1 as select * from dual;
create table t1 as select * from dual
                                 *
ERROR at line 1:
ORA-00604: error occurred at recursive SQL level 1
ORA-08102: index key not found, obj# 87404, file 1, block 97266 (2)
SQL>  select value from v$diag_info where name='Default Trace File';
VALUE
--------------------------------------------------------------------------------
/u01/app/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_ora_27037.trc
*** 2015-03-14 14:46:33.640
kdk key 8102.2:
  ncol: 4, len: 16
  key: (16):  04 c3 09 4b 17 01 80 01 80 06 00 41 7f 25 00 28
  mask: (4096):
*** 2015-03-14 14:46:33.644
dbkedDefDump(): Starting a non-incident diagnostic dump (flags=0x0, level=3, mask=0x0)
----- Error Stack Dump -----
----- Current SQL Statement for this session (sql_id=4yyb4104skrwj) -----
update obj$ set obj#=:4, type#=:5,ctime=:6,mtime=:7,stime=:8,status=:9,dataobj#=:10,flags=:11,
oid$=:12,spare1=:13, spare2=:14 where owner#=:1 and name=:2 and namespace=:3 and
remoteowner is null and linkname is null and subname is null

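This shows that dataobj# is 87422 in obj$ but 87420 in i_obj4, so the two are inconsistent.
The trace also shows that creating a table issues a recursive update of obj$. That update modifies dataobj#, and because the value does not match between the table and the index, ORA-08102 is raised and the table cannot be created.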
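
The key bytes in the trace (04 c3 09 4b 17 ...) can be read by hand. The sketch below assumes the usual layout of such a key, one length byte followed by the column data, and Oracle's internal NUMBER format for non-negative integers (first byte minus 193 is the base-100 exponent, each following byte is a base-100 digit plus 1, a lone 0x80 is zero). The helper names are made up for illustration.

```python
def split_index_key(key):
    """Split a dumped index key into columns: one length byte, then the data."""
    cols, pos = [], 0
    while pos < len(key):
        length = key[pos]
        cols.append(key[pos + 1:pos + 1 + length])
        pos += 1 + length
    return cols


def decode_number(raw):
    """Decode a non-negative integer from Oracle's internal NUMBER bytes."""
    if raw == b"\x80":              # a lone 0x80 byte encodes zero
        return 0
    exponent = raw[0] - 193         # base-100 exponent of the first digit
    digits = [b - 1 for b in raw[1:]]
    if raw[0] < 193 or len(digits) > exponent + 1:
        raise ValueError("not a non-negative integer")
    return sum(d * 100 ** (exponent - i) for i, d in enumerate(digits))


if __name__ == "__main__":
    key = bytes.fromhex("04 c3 09 4b 17 01 80 01 80 06 00 41 7f 25 00 28")
    cols = split_index_key(key)
    print([decode_number(c) for c in cols[:3]])   # [87422, 0, 0]
    print(cols[3].hex())                          # 00417f250028 (rowid bytes)
    print(decode_number(bytes.fromhex("c3094b15")))  # 87420
```

Read this way, the key Oracle is looking for is (dataobj#, type#, owner#) = (87422, 0, 0), while the index block holds c3 09 4b 15, that is 87420, for the same rowid bytes. This is why a single byte change from 15 to 17 in the index entry is enough.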

Fixing ORA-8102 with bbed

[oracle@localhost ~]$ bbed blocksize=8192 mode=edit filename='/u01/app/oracle/oradata/xifenfei/system01.dbf'
Password:
BBED: Release 2.0.0.0.0 - Limited Production on Sat Mar 14 14:55:22 2015
Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
************* !!! For Oracle Internal Use only !!! ***************
BBED> set block 97266
        BLOCK#          97266
BBED> f /x 04c3
 File: /u01/app/oracle/oradata/xifenfei/system01.dbf (0)
 Block: 97266            Offsets: 2714 to 3225           Dba:0x00000000
------------------------------------------------------------------------
 04c3094a 5f02c115 01800600 417f2500 0f000204 c3094b14 02c11501 80060041
 7f25000e 000204c3 094b1202 c1140180 0600417f 25000d00 0004c309 4b150180
 <32 bytes per line>
BBED> f
 File: /u01/app/oracle/oradata/xifenfei/system01.dbf (0)
 Block: 97266            Offsets: 2733 to 3244           Dba:0x00000000
------------------------------------------------------------------------
 04c3094b 1402c115 01800600 417f2500 0e000204 c3094b12 02c11401 80060041
 7f25000d 000004c3 094b1501 80018006 00417f25 00280100 04c3094b 10018001
 <32 bytes per line>
BBED> f
 File: /u01/app/oracle/oradata/xifenfei/system01.dbf (0)
 Block: 97266            Offsets: 2752 to 3263           Dba:0x00000000
------------------------------------------------------------------------
 04c3094b 1202c114 01800600 417f2500 0d000004 c3094b15 01800180 0600417f
 25002801 0004c309 4b100180 01800600 417f2500 28000004 c3094b08 02c10201
 <32 bytes per line>
BBED> f
 File: /u01/app/oracle/oradata/xifenfei/system01.dbf (0)
 Block: 97266            Offsets: 2771 to 3282           Dba:0x00000000
------------------------------------------------------------------------
 04c3094b 15018001 80060041 7f250028 010004c3 094b1001 80018006 00417f25
 00280000 04c3094b 0802c102 01800600 417f2500 24000004 c3094b09 02c10201
 <32 bytes per line>
BBED> f
 File: /u01/app/oracle/oradata/xifenfei/system01.dbf (0)
 Block: 97266            Offsets: 2789 to 3300           Dba:0x00000000
------------------------------------------------------------------------
 04c3094b 10018001 80060041 7f250028 000004c3 094b0802 c1020180 0600417f
 25002400 0004c309 4b0902c1 02018006 00417f25 00250000 04c3094b 0a02c103
 <32 bytes per line>
BBED> set count 32
        COUNT           32
BBED> set offset 2771
        OFFSET          2771
BBED> d
 File: /u01/app/oracle/oradata/xifenfei/system01.dbf (0)
 Block: 97266            Offsets: 2771 to 2802           Dba:0x00000000
------------------------------------------------------------------------
 04c3094b 15018001 80060041 7f250028 010004c3 094b1001 80018006 00417f25
 <32 bytes per line>
BBED> set offset +4
        OFFSET          2775
BBED> d
 File: /u01/app/oracle/oradata/xifenfei/system01.dbf (0)
 Block: 97266            Offsets: 2775 to 2806           Dba:0x00000000
------------------------------------------------------------------------
 15018001 80060041 7f250028 010004c3 094b1001 80018006 00417f25 00280000
 <32 bytes per line>
BBED> m /x 17
Warning: contents of previous BIFILE will be lost. Proceed? (Y/N) y
 File: /u01/app/oracle/oradata/xifenfei/system01.dbf (0)
 Block: 97266            Offsets: 2775 to 2806           Dba:0x00000000
------------------------------------------------------------------------
 17018001 80060041 7f250028 010004c3 094b1001 80018006 00417f25 00280000
 <32 bytes per line>
BBED> sum apply
Check value for File 0, Block 97266:
current = 0x7955, required = 0x7955
BBED> verify
DBVERIFY - Verification starting
FILE = /u01/app/oracle/oradata/xifenfei/system01.dbf
BLOCK = 97266
Block 97266 is corrupt
Corrupt block relative dba: 0x00417bf2 (file 0, block 97266)
Fractured block found during verification
Data in bad block:
 type: 6 format: 2 rdba: 0x00417bf2
 last change scn: 0x0000.00102ed8 seq: 0x1 flg: 0x06
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x2ed80602
 check value in block header: 0x7955
 computed block checksum: 0x0
DBVERIFY - Verification complete
Total Blocks Examined         : 1
Total Blocks Processed (Data) : 0
Total Blocks Failing   (Data) : 0
Total Blocks Processed (Index): 0
Total Blocks Failing   (Index): 0
Total Blocks Empty            : 0
Total Blocks Marked Corrupt   : 1
Total Blocks Influx           : 2
Message 531 not found;  product=RDBMS; facility=BBED
BBED> set offset 8188
        OFFSET          8188
BBED> d
 File: /u01/app/oracle/oradata/xifenfei/system01.dbf (0)
 Block: 97266            Offsets: 8188 to 8191           Dba:0x00000000
------------------------------------------------------------------------
 0206d82e
 <32 bytes per line>
BBED> m /x 01
 File: /u01/app/oracle/oradata/xifenfei/system01.dbf (0)
 Block: 97266            Offsets: 8188 to 8191           Dba:0x00000000
------------------------------------------------------------------------
 0106d82e
 <32 bytes per line>
BBED> sum
Check value for File 0, Block 97266:
current = 0x7955, required = 0x7956
BBED> sum apply
Check value for File 0, Block 97266:
current = 0x7956, required = 0x7956
BBED> verify
DBVERIFY - Verification starting
FILE = /u01/app/oracle/oradata/xifenfei/system01.dbf
BLOCK = 97266
DBVERIFY - Verification complete
Total Blocks Examined         : 1
Total Blocks Processed (Data) : 0
Total Blocks Failing   (Data) : 0
Total Blocks Processed (Index): 1
Total Blocks Failing   (Index): 0
Total Blocks Empty            : 0
Total Blocks Marked Corrupt   : 0
Total Blocks Influx           : 0
Message 531 not found;  product=RDBMS; facility=BBED

bbed was used to change the dataobj# value in i_obj4 so that it matches the corresponding value in obj$
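
In the session above, the first verify reported "Fractured block found during verification", and the block only passed after the byte at offset 8188 was changed from 02 to 01. The tail of a block is expected to repeat part of the header: the low two bytes of the SCN base, then the block type, then the sequence. A small sketch of that check using the values printed by dbverify; the function names are made up, and the little-endian byte order is an assumption based on how bbed displays the bytes on this platform.

```python
import struct


def expected_tail(scn_base, block_type, seq):
    """Tail value implied by the header: low 16 bits of SCN base, type, seq."""
    return ((scn_base & 0xFFFF) << 16) | (block_type << 8) | seq


def tail_from_dump(hex_bytes):
    """Interpret the 4 bytes dumped at offset 8188 as a little-endian ub4."""
    return struct.unpack("<I", bytes.fromhex(hex_bytes))[0]


if __name__ == "__main__":
    want = expected_tail(0x00102ED8, 6, 0x1)   # scn base, type, seq from dbverify
    print(hex(want))                           # 0x2ed80601
    print(hex(tail_from_dump("0206d82e")))     # 0x2ed80602, mismatch: fractured
    print(hex(tail_from_dump("0106d82e")))     # 0x2ed80601, matches the header
```

After the tail is corrected, the checksum no longer matches, which is why the session runs sum apply a second time before the final verify.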

Verify that ORA-8102 is fixed

SQL> shutdown abort
ORACLE instance shut down.
SQL> startup
ORACLE instance started.
Total System Global Area 1570009088 bytes
Fixed Size                  2253584 bytes
Variable Size             469765360 bytes
Database Buffers         1090519040 bytes
Redo Buffers                7471104 bytes
Database mounted.
Database opened.
SQL>  create table t1 as select * from dual;
Table created.

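After the index value was modified with bbed, the ORA-8102 problem was resolved and tables can be created again.
Companion article: Reproducing the ORA-08102 error on the I_OBJ4 index by modifying dataobj$ in obj$ with bbed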

A case of assorted ORA-600 errors caused by corrupt blocks and improper recovery

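A friend asked me to help with a database that would not open. One look at the alert log was startling: it contained every kind of ORA-600 error, and the database had clearly been through a lot.
After the failure the database was restarted and would not start. The main symptom was corrupt blocks, which triggered various ORA-600 and related errors.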

Mon Mar 02 16:09:27 2015
ALTER DATABASE OPEN
Beginning crash recovery of 1 threads
 parallel recovery started with 23 processes
Started redo scan
Completed redo scan
 read 962 KB redo, 256 data blocks need recovery
Started redo application at
 Thread 1: logseq 726, block 37343
Recovery of Online Redo Log: Thread 1 Group 3 Seq 726 Reading mem 0
  Mem# 0: /u01/app/oracle/oradata/oa/redo03.log
Mon Mar 02 16:09:27 2015
RECOVERY OF THREAD 1 STUCK AT BLOCK 1673 OF FILE 3
Completed redo application of 0.27MB
Mon Mar 02 16:09:27 2015
RECOVERY OF THREAD 1 STUCK AT BLOCK 3104 OF FILE 3
Mon Mar 02 16:09:27 2015
RECOVERY OF THREAD 1 STUCK AT BLOCK 3613 OF FILE 3
Mon Mar 02 16:09:28 2015
RECOVERY OF THREAD 1 STUCK AT BLOCK 272 OF FILE 3
Mon Mar 02 16:09:28 2015
RECOVERY OF THREAD 1 STUCK AT BLOCK 2512 OF FILE 3
Hex dump of (file 2, block 92889) in trace file /u01/app/oracle/diag/rdbms/oa/oa/trace/oa_dbw2_4158.trc
Corrupt block relative dba: 0x00816ad9 (file 2, block 92889)
Bad header found during preparing block for write
Data in bad block:
 type: 0 format: 0 rdba: 0x6ad90000
 last change scn: 0x0000.00c6a052 seq: 0x1 flg: 0x00
 spare1: 0x6 spare2: 0xa2 spare3: 0x5d7e
 consistency value in tail: 0xa0520001
 check value in block header: 0x0
 block checksum disabled
Mon Mar 02 16:09:28 2015
Errors in file /u01/app/oracle/diag/rdbms/oa/oa/trace/oa_p007_4196.trc  (incident=3833):
ORA-00600: internal error code, arguments: [4502], [1], [], [], [], [], [], [], [], [], [], []
Mon Mar 02 16:09:28 2015
Errors in file /u01/app/oracle/diag/rdbms/oa/oa/trace/oa_p013_4208.trc  (incident=3881):
ORA-00600: internal error code, arguments: [2037], [4259067], [4244307968], [159], [243], [0], [2162032704], [100728832], [], [], [], []
Slave exiting with ORA-1172 exception
Errors in file /u01/app/oracle/diag/rdbms/oa/oa/trace/oa_p009_4200.trc:
ORA-01172: recovery of thread 1 stuck at block 3613 of file 3
ORA-01151: use media recovery to recover block, restore backup if needed
Errors in file /u01/app/oracle/diag/rdbms/oa/oa/trace/oa_p001_4184.trc:
ORA-01172: recovery of thread 1 stuck at block 2512 of file 3
ORA-01151: use media recovery to recover block, restore backup if needed
Errors in file /u01/app/oracle/diag/rdbms/oa/oa/trace/oa_p021_4224.trc:
ORA-10388: parallel query server interrupt (failure)
Errors in file /u01/app/oracle/diag/rdbms/oa/oa/trace/oa_p021_4224.trc:
Errors in file /u01/app/oracle/diag/rdbms/oa/oa/trace/oa_dbw2_4158.trc  (incident=3697):
ORA-00600: internal error code, arguments: [kcbzpbuf_1], [4], [1], [], [], [], [], [], [], [], [], []
Incident details in: /u01/app/oracle/diag/rdbms/oa/oa/incident/incdir_3697/oa_dbw2_4158_i3697.trc
Exception [type: SIGSEGV, SI_KERNEL(general_protection)] [ADDR:0x0] [PC:0xD2DDB7, kcbs_shrink_pool()+705] [flags: 0x0, count: 1]
Errors in file /u01/app/oracle/diag/rdbms/oa/oa/trace/oa_mman_4152.trc  (incident=3673):
ORA-07445: exception encountered: core dump [kcbs_shrink_pool()+705] [SIGSEGV] [ADDR:0x0] [PC:0xD2DDB7] [SI_KERNEL(general_protection)] []
Incident details in: /u01/app/oracle/diag/rdbms/oa/oa/incident/incdir_3673/oa_mman_4152_i3673.trc
Errors in file /u01/app/oracle/diag/rdbms/oa/oa/trace/oa_dbw2_4158.trc:
Mon Mar 02 16:09:34 2015
Instance terminated by DBW2, pid = 4158

A second restart added a new error, ORA-00600 [17182]

Mon Mar 02 16:39:50 2015
Errors in file /u01/app/oracle/diag/rdbms/oa/oa/trace/oa_p002_4321.trc  (incident=4993):
ORA-00600: internal error code, arguments: [17182], [0x7F548C2BDBA8], [], [], [], [], [], [], [], [], [], []

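After some recovery attempts, the log reported the errors below.
They show mainly that an incomplete recovery was performed, and that the redo logs were probably renamed or their headers damaged, which produced this series of messages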

Beginning crash recovery of 1 threads
Started redo scan
Completed redo scan
 read 962 KB redo, 256 data blocks need recovery
Started redo application at
 Thread 1: logseq 726, block 37343
Recovery of Online Redo Log: Thread 1 Group 3 Seq 726 Reading mem 0
  Mem# 0: /u01/app/oracle/oradata/oa/redo03.log
RECOVERY OF THREAD 1 STUCK AT BLOCK 1673 OF FILE 3
Aborting crash recovery due to error 1172
Errors in file /u01/app/oracle/diag/rdbms/oa/oa/trace/oa_ora_6644.trc:
ORA-01172: recovery of thread 1 stuck at block 1673 of file 3
ORA-01151: use media recovery to recover block, restore backup if needed
Errors in file /u01/app/oracle/diag/rdbms/oa/oa/trace/oa_ora_6644.trc:
ORA-01172: recovery of thread 1 stuck at block 1673 of file 3
ORA-01151: use media recovery to recover block, restore backup if needed
ORA-1172 signalled during: alter  database open...
Tue Mar 03 11:17:59 2015
Sweep [inc][17178]: completed
Sweep [inc][17177]: completed
Sweep [inc2][17178]: completed
Tue Mar 03 11:18:00 2015
ALTER DATABASE RECOVER  database until cancel
Media Recovery Start
 started logmerger process
Parallel Media Recovery started with 24 slaves
ORA-279 signalled during: ALTER DATABASE RECOVER  database until cancel  ...
ALTER DATABASE RECOVER    CONTINUE DEFAULT
Tue Mar 03 11:18:06 2015
Errors in file /u01/app/oracle/diag/rdbms/oa/oa/trace/oa_pr00_6701.trc:
ORA-00266: name of archived log file needed
ORA-266 signalled during: ALTER DATABASE RECOVER    CONTINUE DEFAULT  ...
ALTER DATABASE RECOVER CANCEL
Errors in file /u01/app/oracle/diag/rdbms/oa/oa/trace/oa_pr00_6701.trc:
ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1: '/u01/app/oracle/oradata/oa/system01.dbf'
Slave exiting with ORA-1547 exception
Errors in file /u01/app/oracle/diag/rdbms/oa/oa/trace/oa_pr00_6701.trc:
ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1: '/u01/app/oracle/oradata/oa/system01.dbf'
ORA-10879 signalled during: ALTER DATABASE RECOVER CANCEL ...
Tue Mar 03 11:18:06 2015
Checker run found 4 new persistent data failures
Tue Mar 03 11:18:13 2015
alter database open resetlogs
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
RESETLOGS after incomplete recovery UNTIL CHANGE 12986989
Resetting resetlogs activation ID 3278679642 (0xc36cae5a)
Errors in file /u01/app/oracle/diag/rdbms/oa/oa/trace/oa_ora_6644.trc:
ORA-00367: checksum error in log file header
ORA-00322: log 1 of thread 1 is not current copy
ORA-00312: online log 1 thread 1: '/u01/app/oracle/oradata/oa/redo01.log'
Errors in file /u01/app/oracle/diag/rdbms/oa/oa/trace/oa_ora_6644.trc:

After further attempts, with _allow_resetlogs_corruption=TRUE set, the database reported ORA-600 [2662]

Tue Mar 03 11:19:26 2015
SMON: enabling cache recovery
Errors in file /u01/app/oracle/diag/rdbms/oa/oa/trace/oa_ora_6864.trc  (incident=18195):
ORA-00600: internal error code, arguments: [2662], [0], [13007002], [0], [13016626], [4194545], [], [], [], [], [], []
Incident details in: /u01/app/oracle/diag/rdbms/oa/oa/incident/incdir_18195/oa_ora_6864_i18195.trc
Errors in file /u01/app/oracle/diag/rdbms/oa/oa/trace/oa_ora_6864.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00600: internal error code, arguments: [2662], [0], [13007002], [0], [13016626], [4194545], [], [], [], [], [], []
Error 704 happened during db open, shutting down database
USER (ospid: 6864): terminating the instance due to error 704
Instance terminated by USER, pid = 6864
ORA-1092 signalled during: alter database open...
opiodr aborting process unknown ospid (6864) as a result of ORA-1092
Tue Mar 03 11:19:29 2015
ORA-1092 : opitsk aborting process

After still more attempts, the undo datafile had evidently been taken offline and could not be accessed, so the database reported ORA-704 and ORA-00376

Wed Mar 04 21:10:58 2015
SMON: enabling cache recovery
Errors in file /u01/app/oracle/diag/rdbms/oa/oa/trace/oa_ora_17074.trc:
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 2
ORA-00376: file 3 cannot be read at this time
ORA-01110: data file 3: '/u01/app/oracle/oradata/oa/undotbs01.dbf'
Errors in file /u01/app/oracle/diag/rdbms/oa/oa/trace/oa_ora_17074.trc:
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 2
ORA-00376: file 3 cannot be read at this time
ORA-01110: data file 3: '/u01/app/oracle/oradata/oa/undotbs01.dbf'
Error 704 happened during db open, shutting down database
USER (ospid: 17074): terminating the instance due to error 704
Instance terminated by USER, pid = 17074
ORA-1092 signalled during: alter database open...
opiodr aborting process unknown ospid (17074) as a result of ORA-1092
Wed Mar 04 21:11:00 2015
ORA-1092 : opitsk aborting process

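The results from the Oracle Database Recovery Check script are in the attachment (xifenfei_db_recover_20150304). They show that the undo datafile, after whatever had been done to it, had a comparatively high SCN and was also offline.
After the database SCN was adjusted by several methods (bbed, hidden parameters, and so on) and the database was forced open, the following errors were reported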

Wed Mar 04 22:50:23 2015
SMON: enabling cache recovery
ORA-01555 caused by SQL statement below (SQL ID: 3nkd3g3ju5ph1, SCN: 0x0000.4000003e):
select obj#,type#,ctime,mtime,stime, status, dataobj#, flags, oid$, spare1, spare2 from obj$ where owner#=:1 and name=:2 and namespace=:3 and remoteowner is null and linkname is null and subname is null
Errors in file /u01/app/oracle/diag/rdbms/oa/oa/trace/oa_ora_17807.trc:
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 2
ORA-01555: snapshot too old: rollback segment number 10 with name "_SYSSMU10_3550978943$" too small
Errors in file /u01/app/oracle/diag/rdbms/oa/oa/trace/oa_ora_17807.trc:
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 2
ORA-01555: snapshot too old: rollback segment number 10 with name "_SYSSMU10_3550978943$" too small
Error 704 happened during db open, shutting down database
USER (ospid: 17807): terminating the instance due to error 704
Instance terminated by USER, pid = 17807
ORA-1092 signalled during: alter database open resetlogs...
opiodr aborting process unknown ospid (17807) as a result of ORA-1092

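From experience, this error suggested that the file header SCN was not high enough and that delayed block cleanout was involved. I raised the SCN further and tried again, but the result was still ORA-00704/ORA-00604/ORA-01555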

Wed Mar 04 22:50:23 2015
SMON: enabling cache recovery
ORA-01555 caused by SQL statement below (SQL ID: 3nkd3g3ju5ph1, SCN: 0x0000.4000003e):
select obj#,type#,ctime,mtime,stime, status, dataobj#, flags, oid$, spare1, spare2 from obj$ where owner#=:1 and name=:2 and namespace=:3 and remoteowner is null and linkname is null and subname is null
Errors in file /u01/app/oracle/diag/rdbms/oa/oa/trace/oa_ora_17807.trc:
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 2
ORA-01555: snapshot too old: rollback segment number 10 with name "_SYSSMU10_3550978943$" too small
Errors in file /u01/app/oracle/diag/rdbms/oa/oa/trace/oa_ora_17807.trc:
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 2
ORA-01555: snapshot too old: rollback segment number 10 with name "_SYSSMU10_3550978943$" too small
Error 704 happened during db open, shutting down database
USER (ospid: 17807): terminating the instance due to error 704
Instance terminated by USER, pid = 17807
ORA-1092 signalled during: alter database open resetlogs...
opiodr aborting process unknown ospid (17807) as a result of ORA-1092

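Experience suggested that manipulating the SCN was unlikely to solve this problem, so I traced the startup with 10046 and errorstack and found the following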

PARSING IN CURSOR #3 len=202 dep=2 uid=0 oct=3 lid=0 tim=1425481940448439 hv=3819099649 ad='64ff91af8' sqlid='3nkd3g3ju5ph1'
select obj#,type#,ctime,mtime,stime, status, dataobj#, flags, oid$, spare1, spare2 from obj$ where owner#=:1 and name=:2 and namespace=:3 and remoteowner is null and linkname is null and subname is null
END OF STMT
PARSE #3:c=1000,e=334,p=0,cr=0,cu=0,mis=1,r=0,dep=2,og=4,plh=0,tim=1425481940448439
BINDS #3:
 Bind#0
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=7f5b3253a6f0  bln=22  avl=01  flg=05
  value=0
 Bind#1
  oacdty=01 mxl=32(06) mxlc=00 mal=00 scl=00 pre=00
  oacflg=18 fl2=0001 frm=01 csi=852 siz=32 off=0
  kxsbbbfp=7f5b3253a6b8  bln=32  avl=06  flg=05
  value="PROPS$"
 Bind#2
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=7f5b3253a688  bln=24  avl=02  flg=05
  value=1
EXEC #3:c=0,e=640,p=0,cr=0,cu=0,mis=1,r=0,dep=2,og=4,plh=2853959010,tim=1425481940449147
WAIT #3: nam='db file sequential read' ela= 5 file#=1 block#=345 blocks=1 obj#=37 tim=1425481940449186
WAIT #3: nam='db file sequential read' ela= 4 file#=1 block#=44528 blocks=1 obj#=37 tim=1425481940449221
WAIT #3: nam='db file sequential read' ela= 3 file#=1 block#=5505 blocks=1 obj#=37 tim=1425481940449247
*** 2015-03-04 23:12:20.450
dbkedDefDump(): Starting a non-incident diagnostic dump (flags=0x0, level=3, mask=0x0)
----- Error Stack Dump -----
ORA-00604: error occurred at recursive SQL level 2
ORA-01555: snapshot too old: rollback segment number 10 with name "_SYSSMU10_3550978943$" too small
----- Current SQL Statement for this session (sql_id=g64r07v2jn8nq) -----
SELECT NULL FROM PROPS$ WHERE NAME='BOOTSTRAP_UPGRADE_ERROR'

This shows that during startup the database has to execute SELECT NULL FROM PROPS$ WHERE NAME='BOOTSTRAP_UPGRADE_ERROR', and that this statement recursively calls select obj#,type#,ctime,mtime,stime, status, dataobj#, flags, oid$, spare1, spare2 from obj$ where owner#=:1 and name=:2 and namespace=:3 and remoteowner is null and linkname is null and subname is null. Given that, I used some methods to keep the database from running SELECT NULL FROM PROPS$ WHERE NAME='BOOTSTRAP_UPGRADE_ERROR' at startup, and the database did open successfully.

Background notes
ORA-600 [4502] [a]

Arg [a] ITL entry with a lock count
Meaning: During ITL cleanout we clear all row locks but the ITL entry
	 still thinks there is an uncleared lock. Ie: ITL has a locked
	 row but there are no locked rows in the block

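In short, ITL cleanout has cleared all row locks in the block, but the ITL entry still records a lock count, so ORA-600 [4502] is raised. Apart from bugs, the main cause is a corrupt block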

ORA-600 [2037] [a] [b] {c} [d] [e] [f] [g]

Arg [a] Relative Data Block Address (RDBA) that the redo vector is for
Arg [b] The Block format
Arg {c} RDBA in the block itself
Arg [d] The block type
Arg [e] The sequence number
Arg [f] Flags, if set
Arg [g] The return value from the block head/tail checker.
DESCRIPTION:
  During recovery we are examining a block to ensure that it is not
  corrupt prior to applying any change vectors.
  The block has failed this check and this exception is raised

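In short, during recovery each block is checked to make sure it is not corrupt before any change vectors are applied to it. If the check fails, ORA-600 [2037] is raised. Apart from bugs, the main cause is a corrupt block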

ORA-600 [kcbzpbuf_1],[a],[b]

Arg [a] Corruption reason
Arg [b] Calculate checksum flag
Corruption reason:
#define KCBH_GOOD    0                                     /* block is valid */
#define KCBH_ZERO    1             /* block header was entirely zero on disk */
#define KCBH_BROKEN  2      /* corruption could be from a partial disk write */
#define KCBH_CHKVAL  3               /* The check value for the block failed */
#define KCBH_CORRUPT 4     /* this is the wrong block or is not a data block */
#define KCBH_ZERONG  5               /* all zero block and it is not allowed */
Calculate checksum flag:
The possible values are 1 (Generate Checksum - db_block_checksum is enabled - default value)
                        0 (do not generate checksum - db_block_checksum=false)

kcbzpbuf_1 is the source-code function that raises this error
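
Applied to the alert log in this case, which shows ORA-00600 [kcbzpbuf_1], [4], [1], the two arguments can be looked up directly in the tables above. A small sketch of that lookup; the dictionary and function names are made up, and the texts are copied from the note above.

```python
KCBH_REASONS = {
    0: ("KCBH_GOOD", "block is valid"),
    1: ("KCBH_ZERO", "block header was entirely zero on disk"),
    2: ("KCBH_BROKEN", "corruption could be from a partial disk write"),
    3: ("KCBH_CHKVAL", "the check value for the block failed"),
    4: ("KCBH_CORRUPT", "this is the wrong block or is not a data block"),
    5: ("KCBH_ZERONG", "all zero block and it is not allowed"),
}

CHECKSUM_FLAGS = {
    1: "generate checksum (db_block_checksum enabled)",
    0: "do not generate checksum (db_block_checksum=false)",
}


def explain_kcbzpbuf_1(reason, checksum_flag):
    """Describe the two arguments of ORA-600 [kcbzpbuf_1]."""
    name, meaning = KCBH_REASONS[reason]
    return "{}: {}; {}".format(name, meaning, CHECKSUM_FLAGS[checksum_flag])


if __name__ == "__main__":
    print(explain_kcbzpbuf_1(4, 1))
```

For [4], [1] this gives KCBH_CORRUPT, which fits the "Bad header found during preparing block for write" message that DBW2 logged for file 2, block 92889.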

ORA-600 [17182] [a] [b] {c} [d] [e]

DESCRIPTION:
  Oracle has detected that the magic number in a memory chunk header has been overwritten.
  This is a heap (in memory) corruption and there is no underlying data corruption.
  The error may occur in the one of the process specific heaps
  (the Call heap, PGA heap, or session heap) or in the shared heap (SGA).

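Oracle found that the magic number in a memory chunk header had been overwritten. This does not mean the underlying data is corrupt; in most cases it is related to data block or memory corruption.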

ORA-600 [4552] [a] [b] {c} [d] [e]

DESCRIPTION:
  This assertion is raised because we are trying to unlock the rows in a
  block, but receive an incorrect block type.
  The second argument is the block type received.

Oracle tried to unlock rows in a block but received an incorrect block type; Arg [b] is the block type received

ORA-600 [2662] [a] [b] {c} [d] [e]

DESCRIPTION:
  A data block SCN is ahead of the current SCN.
  The ORA-600 [2662] occurs when an SCN is compared to the dependent SCN
  stored in a UGA variable.
  If the SCN is less than the dependent SCN then we signal the ORA-600 [2662]
  internal error.
ARGUMENTS:
  Arg [a]  Current SCN WRAP
  Arg [b]  Current SCN BASE
  Arg {c}  dependent SCN WRAP
  Arg [d]  dependent SCN BASE
  Arg [e]  Where present this is the DBA where the dependent SCN came from.

In essence, the problem occurs because the Oracle file header SCN is lower than the dependent SCN of some block
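
With the argument list above, the ORA-600 [2662] line from this case, [0], [13007002], [0], [13016626], [4194545], can be read directly. The sketch below combines wrap and base into one SCN and splits the DBA into file and block, assuming the common 10-bit relative file number and 22-bit block number layout; the function names are made up for illustration.

```python
def scn_value(wrap, base):
    """Combine SCN wrap and base into a single number."""
    return (wrap << 32) | base


def decode_dba(dba):
    """Split a 32-bit data block address into (relative file#, block#)."""
    return dba >> 22, dba & 0x3FFFFF


def explain_2662(cur_wrap, cur_base, dep_wrap, dep_base, dba):
    """Summarize how far the current SCN lags and which block is involved."""
    gap = scn_value(dep_wrap, dep_base) - scn_value(cur_wrap, cur_base)
    file_no, block_no = decode_dba(dba)
    return {"gap": gap, "file": file_no, "block": block_no}


if __name__ == "__main__":
    print(explain_2662(0, 13007002, 0, 13016626, 4194545))
    # {'gap': 9624, 'file': 1, 'block': 241}
    print(decode_dba(0x00816AD9))   # (2, 92889), the corrupt block in the alert log
```

Here the current SCN is 9624 behind the dependent SCN found in file 1, block 241, which is the kind of gap that SCN adjustments try to close.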

Version 11.1.0.7 can also lose the access$ table and fail to start

A reader reported that database startup failed with ORA-01092 (ORACLE instance terminated. Disconnection forced) and asked for help.
Database version

Trace file d:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_ora_5648.trc
Oracle Database 11g Enterprise Edition Release 11.1.0.7.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Windows NT Version V6.1 Service Pack 1
CPU                 : 1 - type 8664, 1 Physical Cores
Process Affinity    : 0x0000000000000000
Memory (Avail/Total): Ph:7605M/10239M, Ph+PgF:11979M/20477M
Instance name: orcl
Redo thread mounted by this instance: 1
Oracle process number: 18
Windows thread id: 5648, image: ORACLE.EXE (SHAD)

Opening the database fails with ORA-01092: ORACLE instance terminated. Disconnection forced

SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-01092: ORACLE instance terminated. Disconnection forced

Alert log

Thread 1 opened at log sequence 1008
  Current log# 3 seq# 1008 mem# 0: D:\APP\ADMINISTRATOR\ORADATA\ORCL\REDO03.LOG
Successful open of redo thread 1
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
SMON: enabling cache recovery
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_ora_3964.trc:
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 1
ORA-00942: table or view does not exist
Error 704 happened during db open, shutting down database
USER (ospid: 3964): terminating the instance due to error 704
Instance terminated by USER, pid = 3964
ORA-1092 signalled during: ALTER DATABASE OPEN...
ORA-1092 : opiodr aborting process unknown ospid (3384_3964)

10046 trace taken for analysis

PARSE ERROR #1:len=56 dep=1 uid=0 oct=3 lid=0 tim=1796038335 err=942
select order#,columns,types from access$ where d_obj#=:1
*** 2015-01-27 21:24:50.794
----- Error Stack Dump -----
ORA-00604: error occurred at recursive SQL level 1
ORA-00942: table or view does not exist

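The trace shows that during startup the database could not access the access$ table, which raised ORA-00942. Because that SQL is issued internally (recursively) by the database, ORA-00604 is reported as well.
The cause is BUG:12733463 - ORA-704, ORA-604 AND ORA-942 ON TABLE ACCESS$ DURING STARTUP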
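Finding the failing recursive statement in a large 10046 trace can be scripted. The following is a minimal Python sketch, assuming the trace uses the same layout as the excerpt above (a `PARSE ERROR #n:` line with `key=value` fields, followed by the SQL text on the next line). The function name `find_parse_errors` is hypothetical.

```python
import re

# Matches lines such as:
# PARSE ERROR #1:len=56 dep=1 uid=0 oct=3 lid=0 tim=1796038335 err=942
PARSE_ERROR_RE = re.compile(r"^PARSE ERROR #(\d+):(.*)$")


def find_parse_errors(trace_lines):
    """Return one dict per PARSE ERROR entry, with the SQL text that follows it."""
    lines = [line.rstrip("\n") for line in trace_lines]
    errors = []
    for i, line in enumerate(lines):
        match = PARSE_ERROR_RE.match(line.strip())
        if not match:
            continue
        # Split "len=56 dep=1 ..." into a {"len": "56", "dep": "1", ...} mapping.
        fields = dict(
            pair.split("=", 1) for pair in match.group(2).split() if "=" in pair
        )
        sql = lines[i + 1].strip() if i + 1 < len(lines) else ""
        errors.append(
            {
                "cursor": int(match.group(1)),
                "dep": int(fields.get("dep", -1)),  # recursion depth, -1 if absent
                "err": int(fields.get("err", -1)),  # ORA error number, -1 if absent
                "sql": sql,
            }
        )
    return errors


if __name__ == "__main__":
    sample = [
        "PARSE ERROR #1:len=56 dep=1 uid=0 oct=3 lid=0 tim=1796038335 err=942",
        "select order#,columns,types from access$ where d_obj#=:1",
    ]
    for entry in find_parse_errors(sample):
        print(entry)
```

An entry with `dep` greater than 0 and `err` equal to 942 points at a recursive statement against a dictionary object that cannot be found, which is the situation in this case.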
Workaround provided by Oracle

1. Shutdown (abort) the instance and clean up any OS structures used by the instance.
    Eg: Ensure there is no shared memory, semaphores etc.. left lying around
2. Retry the startup.
3. If the error persists try and recover the database or recover from a backup.

Xifenfei's fix

-- Run in SQL*Plus as SYSDBA: start the instance in upgrade mode,
-- then recreate the missing access$ dictionary table and its index.
startup upgrade

create table access$
( d_obj#        number not null,
  order#        number not null,
  columns       raw(126),
  types         number not null)
  storage (initial 10k next 100k maxextents unlimited pctincrease 0)
/
create index i_access1 on
  access$(d_obj#, order#)
  storage (initial 10k next 100k maxextents unlimited pctincrease 0)
/

Earlier related article: Summary of Oracle abnormal recovery cases (Oracle 异常恢复案例汇总)

ORA-600 kcm_headroom_warn_1 after a host power failure set the system clock back N years



A host power failure set the system time back 14 years, and database startup then failed with ORA-600 [kcm_headroom_warn_1].

Sat Jun 21 17:49:12 2014   ---correct system time
Instance shutdown complete
Mon Aug 07 06:13:28 2000   ---system time after reboot
Starting ORACLE instance (normal)
LICENSE_MAX_SESSION = 0
LICENSE_SESSIONS_WARNING = 0
Initial number of CPU is 64
Effective number of CPU for internal database sizing is 32
Number of processor cores in the system is 8
Number of processor sockets in the system is 1
CELL communication is configured to use 0 interface(s):
CELL IP affinity details:
    NUMA status: non-NUMA system
    cellaffinity.ora status: N/A
CELL communication will use 1 IP group(s):
    Grp 0:
Picked latch-free SCN scheme 3
Autotune of undo retention is turned on.
IMODE=BR
ILAT =264
LICENSE_MAX_USERS = 0
SYS auditing is disabled
Starting up:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options.
ORACLE_HOME = /ora1/prod/db/tech_st/11.2.0
System name:	SunOS
Node name:	erpdb1-boot
Release:	5.10
Version:	Generic_147147-26
Machine:	sun4v
Using parameter settings in server-side spfile /ora1/prod/db/tech_st/11.2.0/dbs/spfileprod.ora
System parameters with non-default values:
  processes                = 1200
  sessions                 = 2400
  timed_statistics         = TRUE
  event                    = ""
  shared_pool_size         = 448M
  shared_pool_reserved_size= 322122547
  nls_language             = "american"
  nls_territory            = "america"
  nls_sort                 = "binary"
  nls_date_format          = "DD-MON-RR"
  nls_numeric_characters   = ".,"
  nls_comp                 = "binary"
  nls_length_semantics     = "BYTE"
  sga_target               = 10G
  control_files            = "/data1/prod/db/apps_st/data/cntrl01.dbf"
  control_files            = "/data1/prod/db/apps_st/data/cntrl02.dbf"
  control_files            = "/data1/prod/db/apps_st/data/cntrl03.dbf"
  db_block_checksum        = "TRUE"
  db_block_size            = 8192
  compatible               = "11.1.0"
  log_archive_dest_1       = "location=/arch1/prod/arch"
  log_archive_format       = "prod_%t_%s_%r.arc"
  log_buffer               = 10485760
  log_checkpoint_interval  = 100000
  log_checkpoint_timeout   = 1200
  db_files                 = 512
  log_checkpoints_to_alert = TRUE
  dml_locks                = 10000
  undo_management          = "AUTO"
  undo_tablespace          = "APPS_UNDOTS1"
  db_block_checking        = "FALSE"
  _disable_fast_validate   = TRUE
  sec_case_sensitive_logon = FALSE
  session_cached_cursors   = 500
  utl_file_dir             = "/usr/tmp"
  plsql_code_type          = "INTERPRETED"
  plsql_optimize_level     = 2
  job_queue_processes      = 10
  _system_trig_enabled     = TRUE
  cursor_sharing           = "EXACT"
  parallel_min_servers     = 0
  parallel_max_servers     = 8
  audit_file_dest          = "/ora1/prod/db/tech_st/admin/prod/adump"
  db_name                  = "prod"
  open_cursors             = 3600
  _sort_elimination_cost_ratio= 5
  _b_tree_bitmap_plans     = FALSE
  _fast_full_scan_enabled  = FALSE
  query_rewrite_enabled    = "true"
  _like_with_bind_as_equality= TRUE
  pga_aggregate_target     = 2G
  workarea_size_policy     = "AUTO"
  _optimizer_autostats_job = FALSE
  optimizer_secure_view_merging= FALSE
  aq_tm_processes          = 4
  olap_page_pool_size      = 4M
  diagnostic_dest          = "/ora1/prod/db/tech_st/11.2.0/admin/prod_erpdb1"
  _trace_files_public      = TRUE
  max_dump_file_size       = "20480"
Mon Aug 07 06:13:30 2000
PMON started with pid=2, OS id=3556
Mon Aug 07 06:13:30 2000
PSP0 started with pid=3, OS id=3557
Mon Aug 07 06:13:31 2000
VKTM started with pid=4, OS id=3558 at elevated priority
VKTM running at (10)millisec precision with DBRM quantum (100)ms
Mon Aug 07 06:13:31 2000
GEN0 started with pid=5, OS id=3562
Mon Aug 07 06:13:32 2000
DIAG started with pid=6, OS id=3564
Mon Aug 07 06:13:32 2000
DBRM started with pid=7, OS id=3565
Mon Aug 07 06:13:32 2000
DIA0 started with pid=8, OS id=3566
Mon Aug 07 06:13:32 2000
MMAN started with pid=9, OS id=3567
Mon Aug 07 06:13:32 2000
DBW0 started with pid=10, OS id=3568
Mon Aug 07 06:13:32 2000
DBW1 started with pid=11, OS id=3569
Mon Aug 07 06:13:32 2000
DBW2 started with pid=12, OS id=3570
Mon Aug 07 06:13:32 2000
DBW3 started with pid=13, OS id=3571
Mon Aug 07 06:13:32 2000
LGWR started with pid=14, OS id=3572 at elevated priority
Mon Aug 07 06:13:32 2000
CKPT started with pid=15, OS id=3575
Mon Aug 07 06:13:32 2000
SMON started with pid=16, OS id=3576
Mon Aug 07 06:13:32 2000
RECO started with pid=17, OS id=3577
Mon Aug 07 06:13:32 2000
MMON started with pid=18, OS id=3578
Mon Aug 07 06:13:32 2000
MMNL started with pid=19, OS id=3579
ORACLE_BASE not set in environment. It is recommended
that ORACLE_BASE be set in the environment
Reusing ORACLE_BASE from an earlier startup = /ora1/prod/db/tech_st
Mon Aug 07 06:13:32 2000
ALTER DATABASE   MOUNT
Successful mount of redo thread 1, with mount id 4111810188
Database mounted in Exclusive Mode
Lost write protection disabled
Completed: ALTER DATABASE   MOUNT
Mon Aug 07 06:13:36 2000
ALTER DATABASE OPEN
************************************************************
Warning: The SCN headroom for this database is only -51464 hours!
************************************************************
Errors in file /ora1/prod/db/tech_st/11.2.0/admin/prod_erpdb1/diag/rdbms/prod/prod/trace/prod_ora_3583.trc  (incident=441878):
ORA-00600: internal error code, arguments: [kcm_headroom_warn_1], [], [], [], [], [], [], [], [], [], [], []
Incident details in: /ora1/prod/db/tech_st/11.2.0/admin/prod_erpdb1/diag/rdbms/prod/prod/incident/incdir_441878/prod_ora_3583_i441878.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file /ora1/prod/db/tech_st/11.2.0/admin/prod_erpdb1/diag/rdbms/prod/prod/trace/prod_ora_3583.trc:
ORA-00600: internal error code, arguments: [kcm_headroom_warn_1], [], [], [], [], [], [], [], [], [], [], []
ORA-600 signalled during: ALTER DATABASE OPEN...
Dumping diagnostic data in directory=[cdmp_20000807061339], requested by (instance=1, osid=3583), summary=[incident=441878].
Mon Aug 07 06:14:35 2000
Sweep [inc][441878]: completed
Sweep [inc2][441878]: completed
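The negative headroom in the warning above can be reproduced approximately. The Python sketch below assumes the widely cited rule that the maximum reasonable SCN grows by 16384 per second from 1988-01-01, with every month counted as 31 days. That formula is an assumption here, not something stated in this alert log, and the exact internal calculation may differ by version and patch level. The function names are hypothetical.

```python
from datetime import datetime

SCN_PER_SECOND = 16 * 1024  # assumed growth rate of the SCN ceiling


def max_reasonable_scn(now: datetime) -> int:
    """Approximate SCN ceiling at time `now` (31-day months, epoch 1988-01-01)."""
    seconds = (
        (now.year - 1988) * 12 * 31 * 86400
        + (now.month - 1) * 31 * 86400
        + (now.day - 1) * 86400
        + now.hour * 3600
        + now.minute * 60
        + now.second
    )
    return seconds * SCN_PER_SECOND


def scn_headroom_hours(current_scn: int, now: datetime) -> float:
    """Hours of headroom left; negative when the SCN is above the ceiling."""
    return (max_reasonable_scn(now) - current_scn) / SCN_PER_SECOND / 3600


if __name__ == "__main__":
    # Take the SCN ceiling at the last correct timestamp in the log as a
    # stand-in for the database SCN (hypothetical; the real SCN is not shown).
    scn = max_reasonable_scn(datetime(2014, 6, 21, 17, 49, 12))
    # Evaluate it against the clock the host came back with after the outage.
    print(scn_headroom_hours(scn, datetime(2000, 8, 7, 6, 13, 36)))
```

When the clock jumps back, the ceiling computed from the system time drops while the database SCN stays where it was, so the headroom turns negative.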

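When a database inexplicably fails to start, check the host's system time. Another such case: resolving an ORA-00600 [2252] failure (记录一次ORA-00600[2252]故障解决)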
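A backward clock jump like this one can be spotted by scanning the alert log timestamps. The following is a minimal Python sketch, assuming 11g-style timestamp lines such as `Mon Aug 07 06:13:28 2000` with English day and month names. The function name `find_clock_jumps` and the `tolerance_seconds` parameter are hypothetical.

```python
import re
from datetime import datetime

# Timestamp at the start of a line; anything after it (comments) is ignored.
TS_RE = re.compile(
    r"^(?:Mon|Tue|Wed|Thu|Fri|Sat|Sun) [A-Z][a-z]{2} \d{2} \d{2}:\d{2}:\d{2} \d{4}"
)


def find_clock_jumps(lines, tolerance_seconds=0):
    """Return (previous, current) timestamp pairs where time moved backwards."""
    jumps = []
    previous = None
    for line in lines:
        match = TS_RE.match(line)
        if not match:
            continue  # not a timestamp line
        current = datetime.strptime(match.group(0), "%a %b %d %H:%M:%S %Y")
        if previous is not None:
            if (previous - current).total_seconds() > tolerance_seconds:
                jumps.append((previous, current))
        previous = current
    return jumps


if __name__ == "__main__":
    sample = [
        "Sat Jun 21 17:49:12 2014",
        "Instance shutdown complete",
        "Mon Aug 07 06:13:28 2000",
        "Starting ORACLE instance (normal)",
    ]
    for before, after in find_clock_jumps(sample):
        print("clock moved backwards:", before, "->", after)
```

A small `tolerance_seconds` value can be used to ignore minor corrections such as NTP adjustments.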
