记录一次system表空间坏块(ORA-01578)数据库恢复

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:记录一次system表空间坏块(ORA-01578)数据库恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

半夜朋友打来求救电话,说xx医院his系统因为存储异常导致system坏块无法正常启动,因为是win平台无法使用bbed,无法修复system 坏块,请求技术支持
dbv检查system文件报坏块
1


对应具体地址为:file 1 block 39041和66738

判断控制文件异常
通过数据库恢复检查脚本(Oracle Database Recovery Check)脚本检测数据库发现控制文件明显异常(checkpoint scn)
2
3


尝试恢复数据库
4


因此对该库进行了不完全恢复,然后尝试resetlogs打开数据库,数据库报ORA-600 2662错误

Fri Aug 29 02:35:08 2014
alter database open resetlogs
Fri Aug 29 02:35:11 2014
RESETLOGS after complete recovery through change 451371288
Resetting resetlogs activation ID 1232269761 (0x4972f1c1)
Fri Aug 29 02:35:15 2014
Setting recovery target incarnation to 3
Fri Aug 29 02:35:15 2014
Assigning activation ID 1384652231 (0x52881dc7)
LGWR: STARTING ARCH PROCESSES
ARC0 started with pid=17, OS id=1084
Fri Aug 29 02:35:15 2014
ARC0: Archival started
ARC1: Archival started
LGWR: STARTING ARCH PROCESSES COMPLETE
ARC1 started with pid=18, OS id=2836
Fri Aug 29 02:35:15 2014
Thread 1 opened at log sequence 1
  Current log# 1 seq# 1 mem# 0: Z:\ORACLE\PRODUCT\10.2.0\ORCL\REDO01.LOG
Successful open of redo thread 1
Fri Aug 29 02:35:15 2014
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Fri Aug 29 02:35:15 2014
ARC1: Becoming the 'no FAL' ARCH
ARC1: Becoming the 'no SRL' ARCH
Fri Aug 29 02:35:15 2014
ARC0: Becoming the heartbeat ARCH
Fri Aug 29 02:35:15 2014
SMON: enabling cache recovery
Fri Aug 29 02:35:16 2014
Errors in file d:\oracle\product\10.2.0\admin\orcl\udump\orcl_ora_4824.trc:
ORA-00600: 内部错误代码, 参数: [2662], [0], [451371311], [0], [451374534], [8388977], [], []
Fri Aug 29 02:35:16 2014
Errors in file d:\oracle\product\10.2.0\admin\orcl\udump\orcl_ora_4824.trc:
ORA-00600: 内部错误代码, 参数: [2662], [0], [451371311], [0], [451374534], [8388977], [], []
Fri Aug 29 02:35:16 2014
Error 600 happened during db open, shutting down database
USER: terminating instance due to error 600
Fri Aug 29 02:35:17 2014
Errors in file d:\oracle\product\10.2.0\admin\orcl\bdump\orcl_smon_2928.trc:
ORA-00600: ??????, ??: [], [], [], [], [], [], [], []
Instance terminated by USER, pid = 4824
ORA-1092 signalled during: alter database open resetlogs...

ORA-600 2662 该错误解决思路很明显,推进scn,数据库报ORA-01578

Fri Aug 29 02:42:47 2014
SMON: enabling cache recovery
Fri Aug 29 02:42:47 2014
Successfully onlined Undo Tablespace 1.
Dictionary check beginning
Dictionary check complete
Fri Aug 29 02:42:49 2014
SMON: enabling tx recovery
Fri Aug 29 02:42:49 2014
Database Characterset is ZHS16GBK
Opening with internal Resource Manager plan
where NUMA PG = 1, CPUs = 16
replication_dependency_tracking turned off (no async multimaster replication found)
Fri Aug 29 02:42:50 2014
Errors in file d:\oracle\product\10.2.0\admin\orcl\bdump\orcl_smon_4804.trc:
ORA-00604: 递归 SQL 级别 1 出现错误
ORA-01578: ORACLE 数据块损坏 (文件号 1, 块号 39041)
ORA-01110: 数据文件 1: 'Z:\ORACLE\PRODUCT\10.2.0\ORCL\SYSTEM01.DBF'
Fri Aug 29 02:42:50 2014
LOGSTDBY: Validating controlfile with logical metadata
Fri Aug 29 02:42:51 2014
LOGSTDBY: Validation complete
ORA-604 signalled during: alter database open...

使用event跳过坏块,启动数据库成功

Fri Aug 29 02:48:59 2014
SMON: enabling cache recovery
Fri Aug 29 02:49:00 2014
Successfully onlined Undo Tablespace 1.
Fri Aug 29 02:49:00 2014
SMON: enabling tx recovery
Fri Aug 29 02:49:00 2014
Database Characterset is ZHS16GBK
Opening with internal Resource Manager plan
where NUMA PG = 1, CPUs = 16
replication_dependency_tracking turned off (no async multimaster replication found)
Starting background process QMNC
QMNC started with pid=34, OS id=3096
Fri Aug 29 02:49:01 2014
db_recovery_file_dest_size of 4096 MB is 0.00% used. This is a
user-specified limit on the amount of space that will be used by this
database for recovery-related files, and does not reflect the amount of
space available in the underlying filesystem or ASM diskgroup.
Fri Aug 29 02:49:01 2014
Completed: alter database open

查询坏块对象
5
6


因为这些对象均不是核心对象,直接进行truncate然后插入老数据

后续还有大量错误修复

ORA-12012: error on auto execute of job 1
ORA-08102: index key not found, obj# 239, file 1, block 1674 (2)
ORA-00600: 内部错误代码, 参数: [kcbz_check_objd_typ], [0], [0], [1], [], [], [], []
ORA-00600: internal error code, arguments: [6749], [3], [12606796], [173], [], [], [], []
ORA-00600: 内部错误代码, 参数: [13013], [52898], [52895], [38288618], [44], [38288618], [17], []
ORA-00600: 内部错误代码, 参数: [13013], [5001], [52895], [38286476], [5], [38286476], [17], []

再次说明,很多时候数据库恢复不要看成多神秘,就是几个参数搞定,更加不要神化有坏块就bbed修复,当然非常极端,使用N中工具,N种尝试的也存在.做好备份重于一切

ORA-600[6749] 发生在 SYSMAN.MGMT_METRICS_RAW表

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:ORA-600[6749] 发生在 SYSMAN.MGMT_METRICS_RAW表

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

数据库alert日志长时间出现ORA-00600[6749]错误
日志报错如下

Fri Jun  1 12:01:30 2012
Errors in file /opt/oracle/admin/oraapp/bdump/oraapp_j000_396.trc:
ORA-00600: internal error code, arguments: [6749], [3], [12596882], [49], [], [], [], []
Fri Jun  1 12:01:34 2012
Errors in file /opt/oracle/admin/oraapp/bdump/oraapp_j000_396.trc:
Fri Jun  1 13:01:06 2012
Errors in file /opt/oracle/admin/oraapp/bdump/oraapp_j000_13226.trc:
ORA-00600: internal error code, arguments: [6749], [3], [12596882], [49], [], [], [], []
Fri Jun  1 13:01:10 2012
Errors in file /opt/oracle/admin/oraapp/bdump/oraapp_j000_13226.trc:
Fri Jun  1 14:01:46 2012
Errors in file /opt/oracle/admin/oraapp/bdump/oraapp_j000_26691.trc:
ORA-00600: internal error code, arguments: [6749], [3], [12596882], [49], [], [], [], []
Fri Jun  1 14:01:51 2012
Errors in file /opt/oracle/admin/oraapp/bdump/oraapp_j000_26691.trc:
Fri Jun  1 15:01:21 2012
Errors in file /opt/oracle/admin/oraapp/bdump/oraapp_j000_7119.trc:
ORA-00600: internal error code, arguments: [6749], [3], [12596882], [49], [], [], [], []

查看trace日志

Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - 64bit Production
With the Partitioning, OLAP and Data Mining options
ORACLE_HOME = /opt/oracle/product/10/oraapp
System name:	Linux
Node name:	oracle2
Release:	2.6.18-92.el5
Version:	#1 SMP Tue Apr 29 13:16:15 EDT 2008
Machine:	x86_64
Instance name: oraapp
Redo thread mounted by this instance: 1
Oracle process number: 44
Unix process pid: 26691, image: oracle@oracle2 (J000)
*** ACTION NAME:(target 5) 2012-06-01 14:01:00.298
*** MODULE NAME:(Oracle Enterprise Manager.rollup) 2012-06-01 14:01:00.298
*** SERVICE NAME:(SYS$USERS) 2012-06-01 14:01:00.298
*** SESSION ID:(406.24103) 2012-06-01 14:01:00.298
Dumping current redo log in thread 1
DUMP OF REDO FROM FILE '/opt/oracle/oradata/oraapp/systable/redo03.log'
 Opcodes 11.*
 DBAs (file#, block#):
      (3, 13970)
 RBAs: 0x000000.00000000.0000 thru 0xffffffff.ffffffff.ffff
 SCNs: scn: 0x0000.00000000 thru scn: 0xffff.ffffffff
 Times: creation thru eternity
 FILE HEADER:
	Compatibility Vsn = 169869568=0xa200100
	Db ID=1462349529=0x5729aed9, Db Name='ORAAPP'
	Activation ID=1462334681=0x572974d9
	Control Seq=2614156=0x27e38c, File size=245760=0x3c000
	File Number=3, Blksiz=512, File Type=2 LOG
 descrip:"Thread 0001, Seq# 0000003963, SCN 0x0000129fc9df-0xffffffffffff"

猜测ORA-600[6749]部分参数

SQL> select  DBMS_UTILITY.data_block_address_file (12596882) "file#",
  2  DBMS_UTILITY.data_block_address_block (12596882) "block#"
  3  from dual;
     file#     block#
---------- ----------
         3      13970

ORA-600[6749][a][b]{c}
这里证明c表示rdba

根据dba查询对象

SQL> select * from dba_extents where 13970 between block_id and block_id + blocks and file_id=3;
OWNER      SEGMENT_NAME                     SEGMENT_TYPE
---------- ------------------------------- -------------------
SYSMAN        SYS_IOT_OVER_10448                 TABLE
SQL> select owner,iot_name from dba_tables where table_name = 'SYS_IOT_OVER_10448';
OWNER                          IOT_NAME
------------------------------ ------------------------------
SYSMAN                         MGMT_METRICS_RAW
SQL>  ANALYZE TABLE SYSMAN.MGMT_METRICS_RAW  VALIDATE STRUCTURE CASCADE;
Table analyzed.

按照常理ORA-00600[6749]错误是因为坏块或者表和索引数据不一致导致,通过ANALYZE可以检查出来.这里显示正常,那可能是其他原因导致,查询MOS果然发现是ORA-600 [6749] Occurring on SYSMAN.MGMT_METRICS_RAW [ID 467439.1]

解决方法

The following workaround may resolve the problem temporarily:
1. Ensure you have a good backup before proceeding.
2. Create a copy of the SYSMAN.MGMT_METRICS_RAW table:
SQL> create table SYSMAN.MGMT_METRICS_RAW_COPY
as select * from SYSMAN.MGMT_METRICS_RAW;
3. Truncate the table:
SQL> truncate table SYSMAN.MGMT_METRICS_RAW;
May need  to disable trigger: "sysman.raw_metrics_after_insert" before proceeding.
Re-enable after the insert.
4. Re-insert the rows:
SQL> insert into SYSMAN.MGMT_METRICS_RAW
select * from SYSMAN.MGMT_METRICS_RAW_COPY;
SQL> commit;
5. Drop the copy table:
SQL> drop table SYSMAN.MGMT_METRICS_RAW_COPY;