记录一次ORA-01200完美恢复

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:记录一次ORA-01200完美恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

客户虚拟化平台断电,导致oracle其数据库启动ORA-01200错误

SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-01122: database file 1 failed verification check
ORA-01110: data file 1: '/oradata/orcl/system01.dbf'
ORA-01200: actual file size of 1122560 is smaller than correct size of 1131520 blocks

对应的alert日志如下

Thu Jan 11 11:36:48 2024
ALTER DATABASE   MOUNT
Successful mount of redo thread 1, with mount id 1685778896
Database mounted in Exclusive Mode
Lost write protection disabled
Completed: ALTER DATABASE   MOUNT
Thu Jan 11 11:36:52 2024
ALTER DATABASE OPEN
Read of datafile '/oradata/orcl/system01.dbf' (fno 1) header failed with ORA-01200
Rereading datafile 1 header failed with ORA-01200
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_10847.trc:
ORA-01122: database file 1 failed verification check
ORA-01110: data file 1: '/oradata/orcl/system01.dbf'
ORA-01200: actual file size of 1122560 is smaller than correct size of 1131520 blocks
ORA-1122 signalled during: ALTER DATABASE OPEN...
Thu Jan 11 11:36:53 2024
Checker run found 1 new persistent data failures
Thu Jan 11 11:41:55 2024
alter database open
Read of datafile '/oradata/orcl/system01.dbf' (fno 1) header failed with ORA-01200
Rereading datafile 1 header failed with ORA-01200
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_12550.trc:
ORA-01122: database file 1 failed verification check
ORA-01110: data file 1: '/oradata/orcl/system01.dbf'
ORA-01200: actual file size of 1122560 is smaller than correct size of 1131520 blocks
ORA-1122 signalled during: alter database open...

报错比较明显system01.dbf文件本来大小应该为1131521个block,但是实际上只有1122561个block,因此无法正常启动,通过修改数据文件欺骗数据库
20240112123849


然后对异常的system文件进行处理,把人工构造的部分除掉

SQL> alter database datafile 1 resize 8770M;

Database altered.

rman检测system文件正常
20240112124307


数据库恢复完成,数据完美恢复(0丢失,不用逻辑迁移),该库可以继续使用,以前有过类似case:
bbed处理ORA-01200故障
ORA-01122 ORA-01200故障处理
ORA-1200/ORA-1207数据库恢复

ORA-1200/ORA-1207数据库恢复

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:ORA-1200/ORA-1207数据库恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

由于系统性能问题或者底层io问题,数据库alert日志报一下控制文件损坏错误然后crash掉

Mon Nov 13 08:06:44 2023
Thread 1 advanced to log sequence 12100 (LGWR switch)
  Current log# 1 seq# 12100 mem# 0: /u01/oracle/oradata/xifenfei/redo01.log
Mon Nov 13 09:23:59 2023
********************* ATTENTION: ******************** 
 The controlfile header block returned by the OS
 has a sequence number that is too old. 
 The controlfile might be corrupted.
 PLEASE DO NOT ATTEMPT TO START UP THE INSTANCE 
 without following the steps below.
 RE-STARTING THE INSTANCE CAN CAUSE SERIOUS DAMAGE 
 TO THE DATABASE, if the controlfile is truly corrupted.
 In order to re-start the instance safely, 
 please do the following:
 (1) Save all copies of the controlfile for later 
     analysis and contact your OS vendor and Oracle support.
 (2) Mount the instance and issue: 
     ALTER DATABASE BACKUP CONTROLFILE TO TRACE;
 (3) Unmount the instance. 
 (4) Use the script in the trace file to
     RE-CREATE THE CONTROLFILE and open the database. 
*****************************************************
USER (ospid: 17064): terminating the instance
Mon Nov 13 09:24:00 2023
System state dump requested by (instance=1, osid=17064), summary=[abnormal instance termination].

重启数据库报ORA-01122 ORA-01110 ORA-01207错误

Mon Nov 13 10:11:21 2023
ALTER DATABASE OPEN
Errors in file /u01/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_ora_25824.trc:
ORA-01122: database file 1 failed verification check
ORA-01110: data file 1: '/u01/oracle/oradata/xifenfei/system01.dbf'
ORA-01207: file is more recent than control file - old control file
ORA-1122 signalled during: ALTER DATABASE OPEN...

处理好上述错误之后遭遇ORA-01122 ORA-01110 ORA-01200,类似文章:
bbed处理ORA-01200故障
ORA-01122 ORA-01200故障处理

Mon Nov 13 10:51:48 2023
alter database open
Read of datafile '/u01/oracle/oradata/xifenfei/sysaux01.dbf' (fno 2) header failed with ORA-01200
Rereading datafile 2 header failed with ORA-01200
Errors in file /u01/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_ora_24148.trc:
ORA-01122: database file 2 failed verification check
ORA-01110: data file 2: '/u01/oracle/oradata/xifenfei/sysaux01.dbf'
ORA-01200: actual file size of 2860800 is smaller than correct size of 2867200 blocks
ORA-1122 signalled during: alter database open...

解决上述错误之后,尝试open库报ORA-00314 ORA-00312之类错误

Mon Nov 13 18:00:43 2023
alter database open
Beginning crash recovery of 1 threads
 parallel recovery started with 15 processes
Started redo scan
Completed redo scan
 read 61894 KB redo, 589 data blocks need recovery
Started redo application at
 Thread 1: logseq 12100, block 112760
Recovery of Online Redo Log: Thread 1 Group 1 Seq 12100 Reading mem 0
  Mem# 0: /u01/oracle/oradata/xifenfei/redo01.log
Completed redo application of 1.20MB
Mon Nov 13 18:00:44 2023
Hex dump of (file 2, block 39078) in trace file /u01/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_p011_27469.trc
Reading datafile '/u01/oracle/oradata/xifenfei/sysaux01.dbf' for corruption at rdba: 0x008098a6 (file 2, block 39078)
Reread (file 2, block 39078) found same corrupt data (logically corrupt)
RECOVERY OF THREAD 1 STUCK AT BLOCK 39078 OF FILE 2
Mon Nov 13 18:00:44 2023
Exception [type: SIGSEGV, Address not mapped to object][ADDR:0xC][PC:0x95FB838, kdxlin()+4946][flags: 0x0, count: 1]
Mon Nov 13 18:00:44 2023
Exception [type: SIGSEGV, Address not mapped to object][ADDR:0xC][PC:0x95FB4DE, kdxlin()+4088][flags: 0x0, count: 1]
Errors in file /u01/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_p011_27469.trc:
ORA-00314: log 2 of thread 1, expected sequence# 12093 doesn't match 12085
ORA-00312: online log 2 thread 1: '/u01/oracle/oradata/xifenfei/redo02.log'
Errors in file /u01/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_p011_27469.trc:
ORA-00314: log 3 of thread 1, expected sequence# 12096 doesn't match 12080
ORA-00312: online log 3 thread 1: '/u01/oracle/oradata/xifenfei/redo03.log'
ORA-00314: log 2 of thread 1, expected sequence# 12093 doesn't match 12085
ORA-00312: online log 2 thread 1: '/u01/oracle/oradata/xifenfei/redo02.log'

后面继续处理遇到类似这样错误

ALTER DATABASE RECOVER    CANCEL  
Errors in file /u01/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_pr00_31110.trc:
ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1: '/u01/oracle/oradata/xifenfei/system01.dbf'
ORA-1547 signalled during: ALTER DATABASE RECOVER    CANCEL  ...
ALTER DATABASE RECOVER CANCEL 
ORA-1112 signalled during: ALTER DATABASE RECOVER CANCEL ...
Mon Nov 13 19:06:28 2023
alter database open resetlogs
ORA-1194 signalled during: alter database open resetlogs...

最后确认其他数据文件均可recover 成功,只有file 2 无法正常recover

SQL> recover datafile 1;
Media recovery complete.
SQL> recover datafile 2;
ORA-00283: recovery session canceled due to errors
ORA-00600: internal error code, arguments: [3020], [2], [950840], [9339448],
[], [], [], [], [], [], [], []
ORA-10567: Redo is inconsistent with data block (file# 2, block# 950840, file
offset is 3494313984 bytes)
ORA-10564: tablespace SYSAUX
ORA-01110: data file 2: '/u01/oracle/oradata/xifenfei/sysaux01.dbf'
ORA-10560: block type 'FIRST LEVEL BITMAP BLOCK'
SQL> recover datafile 3;
Media recovery complete.
SQL> recover datafile 4;
Media recovery complete.
SQL> recover datafile 5;
Media recovery complete.
SQL> recover datafile 6;
Media recovery complete.
SQL> recover datafile 7;
Media recovery complete.

SQL> recover  datafile 2 allow 1 corruption;
ORA-00283: recovery session canceled due to errors
ORA-00600: internal error code, arguments: [3020], [2], [2410240], [10798848],
[], [], [], [], [], [], [], []
ORA-10567: Redo is inconsistent with data block (file# 2, block# 2410240, file
offset is 2564816896 bytes)
ORA-10564: tablespace SYSAUX
ORA-01110: data file 2: '/u01/oracle/oradata/xifenfei/sysaux01.dbf'
ORA-10560: block type 'FIRST LEVEL BITMAP BLOCK'

通过bbed修改文件头,直接open数据库成功,并协助客户顺利导出数据
参考类似文章:
使用bbed修复损坏datafile header
使用bbed让rac中的sysaux数据文件online

硬件故障恢复出文件之后数据库故障处理

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:硬件故障恢复出文件之后数据库故障处理

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

客户那边硬件故障(raid损坏磁盘超过了极限,导致raid offline),通过硬件恢复出来数据文件,然后尝试自行恢复,我接手的时候大量数据文件resetlogs scn异常.
wrong_resetlogs


重建控制文件报错

WARNING: Default Temporary Tablespace not specified in CREATE DATABASE command
Default Temporary Tablespace will be necessary for a locally managed database in future release
Errors in file /home/oracle/app/diag/rdbms/orcl/orcl/trace/orcl_ora_5949.trc:
ORA-01189: file is from a different RESETLOGS than previous files
ORA-01110: data file 153: '/home/oracle/oracledata/orcl/sysaux02.dbf'
ORA-1503 signalled during: CREATE CONTROLFILE REUSE DATABASE "ORCL" NORESETLOGS  ARCHIVELOG

通过修改文件头然后重建控制文件,可以通过bbed,或者我的小工具Oracle Recovery Tools
bbed解决ORA-01190
Oracle Recovery Tools 解决ORA-01190 ORA-01248等故障
重建control遗漏数据文件,reseltogs报ORA-1555错误处理
然后继续重建ctl发现以下错误

WARNING: Default Temporary Tablespace not specified in CREATE DATABASE command
Default Temporary Tablespace will be necessary for a locally managed database in future release
Errors in file /home/oracle/app/diag/rdbms/orcl/orcl/trace/orcl_ora_34075.trc:
ORA-01200: actual file size of 2015415 is smaller than correct size of 2944000 blocks
ORA-01110: data file 178: '/home/oracle/oracledata/orcl/xifenfei20_10.dbf'
ORA-1503 signalled during: CREATE CONTROLFILE REUSE DATABASE "ORCL" NORESETLOGS  NOARCHIVELOG

通过对比发现是由于客户上传恢复文件异常导致
20230713002257


重新上传文件,然后修改文件头,该问题解决,重建ctl成功,提个醒:对于这种硬件恢复之后文件上次到服务器上进行恢复的,一定要确认上传文件和原文件一致,不然做无用功或者恢复效果差很多
尝试open数据库报ORA-600 2662错误

SQL> alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-00603: ORACLE server session terminated by fatal error
ORA-00600: internal error code, arguments: [2662], [5], [1653389530], [5],
[1653496702], [12583040], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [2662], [5], [1653389529], [5],
[1653496702], [12583040], [], [], [], [], [], []
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [2662], [5], [1653389527], [5],
[1653496702], [12583040], [], [], [], [], [], []
Process ID: 4710
Session ID: 1847 Serial number: 3

这个错误比较简单,一般是scn问题,有过大量的处理经验案例:
使用bbed解决ORA-00600[2662]
硬件故障导致ORA-600 2662错误处理
Patch SCN工具快速解决ORA-600 2662问题
解决好该问题之后,数据库open成功,实现了最大限度抢救数据.

ORA-01122 ORA-01200故障处理

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:ORA-01122 ORA-01200故障处理

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

由于某种原因客户的数据库启动报ORA-01122 ORA-01200错误
ORA-01200


让客户把system01.dbf文件发给我进行分析,发现system01.dbf文件大于32G(在8k的blocksize库中,默认情况system01.dbf文件不会超过32G),这个明显异常
system01.dbf

检测坏块情况发现4096000之后的block全部为全0块
20230704165111

通过bbed分析文件头记录文件大小
20230704165343

通过bbed修改合适的值,并且把文件截取到适当大小,提供system文件给客户,直接启动库成功,实现数据库完美恢复
20230704165533

通过设置文件头大小和截断合适大小实现本次数据库恢复,以前有过类似恢复:
bbed处理ORA-01200故障

bbed处理ORA-01200故障

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:bbed处理ORA-01200故障

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

一个朋友的测试库出现ORA-01200错误,正好周末比较空闲,随手帮他使用bbed进行了恢复,给广大朋友提供一种解决该问题的方法
数据库启动报错

C:\Users\Administrator>sqlplus /nolog
SQL*Plus: Release 11.1.0.6.0 - Production on 星期日 5月 12 22:09:11 2013
Copyright (c) 1982, 2007, Oracle.  All rights reserved.
SQL> connect/as sysdba
已连接到空闲例程。
SQL> startup force
ORACLE 例程已经启动。
Total System Global Area 1071333376 bytes
Fixed Size                  1334380 bytes
Variable Size             318768020 bytes
Database Buffers          746586112 bytes
Redo Buffers                4644864 bytes
数据库装载完毕。
ORA-01122: 数据库文件 1 验证失败
ORA-01110: 数据文件 1: 'D:\APP\ADMINISTRATOR\ORADATA\ORCL\SYSTEM01.DBF'
ORA-01200: 87946 的实际文件大小小于 88320 块的正确大小

这里的错误很明显是因为file 1的数据文件头记录block大小为88320个block,而该数据文件的实际大小只有87946个block,所以出现该问题.

dbv检测文件

D:\app\Administrator\oradata\orcl>dbv file=SYSTEM01.DBF
DBVERIFY: Release 11.1.0.6.0 - Production on 星期日 5月 12 22:30:29 2013
Copyright (c) 1982, 2007, Oracle.  All rights reserved.
DBVERIFY - 开始验证: FILE = SYSTEM01.DBF
DBVERIFY - 验证完成
检查的页总数: 87040
处理的页总数 (数据): 62870
失败的页总数 (数据): 0
处理的页总数 (索引): 11055
失败的页总数 (索引): 0
处理的页总数 (其它): 2437
处理的总页数 (段)  : 0
失败的总页数 (段)  : 0
空的页总数: 10678
标记为损坏的总页数: 0
流入的页总数: 0
加密的总页数        : 0
最高块 SCN            : 980055 (0.980055)

检查发现该数据文件未发现坏块,减小了该数据文件通过bbed恢复异常的风险,数据库最怕就是system中出现很多坏块

使用bbed修改kccfhfsz
因为win的bbed问题,所以拷贝到我的电脑上进行修改

C:\Users\XIFENFEI\Desktop\temp>bbed filename=system01.dbf blocksize=8192
Password:
BBED: Release 2.0.0.0.0 - Limited Production on Sun May 12 23:27:26 2013
Copyright (c) 1982, 2002, Oracle Corporation.  All rights reserved.
************* !!! For Oracle Internal Use only !!! ***************
BBED> set block 2
        BLOCK#          2
--从一台机器中拷贝到另外的机器,实际中的block可能发生改变,因为含block 0
BBED> map
 File: system01.dbf (0)
 Block: 2                                     Dba:0x00000000
------------------------------------------------------------
 Data File Header
 struct kcvfh, 360 bytes                    @0
 ub4 tailchk                                @8188
BBED> p kcvfhhdr.kccfhfsz
ub4 kccfhfsz                                @44       0x0001578a
--通过ORA-01200错误报出来的文件头记录大小88320实际就是0x0001578a
BBED> set mode edit
        MODE            Edit
BBED> set count 32
        COUNT           32
BBED> d
 File: system01.dbf (0)
 Block: 2                Offsets:   44 to   75           Dba:0x00000000
------------------------------------------------------------------------
00590100 00200000 01000300 00000000 00000000 00000000 00000000 00000000
 <32 bytes per line>
BBED> m /x 8A570100
 File: system01.dbf (0)
 Block: 2                Offsets:   44 to   75           Dba:0x00000000
------------------------------------------------------------------------
 8a570100 00200000 01000300 00000000 00000000 00000000 00000000 00000000
 <32 bytes per line>
--通过ORA-01200错误报出来的数据文件实际大小,来修改该文件头的kcvfhhdr.kccfhfsz值,也可以通过文件实际大小计算出来
BBED> p kcvfhhdr.kccfhfsz
ub4 kccfhfsz                                @44       0x0001578a
BBED> sum apply
Check value for File 0, Block 2:
current = 0x0f79, required = 0x0f79
BBED> verify
DBVERIFY - Verification starting
FILE = system01.dbf
BLOCK = 1
DBVERIFY - Verification complete
Total Blocks Examined         : 1
Total Blocks Processed (Data) : 0
Total Blocks Failing   (Data) : 0
Total Blocks Processed (Index): 0
Total Blocks Failing   (Index): 0
Total Blocks Empty            : 0
Total Blocks Marked Corrupt   : 0
Total Blocks Influx           : 0

打开数据库

SQL> select open_mode from v$database;
OPEN_MODE
----------
MOUNTED
SQL> alter database open;
数据库已更改。