200T 数据库非归档无备份恢复

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:200T 数据库非归档无备份恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

一套近200T的,6个节点的RAC,由于存储管线链路不稳定,导致服务器经常性掉盘,引起asm 磁盘组频繁dismount/mount,数据库集群节点不停的重启,修复好链路问题之后,数据库启动报ORA-01113,ORA-01110
ORA-01113-ORA-01110


通过Oracle数据库异常恢复检查脚本(Oracle Database Recovery Check)脚本检测,发现有10个数据文件异常,无法正常恢复
20240814155122

该库比较大,有近200T,因此恢复需要各位谨慎(无法做现场备份,另外客户要求2天时间必须恢复好)
200t

由于数据库是非归档模式,该库无法通过应用归档日志来实现对这些文件进行恢复,对于这种情况,直接使用dbms_diskgroup把数据文件头拷贝到文件系统中,类似操作

SQL> @dbms_diskgroup_get_block.sql  +DATA/xifenfei.dbf 1 1 /tmp/xff/xifenfei.dbf.header

Parameter 1:
ASM_file_name (required)


Parameter 2:
block_to_extract (required)


Parameter 3
number_of_blocks_to_extract (required)


Parameter 4:
FileSystem_File_Name (required)

old  14:  v_AsmFilename := '&ASM_File_Name';
new  14:  v_AsmFilename := '+DATA/xifenfei.dbf';
old  15:  v_offstart := '&block_to_extract';
new  15:  v_offstart := '1';
old  16:  v_numblks := '&number_of_blocks_to_extract';
new  16:  v_numblks := '1';
old  17:  v_FsFilename := '&FileSystem_File_Name';
new  17:  v_FsFilename := '/tmp/xff/xifenfei.dbf.header';
File: +DATA/xifenfei.dbf
Type: 2 Data File
Size (in logical blocks): 3978880
Logical Block Size: 16384
Physical Block Size: 512

PL/SQL procedure successfully completed.

然后通过bbed修改相关scn

BBED> set filename 'xifenfei.dbf.header'
	FILENAME       	xifenfei.dbf.header

BBED> set blocksize 16384
	BLOCKSIZE      	16384

BBED> map
 File: xifenfei.dbf.header (0)
 Block: 1                                     Dba:0x00000000
------------------------------------------------------------
 Data File Header

 struct kcvfh, 860 bytes                    @0       

 ub4 tailchk                                @16380   


BBED> p kcvfh.kcvfhckp.kcvcpscn
struct kcvcpscn, 8 bytes                    @484     
   ub4 kscnbas                              @484      0xa8061324
   ub2 kscnwrp                              @488      0x0081

BBED> assign file 295 block 1 kcvfh.kcvfhckp.kcvcpscn = file 1 block 1 kcvfh.kcvfhckp.kcvcpscn;
struct kcvcpscn, 8 bytes                    @484     
   ub4 kscnbas                              @484      0xa8133e2b
   ub2 kscnwrp                              @488      0x0081

然后把修改的数据文件头写回到asm中

SQL> @dbms_diskgroup_cp_block_to_asm.sql  /tmp/xff/xifenfei.dbf.header  +DATA/xifenfei.dbf 1 1 

Parameter 1:
v_FsFileName (required)


Parameter 2:
v_AsmFileName (required)


Parameter 3
v_offstart (required)


Parameter 4
v_numblks (required)

old  16: v_FsFileName := '&v_FsFileName';
new  16: v_FsFileName := '/tmp/xff/xifenfei.dbf.header';
old  17: v_AsmFileName := '&v_AsmFileName';
new  17: v_AsmFileName := '+DATA/xifenfei.dbf';
old  18: v_offstart := '&v_offstart';
new  18: v_offstart := '1';
old  19:  v_numblks := '&v_numblks';
new  19:  v_numblks := '1';
File: +DATA/xifenfei.dbf
Type: 2 Data File
Size (in logical blocks): 3978880
Logical Block Size: 16384

PL/SQL procedure successfully completed.

查询文件头是否修改成功

[oracle@xff1 xff]$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.3.0 Production on Sat Aug 10 16:45:02 2024

Copyright (c) 1982, 2011, Oracle.  All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options

SQL> set numw 16
SQL> select CHECKPOINT_CHANGE# from v$datafile_header where file# in (1,295);

CHECKPOINT_CHANGE#
------------------
      556870614571
      556870614571

SQL> recover datafile 295;
Media recovery complete.

通过上述操作,确认bbed修改文件头成功,后续类似方法对其他9个文件进行修改,并打开数据库

SQL> recover database;
Media recovery complete.
SQL> alter database open;

Database altered.

alert日志提示

Sat Aug 10 16:46:11 2024
ALTER DATABASE RECOVER  datafile 295  
Media Recovery Start
Serial Media Recovery started
WARNING! Recovering data file 295 from a fuzzy backup. It might be an online
backup taken without entering the begin backup command.
Media Recovery Complete (xff1)
Completed: ALTER DATABASE RECOVER  datafile 295  
Sat Aug 10 16:46:39 2024
ALTER DATABASE RECOVER  database  
Media Recovery Start
 started logmerger process
Sat Aug 10 16:46:51 2024
WARNING! Recovering data file 1139 from a fuzzy backup. It might be an online
backup taken without entering the begin backup command.
WARNING! Recovering data file 1140 from a fuzzy backup. It might be an online
backup taken without entering the begin backup command.
WARNING! Recovering data file 1601 from a fuzzy backup. It might be an online
backup taken without entering the begin backup command.
WARNING! Recovering data file 1803 from a fuzzy backup. It might be an online
backup taken without entering the begin backup command.
WARNING! Recovering data file 1827 from a fuzzy backup. It might be an online
backup taken without entering the begin backup command.
WARNING! Recovering data file 1931 from a fuzzy backup. It might be an online
backup taken without entering the begin backup command.
WARNING! Recovering data file 2185 from a fuzzy backup. It might be an online
backup taken without entering the begin backup command.
WARNING! Recovering data file 2473 from a fuzzy backup. It might be an online
backup taken without entering the begin backup command.
WARNING! Recovering data file 2616 from a fuzzy backup. It might be an online
backup taken without entering the begin backup command.
Sat Aug 10 16:46:54 2024
Parallel Media Recovery started with 64 slaves
Media Recovery Complete (xff1)
Completed: ALTER DATABASE RECOVER  database  
Sat Aug 10 17:19:58 2024
alter database open
This instance was first to open
Sat Aug 10 17:19:58 2024
SUCCESS: diskgroup DATA was mounted
Sat Aug 10 17:19:58 2024
NOTE: dependency between database xff and diskgroup resource ora.DATA.dg is established
Sat Aug 10 17:20:10 2024
Picked broadcast on commit scheme to generate SCNs
Sat Aug 10 17:20:10 2024
SUCCESS: diskgroup REDO was mounted
Sat Aug 10 17:20:10 2024
NOTE: dependency between database xff and diskgroup resource ora.REDO.dg is established
Thread 1 opened at log sequence 124958
  Current log# 14 seq# 124958 mem# 0: +REDO/xff/log2.ora
Successful open of redo thread 1
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Sat Aug 10 17:20:14 2024
SMON: enabling cache recovery
Instance recovery: looking for dead threads
Instance recovery: lock domain invalid but no dead threads
[33770] Successfully onlined Undo Tablespace 2.
Undo initialization finished serial:0 start:261099864 end:261100854 diff:990 (9 seconds)
Verifying file header compatibility for 11g tablespace encryption..
Verifying 11g file header compatibility for tablespace encryption completed
SMON: enabling tx recovery
Database Characterset is ZHS16GBK
Sat Aug 10 17:20:16 2024
minact-scn: Inst 1 is now the master inc#:2 mmon proc-id:33650 status:0x7
minact-scn status: grec-scn:0x0000.00000000 gmin-scn:0x0000.00000000 gcalc-scn:0x0000.00000000
Starting background process GTX0
Sat Aug 10 17:20:16 2024
GTX0 started with pid=45, OS id=34119 
Starting background process RCBG
Sat Aug 10 17:20:16 2024
RCBG started with pid=46, OS id=34121 
replication_dependency_tracking turned off (no async multimaster replication found)
Starting background process QMNC
Sat Aug 10 17:20:16 2024
QMNC started with pid=47, OS id=34134 
Starting background process SMCO
Completed: alter database open

其他集群其他节点数据库,一切正常
20240814162201


检查数据字典一致性

SQL> @hcheck.sql
HCheck Version 07MAY18 on 10-AUG-2024 18:24:49
----------------------------------------------
Catalog Version 11.2.0.3.0 (1102000300)
db_name: XFF

				   Catalog	 Fixed
Procedure Name			   Version    Vs Release    Timestamp
Result
------------------------------ ... ---------- -- ---------- --------------
------
.- LobNotInObj		       ... 1102000300 <=  *All Rel* 08/10 18:24:49 PASS
.- MissingOIDOnObjCol	       ... 1102000300 <=  *All Rel* 08/10 18:24:49 PASS
.- SourceNotInObj	       ... 1102000300 <=  *All Rel* 08/10 18:24:49 PASS
.- OversizedFiles	       ... 1102000300 <=  *All Rel* 08/10 18:24:50 PASS
.- PoorDefaultStorage	       ... 1102000300 <=  *All Rel* 08/10 18:24:50 PASS
.- PoorStorage		       ... 1102000300 <=  *All Rel* 08/10 18:24:50 PASS
.- TabPartCountMismatch        ... 1102000300 <=  *All Rel* 08/10 18:24:50 PASS
.- OrphanedTabComPart	       ... 1102000300 <=  *All Rel* 08/10 18:24:50 PASS
.- MissingSum$		       ... 1102000300 <=  *All Rel* 08/10 18:24:50 PASS
.- MissingDir$		       ... 1102000300 <=  *All Rel* 08/10 18:24:50 PASS
.- DuplicateDataobj	       ... 1102000300 <=  *All Rel* 08/10 18:24:50 PASS
.- ObjSynMissing	       ... 1102000300 <=  *All Rel* 08/10 18:24:51 PASS
.- ObjSeqMissing	       ... 1102000300 <=  *All Rel* 08/10 18:24:51 PASS
.- OrphanedUndo 	       ... 1102000300 <=  *All Rel* 08/10 18:24:51 PASS
.- OrphanedIndex	       ... 1102000300 <=  *All Rel* 08/10 18:24:51 PASS
.- OrphanedIndexPartition      ... 1102000300 <=  *All Rel* 08/10 18:24:51 PASS
.- OrphanedIndexSubPartition   ... 1102000300 <=  *All Rel* 08/10 18:24:52 PASS
.- OrphanedTable	       ... 1102000300 <=  *All Rel* 08/10 18:24:52 PASS
.- OrphanedTablePartition      ... 1102000300 <=  *All Rel* 08/10 18:24:52 PASS
.- OrphanedTableSubPartition   ... 1102000300 <=  *All Rel* 08/10 18:24:52 PASS
.- MissingPartCol	       ... 1102000300 <=  *All Rel* 08/10 18:24:52 PASS
.- OrphanedSeg$ 	       ... 1102000300 <=  *All Rel* 08/10 18:24:52 PASS
.- OrphanedIndPartObj#	       ... 1102000300 <=  *All Rel* 08/10 18:24:52 PASS
.- DuplicateBlockUse	       ... 1102000300 <=  *All Rel* 08/10 18:24:52 PASS
.- FetUet		       ... 1102000300 <=  *All Rel* 08/10 18:24:52 PASS
.- Uet0Check		       ... 1102000300 <=  *All Rel* 08/10 18:24:52 PASS
.- SeglessUET		       ... 1102000300 <=  *All Rel* 08/10 18:24:52 PASS
.- BadInd$		       ... 1102000300 <=  *All Rel* 08/10 18:24:52 PASS
.- BadTab$		       ... 1102000300 <=  *All Rel* 08/10 18:24:53 PASS
.- BadIcolDepCnt	       ... 1102000300 <=  *All Rel* 08/10 18:24:53 PASS
.- ObjIndDobj		       ... 1102000300 <=  *All Rel* 08/10 18:24:53 PASS
.- TrgAfterUpgrade	       ... 1102000300 <=  *All Rel* 08/10 18:24:53 PASS
.- ObjType0		       ... 1102000300 <=  *All Rel* 08/10 18:24:53 PASS
.- BadOwner		       ... 1102000300 <=  *All Rel* 08/10 18:24:53 PASS
.- StmtAuditOnCommit	       ... 1102000300 <=  *All Rel* 08/10 18:24:53 PASS
.- BadPublicObjects	       ... 1102000300 <=  *All Rel* 08/10 18:24:53 PASS
.- BadSegFreelist	       ... 1102000300 <=  *All Rel* 08/10 18:24:53 PASS
.- BadDepends		       ... 1102000300 <=  *All Rel* 08/10 18:24:53 PASS
.- CheckDual		       ... 1102000300 <=  *All Rel* 08/10 18:24:53 PASS
.- ObjectNames		       ... 1102000300 <=  *All Rel* 08/10 18:24:53 PASS
.- BadCboHiLo		       ... 1102000300 <=  *All Rel* 08/10 18:24:54 PASS
.- ChkIotTs		       ... 1102000300 <=  *All Rel* 08/10 18:24:54 PASS
.- NoSegmentIndex	       ... 1102000300 <=  *All Rel* 08/10 18:24:54 PASS
.- BadNextObject	       ... 1102000300 <=  *All Rel* 08/10 18:24:54 PASS
.- DroppedROTS		       ... 1102000300 <=  *All Rel* 08/10 18:24:54 PASS
.- FilBlkZero		       ... 1102000300 <=  *All Rel* 08/10 18:24:54 PASS
.- DbmsSchemaCopy	       ... 1102000300 <=  *All Rel* 08/10 18:24:54 PASS
.- OrphanedObjError	       ... 1102000300 >  1102000000 08/10 18:24:54 PASS
.- ObjNotLob		       ... 1102000300 <=  *All Rel* 08/10 18:24:54 PASS
.- MaxControlfSeq	       ... 1102000300 <=  *All Rel* 08/10 18:24:55 PASS
.- SegNotInDeferredStg	       ... 1102000300 >  1102000000 08/10 18:25:18 PASS
.- SystemNotRfile1	       ... 1102000300 >   902000000 08/10 18:25:18 PASS
.- DictOwnNonDefaultSYSTEM     ... 1102000300 <=  *All Rel* 08/10 18:25:18 PASS
.- OrphanTrigger	       ... 1102000300 <=  *All Rel* 08/10 18:25:18 PASS
.- ObjNotTrigger	       ... 1102000300 <=  *All Rel* 08/10 18:25:18 PASS
---------------------------------------
10-AUG-2024 18:25:18  Elapsed: 29 secs
---------------------------------------
Found 0 potential problem(s) and 0 warning(s)

PL/SQL procedure successfully completed.

Statement processed.

Complete output is in trace file:
/u01/app/oracle/diag/rdbms/xff/xff1/trace/xff1_ora_71148_HCHECK.trc

运气不错,数据字典本身没有损坏,业务直接运行,一切正常(主要原因是在光纤链路不稳定的情况下,客户已经没有往库中写入数据)

ORA-01113 ORA-01110错误不一定都要Oracle Recovery Tools解决

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:ORA-01113 ORA-01110错误不一定都要Oracle Recovery Tools解决

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

有客户联系我,说数据库故障经过他们一系列恢复之后,现在open库报ORA-01113 ORA-01110错误,咨询我Oracle Recovery Tools恢复工具是否可以解决该问题
ORA-01113-ORA-01110


alert日志报错

Sat Dec 16 09:10:45 2023
alter database open
Errors in file f:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_ora_8948.trc:
ORA-01113: 文件 1 需要介质恢复
ORA-01110: 数据文件 1: 'F:\APP\ADMINISTRATOR\ORADATA\ORCL\SYSTEM01.DBF'
ORA-1113 signalled during: alter database open...

其实这个错误不是这个库不能打开的本质,一般遇到这种错误的客户,都是强制拉过库,在拉库的过程中失败,才是导致库不能open的本质原因,比如通过查看相关日志发现拉库的时候报ORA-01555 ORA-00704错

Thu Dec 14 19:05:45 2023
alter database open resetlogs
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
RESETLOGS after incomplete recovery UNTIL CHANGE 13289960075
Resetting resetlogs activation ID 1596978603 (0x5f2ff5ab)
Thu Dec 14 19:05:45 2023
Setting recovery target incarnation to 2
Thu Dec 14 19:05:45 2023
Assigning activation ID 1683369006 (0x64562c2e)
Thread 1 opened at log sequence 1
  Current log# 1 seq# 1 mem# 0: F:\APP\ADMINISTRATOR\ORADATA\ORCL\REDO01.LOG
Successful open of redo thread 1
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Thu Dec 14 19:05:45 2023
SMON: enabling cache recovery
ORA-01555 caused by SQL statement below (SQL ID: 4krwuz0ctqxdt, SCN: 0x0003.1824b292):
select ctime, mtime, stime from obj$ where obj# = :1
Errors in file f:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_ora_7736.trc:
ORA-00704: 引导程序进程失败
ORA-00704: 引导程序进程失败
ORA-00604: 递归 SQL 级别 1 出现错误
ORA-01555: 快照过旧: 回退段号 4 (名称为 "_SYSSMU4_1451910634$") 过小
Errors in file f:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_ora_7736.trc:
ORA-00704: 引导程序进程失败
ORA-00704: 引导程序进程失败
ORA-00604: 递归 SQL 级别 1 出现错误
ORA-01555: 快照过旧: 回退段号 4 (名称为 "_SYSSMU4_1451910634$") 过小
Error 704 happened during db open, shutting down database
USER (ospid: 7736): terminating the instance due to error 704
Instance terminated by USER, pid = 7736
ORA-1092 signalled during: alter database open resetlogs...
opiodr aborting process unknown ospid (7736) as a result of ORA-1092
Thu Dec 14 19:05:51 2023
ORA-1092 : opitsk aborting process

比如还有客户咨询也是ORA-01113 ORA-01110错误,希望通过Oracle Recovery Tools工具来解决该问题,通过咨询客户确认他们其实是在前期恢复中报ORA-600 2662错误
ORA-600-2662


对于一般通过强制拉库启动过程中报ORA-600错误,后面恢复中报ORA-01113 ORA-01110错误无法正常open的库,一般不用Oracle Recovery Tools工具来解决,通过一些恢复技巧就可以解决该问题.如果无法自行解决,可以联系我们进行技术支持,最大限度抢救和数据,减少损失
电话/微信:17813235971    Q Q:107644445QQ咨询惜分飞    E-Mail:dba@xifenfei.com

一键恢复ORA-01113 ORA-01110—Oracle Recovery Tools

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:一键恢复ORA-01113 ORA-01110—Oracle Recovery Tools

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

一般由于归档日志丢失或者非归档库可能在数据库启动的时候出现类似如下错误

SQL> startup
ORACLE instance started.

Total System Global Area 2137886720 bytes
Fixed Size                  2282960 bytes
Variable Size             587205168 bytes
Database Buffers         1543503872 bytes
Redo Buffers                4894720 bytes
Database mounted.
ORA-01113: file 1 needs media recovery
ORA-01110: data file 1: 'F:\ORADATA\XIFENFEI\SYSTEM01.DBF'

主要是由于数据文件不一致需要比较老的日志,但是日志不存在从而导致该问题
1


以前是通过一系列的方法强制open库,以前类似文章:
12c ORA-01113 ORA-01110 恢复
分享一次ORA-01113 ORA-01110故障处理过程

现在可以通过Oracle Recovery Tools工具一键解决
2
3


尝试open数据库

SQL> recover database;
Media recovery complete.
SQL> alter database open;

Database altered.