Oracle recovery after NTFS MFT corruption (NTFS file system failure)

Contact: phone/WeChat (+86 17813235971), QQ (107644445)


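Author: 惜分飞 © All rights reserved [May not be reproduced in any form without the author's consent; the author reserves the right to pursue legal action.]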

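The customer runs a virtualized environment. After a power outage, starting the database raised ORA-01157. At the operating system level the file was present, but dbv reported it as inaccessible.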
[Image: ora-01157]


This looked like file system corruption, so I tried copying the file to another disk.

The operating system event log confirmed that the MFT of the NTFS file system was corrupted.
[Image: mft]

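Given this, I tried to recover the file with a file system recovery tool, which warned that the size of the recovered file did not match the size recorded in the metadata.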

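Comparing the recovered size with the file's own recorded size gives 7811899392 - 7791460352 = 20439040 bytes, which is almost 20 MB (in other words, the recovered data file is about 20 MB short). The database alert log confirms that the file had recently been extended by exactly 20 MB (the data file was added with a 20 MB autoextend increment):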
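The size comparison can be checked with a few lines of Python. This is only a sketch: the two byte counts come from the text, while the 8 KB block size is an assumption (the post does not state db_block_size) and the function name is made up for illustration.

```python
# Sizes quoted in the post, in bytes.
METADATA_SIZE = 7811899392   # size recorded in the file system metadata
RECOVERED_SIZE = 7791460352  # size of the file produced by the recovery tool

BLOCK_SIZE = 8192            # assumption: 8 KB Oracle block size (not stated in the post)
MIB = 1024 * 1024


def missing_tail(metadata_size, recovered_size, block_size=BLOCK_SIZE):
    """Return (missing_bytes, missing_mib, missing_blocks) for a truncated file."""
    missing = metadata_size - recovered_size
    if missing < 0:
        raise ValueError("recovered file is larger than the recorded size")
    return missing, missing / MIB, missing // block_size


if __name__ == "__main__":
    missing, mib, blocks = missing_tail(METADATA_SIZE, RECOVERED_SIZE)
    print(f"missing {missing} bytes = {mib:.2f} MiB = {blocks} blocks")
```

With the figures above this reports 20439040 missing bytes, about 19.49 MiB, or 2495 blocks under the 8 KB assumption.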

2023-08-11T11:29:21.397236+08:00
ALTER TABLESPACE "HSHIS" ADD DATAFILE
'D:\APP\ADMINISTRATOR\ORADATA\HIS\HSHIS01.DBF' SIZE 10M AUTOEXTEND ON NEXT 20M MAXSIZE 8001M
Completed: ALTER TABLESPACE "HSHIS" ADD DATAFILE
'D:\APP\ADMINISTRATOR\ORADATA\HIS\HSHIS01.DBF' SIZE 10M AUTOEXTEND ON NEXT 20M MAXSIZE 8001M

2024-10-09T00:18:31.058537+08:00
Resize operation completed for file# 66, old size 7608320K, new size 7628800K
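The resize entry above can be tied back to the file sizes with a small sketch. The regular expression only targets the exact message format shown in this log. Two things are assumptions rather than facts from the post: the 8 KB block size, and the idea that the file on disk is one block larger than the size Oracle reports (an extra header block). The helper names are hypothetical.

```python
import re
from typing import Optional, Tuple

# Matches the alert log message format shown above.
RESIZE_RE = re.compile(
    r"Resize operation completed for file# (\d+), old size (\d+)K, new size (\d+)K"
)


def parse_resize(line: str) -> Optional[Tuple[int, int, int]]:
    """Return (file_no, old_kb, new_kb) from a resize line, or None if no match."""
    m = RESIZE_RE.search(line)
    if m is None:
        return None
    file_no, old_kb, new_kb = (int(g) for g in m.groups())
    return file_no, old_kb, new_kb


def expected_os_size(size_kb: int, block_size: int = 8192) -> int:
    """Assumed on-disk size: Oracle size plus one extra header block (assumption)."""
    return size_kb * 1024 + block_size


if __name__ == "__main__":
    line = "Resize operation completed for file# 66, old size 7608320K, new size 7628800K"
    file_no, old_kb, new_kb = parse_resize(line)
    growth_kb = new_kb - old_kb
    print(f"file# {file_no} grew by {growth_kb}K ({growth_kb / 1024} MiB)")
    print("expected on-disk size after resize:", expected_os_size(new_kb))
```

Under those assumptions the last resize is exactly 20480K (20 MiB), and the expected on-disk size after the resize works out to 7811899392 bytes, the same figure as the recorded file size quoted earlier.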

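Analysis of the file's underlying blocks confirmed that the lost blocks are exactly the final 20 MB (the rdba of every preceding block in the data file is correct). For this kind of failure, the file can be recovered by padding the tail of the data file so that the database accepts it (any business data that had been written to the final 20 MB may be lost). After repairing the file I tried to open the database, but the outcome was not encouraging: the redo was damaged as well.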
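The block check and the tail padding described above might look roughly like the following sketch. It is not the author's tool. The assumptions are: an 8 KB block size, a little-endian block header with the 4-byte rdba at offset 4 whose low 22 bits hold the block number, and zero bytes as the padding content (the post does not say what the padding contained). All function names are hypothetical. Note that all-zero blocks are reported as mismatches by this check, and that any such operation should only ever be run on a copy of the file.

```python
import os
import struct

BLOCK_SIZE = 8192  # assumption: 8 KB Oracle block size


def block_number_from_rdba(rdba: int) -> int:
    """Low 22 bits of the rdba hold the block number (assumed layout)."""
    return rdba & 0x3FFFFF


def first_bad_block(path: str, block_size: int = BLOCK_SIZE):
    """Return the index of the first block whose rdba does not match its
    position in the file, or None if every complete block matches.
    Block 0 is skipped here and not checked."""
    with open(path, "rb") as f:
        f.seek(block_size)
        index = 1
        while True:
            block = f.read(block_size)
            if not block:
                return None          # clean end of file
            if len(block) < block_size:
                return index         # incomplete trailing block
            (rdba,) = struct.unpack_from("<I", block, 4)
            if block_number_from_rdba(rdba) != index:
                return index
            index += 1


def pad_tail(path: str, target_size: int) -> int:
    """Append zero bytes until the file reaches target_size; return bytes added."""
    current = os.path.getsize(path)
    if current > target_size:
        raise ValueError("file is already larger than the target size")
    remaining = target_size - current
    chunk = bytes(1024 * 1024)       # 1 MiB of zeros, written repeatedly
    with open(path, "ab") as f:
        while remaining > 0:
            n = min(remaining, len(chunk))
            f.write(chunk[:n])
            remaining -= n
    return target_size - current
```

A call such as `pad_tail(copy_of_file, 7811899392)` would bring a working copy back to the size recorded in the metadata. The appended blocks carry no valid data, which is why anything stored in that range cannot be exported later.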
[Image: recover-error]


After bypassing the consistency checks, the database was forced open successfully:

2024-10-18T04:24:43.911107+08:00
ALTER DATABASE RECOVER    CANCEL  
2024-10-18T04:24:47.098637+08:00
Errors in file E:\TRACE\diag\rdbms\his\his\trace\his_pr00_2608.trc:
ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1: 'E:\ORADATA\SYSTEM01.DBF'
2024-10-18T04:24:47.114278+08:00
ORA-1547 signalled during: ALTER DATABASE RECOVER    CANCEL  ...
ALTER DATABASE RECOVER CANCEL 
ORA-1112 signalled during: ALTER DATABASE RECOVER CANCEL ...
2024-10-18T04:25:03.989398+08:00
alter database open resetlogs
2024-10-18T04:25:05.598781+08:00
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
RESETLOGS after incomplete recovery UNTIL CHANGE 2666786639 time 
Resetting resetlogs activation ID 3659241623 (0xda1b9897)
2024-10-18T04:25:12.380089+08:00
Setting recovery target incarnation to 3
2024-10-18T04:25:15.052071+08:00
Ping without log force is disabled:
  instance mounted in exclusive mode.
Endian type of dictionary set to little
2024-10-18T04:25:15.458286+08:00
Assigning activation ID 3703362676 (0xdcbcd474)
2024-10-18T04:25:15.505102+08:00
TT00 (PID:4092): Gap Manager starting
2024-10-18T04:25:15.551992+08:00
Redo log for group 1, sequence 1 is not located on DAX storage
2024-10-18T04:25:17.833250+08:00
Thread 1 opened at log sequence 1
  Current log# 1 seq# 1 mem# 0: E:\ORADATA\REDO01.LOG
Successful open of redo thread 1
2024-10-18T04:25:17.848888+08:00
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
stopping change tracking
2024-10-18T04:25:22.052035+08:00
Undo initialization recovery: err:0 start: 24275578 end: 24276578 diff: 1000 ms (1.0 seconds)
Undo initialization online undo segments: err:0 start: 24276578 end: 24276593 diff: 15 ms (0.0 seconds)
Undo initialization finished serial:0 start:24275578 end:24276640 diff:1062 ms (1.1 seconds)
Dictionary check beginning
Dictionary check complete
Verifying minimum file header compatibility for tablespace encryption..
Verifying file header compatibility for tablespace encryption completed for pdb 0
2024-10-18T04:25:23.114610+08:00
Database Characterset is AL32UTF8
No Resource Manager plan active
2024-10-18T04:25:29.036475+08:00
replication_dependency_tracking turned off (no async multimaster replication found)
2024-10-18T04:25:32.833386+08:00
LOGSTDBY: Validating controlfile with logical metadata
LOGSTDBY: Validation complete
Starting background process AQPC
2024-10-18T04:25:33.145881+08:00
AQPC started with pid=37, OS id=5560 
2024-10-18T04:25:35.677167+08:00
Starting background process CJQ0
2024-10-18T04:25:35.708430+08:00
CJQ0 started with pid=39, OS id=2728 
2024-10-18T04:25:36.724036+08:00
Completed: alter database open resetlogs

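The data was then exported to a new database. Along the way, some data could not be exported normally because of the final 20 MB lost from file# 66; this was handled by discarding the damaged portion and recovering the remaining good table data into the new database.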