ORA-00600: internal error code, arguments: [16513], [1403] 恢复

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:ORA-00600: internal error code, arguments: [16513], [1403] 恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

接到客户请求,存储异常断电之后,一个40多T的经分数据库无法正常启动,通过各方一系列操作之后,数据库依旧无法open,报错信息为:ORA-00600: internal error code, arguments: [16513], [1403], [28]
20210214193332


Sun Feb 14 00:20:09 BEIST 2021
SMON: enabling cache recovery
Sun Feb 14 00:20:09 BEIST 2021
ORA-01555 caused by SQL statement below (SQL ID: 4krwuz0ctqxdt, SCN: 0x0f27.13cd4fc3):
Sun Feb 14 00:20:09 BEIST 2021
select ctime, mtime, stime from obj$ where obj# = :1
Sun Feb 14 00:20:09 BEIST 2021
Errors in file /oracle10g/db/admin/xifenfei/udump/xifenfei1_ora_177254.trc:
ORA-00600: internal error code, arguments: [16513], [1403], [28], [], [], [], [], []
Sun Feb 14 00:20:10 BEIST 2021
Errors in file /oracle10g/db/admin/xifenfei/udump/xifenfei1_ora_177254.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00600: internal error code, arguments: [16513], [1403], [28], [], [], [], [], []
Error 704 happened during db open, shutting down database
USER: terminating instance due to error 704
Instance terminated by USER, pid = 177254
ORA-1092 signalled during: ALTER DATABASE OPEN...

通过对启动过程进行跟踪

=====================
PARSING IN CURSOR #5 len=52 dep=1 uid=0 oct=3 lid=0 tim=194171381576991 hv=429618617 ad='afcee60'
select ctime, mtime, stime from obj$ where obj# = :1
END OF STMT
PARSE #5:c=0,e=257,p=0,cr=0,cu=0,mis=1,r=0,dep=1,og=4,tim=194171381576990
BINDS #5:
kkscoacd
 Bind#0
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=1104b8ed0  bln=22  avl=02  flg=05
  value=28
EXEC #5:c=0,e=422,p=0,cr=0,cu=0,mis=1,r=0,dep=1,og=4,tim=194171381577472
WAIT #5: nam='db file sequential read' ela= 274 file#=1 block#=110 blocks=1 obj#=36 tim=194171381577809
WAIT #5: nam='db file sequential read' ela= 257 file#=1 block#=78943 blocks=1 obj#=36 tim=194171381578123
WAIT #5: nam='db file sequential read' ela= 253 file#=1 block#=111 blocks=1 obj#=36 tim=194171381578416
WAIT #5: nam='db file sequential read' ela= 226 file#=1 block#=62 blocks=1 obj#=18 tim=194171381578692
=====================
PARSING IN CURSOR #6 len=142 dep=2 uid=0 oct=3 lid=0 tim=194171381579134 hv=361892850 ad='df87eb0'
select /*+ rule */ name,file#,block#,status$,user#,undosqn,xactsqn,scnbas,scnwrp,
DECODE(inst#,0,NULL,inst#),ts#,spare1 from undo$ where us#=:1
END OF STMT
PARSE #6:c=0,e=368,p=0,cr=0,cu=0,mis=1,r=0,dep=2,og=3,tim=194171381579133
BINDS #6:
kkscoacd
 Bind#0
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=1105af9a8  bln=22  avl=02  flg=05
  value=92
EXEC #6:c=0,e=531,p=0,cr=0,cu=0,mis=1,r=0,dep=2,og=3,tim=194171381579736
WAIT #6: nam='db file sequential read' ela= 232 file#=1 block#=102 blocks=1 obj#=34 tim=194171381580011
WAIT #6: nam='db file sequential read' ela= 251 file#=1 block#=54 blocks=1 obj#=15 tim=194171381580307
FETCH #6:c=0,e=615,p=2,cr=2,cu=0,mis=0,r=1,dep=2,og=3,tim=194171381580369
STAT #6 id=1 cnt=1 pid=0 pos=1 obj=15 op='TABLE ACCESS BY INDEX ROWID UNDO$ (cr=2 pr=2 pw=0 time=584 us)'
STAT #6 id=2 cnt=1 pid=1 pos=1 obj=34 op='INDEX UNIQUE SCAN I_UNDO1 (cr=1 pr=1 pw=0 time=282 us)'
WAIT #5: nam='db file sequential read' ela= 260 file#=4290 block#=365090 blocks=1 obj#=0 tim=194171381580721
FETCH #5:c=10000,e=3327,p=7,cr=7,cu=0,mis=0,r=0,dep=1,og=4,tim=194171381580817
*** 2021-02-14 02:32:40.430
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [16513], [1403], [28], [], [], [], [], []
Current SQL statement for this session:
alter database open
----- Call Stack Trace -----
calling              call     entry                argument values in hex      
location             type     point                (? means dubious value)     
-------------------- -------- -------------------- ----------------------------
ksedst+001c          bl       ksedst1              70000090FC00AE4 ? 000000000 ?
ksedmp+0290          bl       ksedst               104C1CA58 ?
ksfdmp+02d8          bl       03F4BD0C             
kgeriv+0108          bl       _ptrgl               
kgesiv+0080          bl       kgeriv               7000008F6DB5890 ? 00000000C ?
                                                   7000008F6DB5890 ? 0FFFFFDC3 ?
                                                   0FFFFFFBF ?
ksesic2+0060         bl       kgesiv               5FFFEC580 ? 500000000 ?
                                                   FFFFFFFFFFEC590 ? 0F5FAF808 ?
                                                   7000008F5FAF7E8 ?
kqdpts+0158          bl       ksesic2              408100004081 ? 000000000 ?
                                                   00000057B ? 000000000 ?
                                                   00000001C ?
                                                   A006150571B05EF0 ?
                                                   1100CF0A8 ? 000000001 ?
kqrlfc+0274          bl       kqdpts               000000000 ?
kqlbplc+00b4         bl       03F4E6A8             
kqlblfc+0230         bl       kqlbplc              00000009D ?
adbdrv+1a9c          bl       03F4B66C             
opiexe+2db4          bl       adbdrv               
opiosq0+1ac8         bl       opiexe               1103B418C ? 000000000 ?
                                                   FFFFFFFFFFF9008 ?
kpooprx+016c         bl       opiosq0              300F1E1D4 ? 000000000 ?
                                                   000000000 ? A4FFFFFFFF9798 ?
                                                   000000000 ?
kpoal8+03cc          bl       kpooprx              FFFFFFFFFFFB814 ?
                                                   FFFFFFFFFFFB5C0 ?
                                                   1300000013 ? 100000001 ?
                                                   000000000 ? A40000000000A4 ?
                                                   000000000 ? 1103B4378 ?
opiodr+0b2c          bl       _ptrgl               
ttcpip+1020          bl       _ptrgl               
opitsk+117c          bl       01FBD04C             
opiino+09d0          bl       opitsk               1EFFFFD7E0 ? 000000000 ?
opiodr+0b2c          bl       _ptrgl               
opidrv+04a4          bl       opiodr               3C102B1398 ? 404C6FF30 ?
                                                   FFFFFFFFFFFF7A0 ? 0102B1390 ?
sou2o+0090           bl       opidrv               3C02705B3C ? 440663000 ?
                                                   FFFFFFFFFFFF7A0 ?
opimai_real+01bc     bl       01FB9BF4             
main+0098            bl       opimai_real          000000000 ? 000000000 ?
__start+0098         bl       main                 000000000 ? 000000000 ?
 
--------------------- Binary Stack Dump ---------------------

报错比较明显,数据库在查询obj$取obj#=28的数据的时候无法正常获取到该数据,从而报错ORA-600 16513错误.通过对相关block进行分析,发现有事务异常

BBED>  p ktbbh.ktbbhitl[1]
struct ktbbhitl[1], 24 bytes                @44
   struct ktbitxid, 8 bytes                 @44
      ub2 kxidusn                           @44       0x5c00
      ub2 kxidslt                           @46       0x0a00
      ub4 kxidsqn                           @48       0x48b4fc01
   struct ktbituba, 8 bytes                 @52
      ub4 kubadba                           @52       0x22928531
      ub2 kubaseq                           @56       0xe383
      ub1 kubarec                           @58       0x01
   ub2 ktbitflg                             @60       0x0120 (NONE)
   union _ktbitun, 2 bytes                  @62
      b2 _ktbitfsc                          @62       0
      ub2 _ktbitwrp                         @62       0x0000
   ub4 ktbitbas                             @64       0x95c2ff13

通过一些技巧处理规避掉该事务,然后启动库报熟悉的ORA-01555错误
20210214161333


相对比较简单参考(在数据库open过程中常遇到ORA-01555汇总),数据库顺利open成功,完成春节后第一个大库的恢复
20210214193344

dul 12.2完美支持Oracle 19c恢复

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:dul 12.2完美支持Oracle 19c恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

Oracle dul 12.2完美支持12c+版本(实测19c恢复正常)

[root@ora11g dul]# ./dul

Data UnLoader: 12.2.0.0.0 Beta - Internal Only - on Wed Nov 11 00:00:39 2020
with 64-bit io functions and the decompression option

Copyright (c) 1994 2019 Bernard van Duijnen All rights reserved.

 Strictly Oracle Internal Use Only


Found db_id = 2193655303
Found db_name = UORA19C
DUL> bootstrap;
Probing file = 1, block = 520
. unloading table                BOOTSTRAP$
DUL: Warning: block number is non zero but marked deferred trying to process it anyhow
      60 rows unloaded
Reading BOOTSTRAP.dat 60 entries loaded
Parsing Bootstrap$ contents
DUL: Warning: Recreating file "dict.ddl"
Generating dict.ddl for version 12
 OBJ$: segobjno 18, file 1 block 240
 TAB$: segobjno 2, tabno 1, file 1  block 144
 COL$: segobjno 2, tabno 5, file 1  block 144
 USER$: segobjno 10, tabno 1, file 1  block 208
Running generated file "@dict.ddl" to unload the dictionary tables
. unloading table                      OBJ$   72485 rows unloaded
. unloading table                      TAB$    2243 rows unloaded
. unloading table                      COL$  123798 rows unloaded
. unloading table                     USER$     127 rows unloaded
Reading USER.dat 127 entries loaded
Reading OBJ.dat 72485 entries loaded and sorted 72485 entries
Reading TAB.dat 2243 entries loaded
Reading COL.dat 123798 entries loaded and sorted 123798 entries
Reading BOOTSTRAP.dat 60 entries loaded

DUL: Warning: Recreating file "dict.ddl"
Generating dict.ddl for version 12
 OBJ$: segobjno 18, file 1 block 240
 TAB$: segobjno 2, tabno 1, file 1  block 144
 COL$: segobjno 2, tabno 5, file 1  block 144
 USER$: segobjno 10, tabno 1, file 1  block 208
 TABPART$: segobjno 814, file 1 block 5424
 INDPART$: segobjno 819, file 1 block 5464
 TABCOMPART$: segobjno 836, file 1 block 5600
 INDCOMPART$: segobjno 841, file 1 block 5640
 TABSUBPART$: segobjno 826, file 1 block 5520
 INDSUBPART$: segobjno 831, file 1 block 5560
 IND$: segobjno 2, tabno 3, file 1  block 144
 ICOL$: segobjno 2, tabno 4, file 1  block 144
 LOB$: segobjno 2, tabno 6, file 1  block 144
 COLTYPE$: segobjno 2, tabno 7, file 1  block 144
 TYPE$: segobjno 740, tabno 1, file 1  block 4880
 COLLECTION$: segobjno 740, tabno 2, file 1  block 4880
 ATTRIBUTE$: segobjno 740, tabno 3, file 1  block 4880
 LOBFRAG$: segobjno 847, file 1 block 5688
 LOBCOMPPART$: segobjno 850, file 1 block 5720
 UNDO$: segobjno 15, file 1 block 224
 TS$: segobjno 6, tabno 2, file 1  block 176
 PROPS$: segobjno 127, file 1 block 1320
Running generated file "@dict.ddl" to unload the dictionary tables
. unloading table                      OBJ$
DUL: Warning: Recreating file "OBJ.ctl"
   72485 rows unloaded
. unloading table                      TAB$
DUL: Warning: Recreating file "TAB.ctl"
    2243 rows unloaded
. unloading table                      COL$
DUL: Warning: Recreating file "COL.ctl"
  123798 rows unloaded
. unloading table                     USER$
DUL: Warning: Recreating file "USER.ctl"
     127 rows unloaded
. unloading table                  TABPART$     304 rows unloaded
. unloading table                  INDPART$     204 rows unloaded
. unloading table               TABCOMPART$       1 row  unloaded
. unloading table               INDCOMPART$       0 rows unloaded
. unloading table               TABSUBPART$      32 rows unloaded
. unloading table               INDSUBPART$       0 rows unloaded
. unloading table                      IND$    2922 rows unloaded
. unloading table                     ICOL$    4996 rows unloaded
. unloading table                      LOB$     702 rows unloaded
. unloading table                  COLTYPE$    3035 rows unloaded
. unloading table                     TYPE$    5889 rows unloaded
. unloading table               COLLECTION$    1385 rows unloaded
. unloading table                ATTRIBUTE$   15376 rows unloaded
. unloading table                  LOBFRAG$      15 rows unloaded
. unloading table              LOBCOMPPART$       0 rows unloaded
. unloading table                     UNDO$      21 rows unloaded
. unloading table                       TS$      21 rows unloaded
. unloading table                    PROPS$      42 rows unloaded
Reading USER.dat 127 entries loaded
Reading OBJ.dat 72485 entries loaded and sorted 72485 entries
Reading TAB.dat 2243 entries loaded
Reading COL.dat 123798 entries loaded and sorted 123798 entries
Reading TABPART.dat 304 entries loaded and sorted 304 entries
Reading TABCOMPART.dat 1 entries loaded and sorted 1 entries
Reading TABSUBPART.dat 32 entries loaded and sorted 32 entries
Reading INDPART.dat 204 entries loaded and sorted 204 entries
Reading INDCOMPART.dat 0 entries loaded and sorted 0 entries
Reading INDSUBPART.dat 0 entries loaded and sorted 0 entries
Reading IND.dat 2922 entries loaded
Reading LOB.dat 702 entries loaded
Reading ICOL.dat 4996 entries loaded
Reading COLTYPE.dat 3035 entries loaded
Reading TYPE.dat
DUL: Notice: Increased the size of DC_TYPES from 4096 to 32768 entries
 5889 entries loaded
Reading ATTRIBUTE.dat 15376 entries loaded
Reading COLLECTION.dat
DUL: Notice: Increased the size of DC_COLLECTIONS from 1024 to 8192 entries
 1385 entries loaded
Reading BOOTSTRAP.dat 60 entries loaded
Reading LOBFRAG.dat 15 entries loaded and sorted 15 entries
Reading LOBCOMPPART.dat 0 entries loaded and sorted 0 entries
Reading UNDO.dat 21 entries loaded
Reading TS.dat 21 entries loaded
Reading PROPS.dat 42 entries loaded
Database character set is AL32UTF8
Database national character set is AL16UTF16

实际测试一张表恢复

DUL> unload table sys.props$;
. unloading table                    PROPS$
DUL: Warning: Recreating file "SYS_PROPS.ctl"
      42 rows unloaded

20201111224237


Oracle Recovery Tools恢复MISSING00000文件故障

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:Oracle Recovery Tools恢复MISSING00000文件故障

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

前几天介绍了自己开发的Oracle Recovery Tools工具,今天有客户遇到故障,数据库由于断电无法正常启动,第三方对其进行了重建控制文件操作,但是有两个数据文件没有加入到重建控制文件脚本中,当数据库被resetlogs方式打开之后出现如下效果
20200808142132


20200808142248

数据库已经被多次resetlogs打开,这两个丢失的文件无法正常加载进去,通过Oracle Recovery Tools进行处理
20200808143542

这里可以明显看到,resetlogs scn和checkpoint scn均不一样,常规方法肯定无法处理.
20200808143640

直接在恢复库只读情况下把文件加入到数据库中
20200808142722

通过Oracle Recovery Tools点击鼠标就可以快速的解决这类问题,省去的大量的bbed处理的烦劳(在win上要去找bbed软件,要去考虑各种偏移量,修改目标值等)

重命名sys用户引起数据库启动报ORA-01092 ORA-00600 kokasgi1错误

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:重命名sys用户引起数据库启动报ORA-01092 ORA-00600 kokasgi1错误

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

有客户和我反馈,他们为了防止有人使用sys用户进行登录,对sys用户进行了重命名,具体操作语句为:

update user$ set name='THISISSYS' where user#=0;

然后重启数据库发现数据库无法正常启动,报ORA-01092 ORA-00600 kokasgi1错误

[oracle@ecs-d75e-0618923 ~]$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.1.0 Production on Fri Aug 7 14:40:28 2020

Copyright (c) 1982, 2009, Oracle.  All rights reserved.

Connected to an idle instance.

SQL> startup mount;
ORACLE instance started.

Total System Global Area 6881345536 bytes
Fixed Size                  2214056 bytes
Variable Size            4227860312 bytes
Database Buffers         2617245696 bytes
Redo Buffers               34025472 bytes
Database mounted.
SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [kokasgi1], [], [], [], [], [], [],[], [], [], [], []
Process ID: 6269
Session ID: 530 Serial number: 3

以前写过处理过类似问题,参见:ORA-600 kokasgi1故障恢复,通过特殊处理,数据库open之后,对其user#=0的用户修改为SYS.

[oracle@ecs-d75e-0618923 ~]$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.1.0 Production on Fri Aug 7 14:52:09 2020

Copyright (c) 1982, 2009, Oracle.  All rights reserved.

Connected to an idle instance.

SQL> startup mount;
ORACLE instance started.

Total System Global Area 6881345536 bytes
Fixed Size                  2214056 bytes
Variable Size            4227860312 bytes
Database Buffers         2617245696 bytes
Redo Buffers               34025472 bytes
Database mounted.
SQL> alter database open;

Database altered.

SQL> select name from user$ where user#=0;

NAME
------------------------------
THISISSYS

SQL> update user$ set name='SYS' where user#=0;

1 row updated.

SQL> commit;

Commit complete.

SQL> shutdown abort;
ORACLE instance shut down.
SQL> startup;
ORACLE instance started.

Total System Global Area 6881345536 bytes
Fixed Size                  2214056 bytes
Variable Size            4227860312 bytes
Database Buffers         2617245696 bytes
Redo Buffers               34025472 bytes
Database mounted.
Database altered.

至此恢复完成,再次sys是oracle内部默认的超级用户,系统很多调用在程序中写死的sys.,对其用户不能进行重命名操作.

coincidenceleague加密数据库恢复

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:coincidenceleague加密数据库恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

今天接到一个朋友的数据库被加密的请求
20200805003552


这种类型的加密,以前也遇到过很多次,而且能够实现效果较好的恢复,参考:mysql数据库被加密恢复,以前这类病毒一般主要就是对文件起始位置一些block进行破坏,这次对其分析发现和以前有了本质区别(采用间隔加密,每个文件分给为10段,每次加密7个block[8k])
20200805003525

不过这点破坏难不倒我们,通过底层恢复,绝大部分情况可以直接open数据库,如果运气不太好,也可以最大限度恢复数据
20200805005652

对于oracle数据文件被加密,在大部分情况下,我们可以通过底层恢复技术,实现数据库open,然后通过exp/expdp导出数据,实现恢复之后业务直接运行的效果,最大限度提高数据恢复效果和减少业务恢复的时间.

asm磁盘dd破坏恢复

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:asm磁盘dd破坏恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

有客户和我们反馈,由于运维人员操作错误,对一个磁盘组的asm disk进行了dd操作,导致部分数据丢失(客户数据文件存放在两个磁盘组中,其中一个被dd掉[误以为只是存放归档,其实由于第一个磁盘组空间不足,把部分数据文件放进该磁盘组])
20200603221148


通过对asm 日志进行分析发现被dd的磁盘是一个磁盘组,以前恢复过类似的asm 磁盘被dd的案例(asm磁盘头全部损坏数据0丢失恢复,上次因为dd破坏较少,所以可以通过修复磁盘组直接恢复出来里面数据,但是这次被dd了50M,直接修复磁盘头恢复数据基本上不太可能.通过工具对其进行磁盘扫描,参考:asm disk header 彻底损坏恢复,对扫描结果进行分析,发现不少数据块是重复,无法较好的实现重组效果.
20200612002025

类似出现这样的情况一般是由于该asm磁盘组中有同一个文件号的数据多份(比如一个磁盘组中有两个库,同一个数据文件存储多份),通过方面分析,该库没有文件多份存储而且该磁盘组中只有一个数据库.进一步分析仅有的asm alert日志(大部分日志被清除),发现类似信息

Sun Mar 14 05:25:40 CST 2020
NOTE: F1X0 found on disk 0 fcn 0.60289025
NOTE: cache opening disk 0 of grp 2: HIS_FLASH00 label:HIS_FLASH00
NOTE: cache opening disk 1 of grp 2: HIS_FLASH01 label:HIS_FLASH01
NOTE: cache opening disk 2 of grp 2: HIS_FLASH02 label:HIS_FLASH02
NOTE: cache opening disk 3 of grp 2: HIS_DATA03 label:HIS_DATA03
NOTE: cache mounting (first) group 2/0xCCD84BCB (HIS_FLASH)
* allocate domain 2, invalid = TRUE 
kjbdomatt send to node 0
Sun Mar 14 05:25:40 CST 2020
NOTE: attached to recovery domain 2
Sun Mar 14 05:25:40 CST 2020
NOTE: starting recovery of thread=1 ckpt=405.816 group=2
NOTE: advancing ckpt for thread=1 ckpt=405.819
NOTE: cache recovered group 2 to fcn 0.65493064
Sun Mar 14 05:25:40 CST 2020
NOTE: LGWR attempting to mount thread 1 for disk group 2
NOTE: LGWR mounted thread 1 for disk group 2
NOTE: opening chunk 1 at fcn 0.65493064 ABA 
NOTE: seq=406 blk=820 
Sun Mar 14 05:25:40 CST 2020
NOTE: cache mounting group 2/0xCCD84BCB (HIS_FLASH) succeeded
SUCCESS: diskgroup HIS_FLASH was mounted
Sun Mar 14 05:25:42 CST 2020
NOTE: recovering COD for group 2/0xccd84bcb (HIS_FLASH)
SUCCESS: completed COD recovery for group 2/0xccd84bcb (HIS_FLASH)
Sun Mar 14 05:25:47 CST 2020
Starting background process ASMB
ASMB started with pid=17, OS id=14599

初步可以定位,很可能是由于在3月份对该磁盘组进行了扩容,从而发生了数据文件的rebalance操作,从而出现了某些block有重复现象,对于这类情况,通过结合asm字典信息进行分析可以完全规避该问题,对数据文件进行恢复,dbv进行检查,一切正常
20200612000805


对所有文件类似处理,结合正常磁盘组中数据文件,对数据库进行直接open,实现完美恢复.
如果您遇到此类情况,无法解决请联系我们,提供专业ORACLE数据库恢复技术支持
Phone:17813235971    Q Q:107644445QQ咨询惜分飞    E-Mail:dba@xifenfei.com
这次数据能够完美恢复属于侥幸,因为asm disk被dd了50M(正常情况下4个磁盘的磁盘组每个磁盘dd 50M之后,很可能有部分数据文件被覆盖,该客户该磁盘组最初是存储归档日志,因此数据文件写入位置相对比较靠后,从而没有被dd破坏)

oracle asm中drop pdb恢复方法

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:oracle asm中drop pdb恢复方法

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

最近分析研究了一种asm disk 数据文件丢失的新恢复方法,通过cod和acd进行恢复,我们对于asm 磁盘组中的文件的改变(创建,删除,扩大,缩小等)操作会体现在cod和acd中有一些体现,类似oracle 数据库中数据的变化都会体现在redo和undo中类似.可以通过对他们的分析,确认文件在asm磁盘组中的分配关系,从而实现数据文件的恢复.我这里通过模拟创建表空间,插入数据,删除表空间(同时也删除文件),然后相关cod和acd分析,实现数据文件恢复.
创建表空间

SQL> create tablespace xifenfei datafile '+data' size 1G;

Tablespace created.

SQL> alter tablespace xifenfei add datafile '+data' size 128M autoextend on;

Tablespace altered.

创建模拟表并插入数据

SQL> create table t_xifenfei tablespace xifenfei as
  2  select * from dba_objects;

Table created.

SQL> insert into t_xifenfei select * from t_xifenfei;

73013 rows created.

…………

SQL> insert into t_xifenfei select * from t_xifenfei;

18691328 rows created.

SQL> COMMIT;

Commit complete.

SQL> SELECT COUNT(1) FROM T_XIFENFEI;

  COUNT(1)
----------
  37382656

SQL> alter system checkpoint;

System altered.

SQL> select bytes/1024/1024/1024,TABLESPACE_NAME FROM USER_SEGMENTS  where segment_name='T_XIFENFEI';

BYTES/1024/1024/1024 TABLESPACE_NAME
-------------------- ------------------------------
          5.56738281 XIFENFEI

删除表空间

SQL> drop tablespace xifenfei including contents and datafiles;

Tablespace dropped.

查看alert日志信息

2020-04-23T18:23:43.088997-04:00
drop tablespace xifenfei including contents and datafiles
2020-04-23T18:23:46.226654-04:00
Deleted Oracle managed file +DATA/ORA18C/DATAFILE/xifenfei.262.1035571131
Deleted Oracle managed file +DATA/ORA18C/DATAFILE/xifenfei.263.1038507123
Completed: drop tablespace xifenfei including contents and datafiles

这里我们可以看到被删除的两个数据文件的asm number为262和263,如果要恢复该表空间数据需要先恢复出来该数据文件.由于文件被删除,文件对应存储在asm里面的找出来相关的数据分配关系才可以对其恢复出来.尝试找该文件的分配extent映射关系
尝试直接读取extent map

[root@rac18c2 ~]#  kfed read /dev/xifenfei-sdb  aus=4194304 blkn=0|grep f1b1locn
kfdhdb.f1b1locn:                     10 ; 0x0d4: 0x0000000a
[root@rac18c2 ~]#  kfed read /dev/xifenfei-sdb  aus=4194304 aun=10 blkn=262|more
kfbh.endian:                          1 ; 0x000: 0x01
kfbh.hard:                          130 ; 0x001: 0x82
kfbh.type:                            4 ; 0x002: KFBTYP_FILEDIR
kfbh.datfmt:                          1 ; 0x003: 0x01
kfbh.block.blk:                     262 ; 0x004: blk=262
kfbh.block.obj:                       1 ; 0x008: file=1
kfbh.check:                  4132734069 ; 0x00c: 0xf6548475
kfbh.fcn.base:                     6741 ; 0x010: 0x00001a55
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
kfffdb.node.incarn:          1035571132 ; 0x000: A=0 NUMM=0x1edcc7de
kfffdb.node.frlist.number:          264 ; 0x004: 0x00000108
kfffdb.node.frlist.incarn:            0 ; 0x008: A=0 NUMM=0x0
kfffdb.hibytes:                       0 ; 0x00c: 0x00000000
kfffdb.lobytes:              1073750016 ; 0x010: 0x40002000
kfffdb.xtntcnt:                       0 ; 0x014: 0x00000000
kfffdb.xtnteof:                     257 ; 0x018: 0x00000101
kfffdb.blkSize:                    8192 ; 0x01c: 0x00002000
kfffdb.flags:                         1 ; 0x020: O=1 S=0 S=0 D=0 C=0 I=0 R=0 A=0
kfffdb.fileType:                      2 ; 0x021: 0x02
…………
kfffdb.mxshad:                        0 ; 0x498: 0x0000
kfffdb.mxprnt:                        0 ; 0x49a: 0x0000
kfffdb.fmtBlks:                  131073 ; 0x49c: 0x00020001
kfffde[0].xptr.au:           4294967295 ; 0x4a0: 0xffffffff
kfffde[0].xptr.disk:              65535 ; 0x4a4: 0xffff
kfffde[0].xptr.flags:                 0 ; 0x4a6: L=0 E=0 D=0 S=0 R=0 I=0
kfffde[0].xptr.chk:                  42 ; 0x4a7: 0x2a
kfffde[1].xptr.au:           4294967295 ; 0x4a8: 0xffffffff
kfffde[1].xptr.disk:              65535 ; 0x4ac: 0xffff
kfffde[1].xptr.flags:                 0 ; 0x4ae: L=0 E=0 D=0 S=0 R=0 I=0
kfffde[1].xptr.chk:                  42 ; 0x4af: 0x2a

这里的kfffdb.blkSize为8192证明以前很可能是数据文件,但是kfffde中的au和disk全部被置为f,说明extent的直接映射表已经被置空,更不用说间接extent分配映射表,也就是说这条路无法走通.变换一种思路,既然文件从asm中删除掉的extent映射关系被清空,那是否可以通过对应的acd记录来找到相关数据.通过对acd进行分析,发现在删除drop的时间点相关类似记录,通过分析对应的acd记录,发现直接extent和扩展extent分配全部被置空,无法通过该思路进行恢复
尝试acd恢复extent map

kfracdb2.lge[2].bcd[2].kfbl.blk:    262 ; 0x1cc: blk=262
kfracdb2.lge[2].bcd[2].kfbl.obj:      1 ; 0x1d0: file=1
kfracdb2.lge[2].bcd[2].kfcn.base:  6216 ; 0x1d4: 0x00001848
kfracdb2.lge[2].bcd[2].kfcn.wrap:     0 ; 0x1d8: 0x00000000
kfracdb2.lge[2].bcd[2].oplen:        20 ; 0x1dc: 0x0014
kfracdb2.lge[2].bcd[2].blkIndex:    262 ; 0x1de: 0x0106
kfracdb2.lge[2].bcd[2].flags:        28 ; 0x1e0: F=0 N=0 F=1 L=1 V=1 A=0 C=0 R=0
kfracdb2.lge[2].bcd[2].opcode:      162 ; 0x1e2: 0x00a2
kfracdb2.lge[2].bcd[2].kfbtyp:        4 ; 0x1e4: KFBTYP_FILEDIR
kfracdb2.lge[2].bcd[2].redund:       17 ; 0x1e5: SCHE=0x1 NUMB=0x1
kfracdb2.lge[2].bcd[2].pad:       63903 ; 0x1e6: 0xf99f
kfracdb2.lge[2].bcd[2].KFFFD_PEXT.xtntcnt:0 ; 0x1e8: 0x00000000
kfracdb2.lge[2].bcd[2].KFFFD_PEXT.xtntblk:0 ; 0x1ec: 0x0000
kfracdb2.lge[2].bcd[2].KFFFD_PEXT.xnum:0 ; 0x1ee: 0x0000
kfracdb2.lge[2].bcd[2].KFFFD_PEXT.xcnt:1 ; 0x1f0: 0x0001
kfracdb2.lge[2].bcd[2].KFFFD_PEXT.setflg:0 ; 0x1f2: 0x00
kfracdb2.lge[2].bcd[2].KFFFD_PEXT.flags:0 ; 0x1f3: O=0 S=0 S=0 D=0 C=0 I=0 R=0 A=0
kfracdb2.lge[2].bcd[2].KFFFD_PEXT.xptr[0].au:4294967292 ; 0x1f4: 0xfffffffc
kfracdb2.lge[2].bcd[2].KFFFD_PEXT.xptr[0].disk:0 ; 0x1f8: 0x0000
kfracdb2.lge[2].bcd[2].KFFFD_PEXT.xptr[0].flags:0 ; 0x1fa: L=0 E=0 D=0 S=0 R=0 I=0
kfracdb2.lge[2].bcd[2].KFFFD_PEXT.xptr[0].chk:41 ; 0x1fb: 0x29
kfracdb2.lge[2].bcd[2].au[0]:        10 ; 0x1fc: 0x0000000a
kfracdb2.lge[2].bcd[2].disks[0]:      0 ; 0x200: 0x0000


kfracdb2.lge[20].bcd[1].kfbl.blk:2147483648 ; 0xe54: blk=0 (indirect)
kfracdb2.lge[20].bcd[1].kfbl.obj:   262 ; 0xe58: file=262
kfracdb2.lge[20].bcd[1].kfcn.base: 3280 ; 0xe5c: 0x00000cd0
kfracdb2.lge[20].bcd[1].kfcn.wrap:    0 ; 0xe60: 0x00000000
kfracdb2.lge[20].bcd[1].oplen:       16 ; 0xe64: 0x0010
kfracdb2.lge[20].bcd[1].blkIndex:     0 ; 0xe66: 0x0000
kfracdb2.lge[20].bcd[1].flags:       28 ; 0xe68: F=0 N=0 F=1 L=1 V=1 A=0 C=0 R=0
kfracdb2.lge[20].bcd[1].opcode:     163 ; 0xe6a: 0x00a3
kfracdb2.lge[20].bcd[1].kfbtyp:      12 ; 0xe6c: KFBTYP_INDIRECT
kfracdb2.lge[20].bcd[1].redund:      17 ; 0xe6d: SCHE=0x1 NUMB=0x1
kfracdb2.lge[20].bcd[1].pad:      63903 ; 0xe6e: 0xf99f
kfracdb2.lge[20].bcd[1].KFFIX_PEXT.xtntblk:0 ; 0xe70: 0x0000
kfracdb2.lge[20].bcd[1].KFFIX_PEXT.xnum:0 ; 0xe72: 0x0000
kfracdb2.lge[20].bcd[1].KFFIX_PEXT.xcnt:1 ; 0xe74: 0x0001
kfracdb2.lge[20].bcd[1].KFFIX_PEXT.ub2spare:0 ; 0xe76: 0x0000
kfracdb2.lge[20].bcd[1].KFFIX_PEXT.xptr[0].au:4294967292 ; 0xe78: 0xfffffffc
kfracdb2.lge[20].bcd[1].KFFIX_PEXT.xptr[0].disk:0 ; 0xe7c: 0x0000
kfracdb2.lge[20].bcd[1].KFFIX_PEXT.xptr[0].flags:0 ; 0xe7e: L=0 E=0 D=0 S=0 R=0 I=0
kfracdb2.lge[20].bcd[1].KFFIX_PEXT.xptr[0].chk:41 ; 0xe7f: 0x29
kfracdb2.lge[20].bcd[1].au[0]:      296 ; 0xe80: 0x00000128
kfracdb2.lge[20].bcd[1].disks[0]:     0 ; 0xe84: 0x0000

直接通过删除的acd记录来找出来数据文件分配的extent也行不通,通过分析相关acd block,终于找到了对应的extent分配的相关记录

kfracdb2.lge[21].bcd[0].kfbl.blk:     2 ; 0x918: blk=2
kfracdb2.lge[21].bcd[0].kfbl.obj:2147483648 ; 0x91c: disk=0
kfracdb2.lge[21].bcd[0].kfcn.base: 2820 ; 0x920: 0x00000b04
kfracdb2.lge[21].bcd[0].kfcn.wrap:    0 ; 0x924: 0x00000000
kfracdb2.lge[21].bcd[0].oplen:       28 ; 0x928: 0x001c
kfracdb2.lge[21].bcd[0].blkIndex:     2 ; 0x92a: 0x0002
kfracdb2.lge[21].bcd[0].flags:       28 ; 0x92c: F=0 N=0 F=1 L=1 V=1 A=0 C=0 R=0
kfracdb2.lge[21].bcd[0].opcode:      73 ; 0x92e: 0x0049
kfracdb2.lge[21].bcd[0].kfbtyp:       3 ; 0x930: KFBTYP_ALLOCTBL
kfracdb2.lge[21].bcd[0].redund:      18 ; 0x931: SCHE=0x1 NUMB=0x2
kfracdb2.lge[21].bcd[0].pad:      63903 ; 0x932: 0xf99f
kfracdb2.lge[21].bcd[0].KFDAT_ALLOC2.vaa.curidx:2416 ; 0x934: 0x0970
kfracdb2.lge[21].bcd[0].KFDAT_ALLOC2.vaa.nxtidx:8 ; 0x936: 0x0008
kfracdb2.lge[21].bcd[0].KFDAT_ALLOC2.vaa.prvidx:8 ; 0x938: 0x0008
kfracdb2.lge[21].bcd[0].KFDAT_ALLOC2.vaa.asz:0 ; 0x93a: KFDASZ_1X
kfracdb2.lge[21].bcd[0].KFDAT_ALLOC2.vaa.frag:0 ; 0x93b: 0x00
kfracdb2.lge[21].bcd[0].KFDAT_ALLOC2.vaa.total:0 ; 0x93c: 0x0000
kfracdb2.lge[21].bcd[0].KFDAT_ALLOC2.vaa.free:0 ; 0x93e: 0x0000
kfracdb2.lge[21].bcd[0].KFDAT_ALLOC2.vaa.fnum:262 ; 0x940: 0x00000106
kfracdb2.lge[21].bcd[0].KFDAT_ALLOC2.vaa.xnum:0 ; 0x944: 0x00000000
kfracdb2.lge[21].bcd[0].KFDAT_ALLOC2.vaa.flags:8388608 ; 0x948: 0x00800000
kfracdb2.lge[21].bcd[0].KFDAT_ALLOC2.lxnum:3 ; 0x94c: 0x03
kfracdb2.lge[21].bcd[0].KFDAT_ALLOC2.spare1:0 ; 0x94d: 0x00
kfracdb2.lge[21].bcd[0].KFDAT_ALLOC2.spare2:0 ; 0x94e: 0x0000
kfracdb2.lge[21].bcd[0].au[0]:        0 ; 0x950: 0x00000000
kfracdb2.lge[21].bcd[0].au[1]:       11 ; 0x954: 0x0000000b
kfracdb2.lge[21].bcd[0].disks[0]:     0 ; 0x958: 0x0000
kfracdb2.lge[21].bcd[0].disks[1]:     0 ; 0x95a: 0x0000
kfracdb2.lge[21].bcd[1].kfbl.blk:   262 ; 0x95c: blk=262
kfracdb2.lge[21].bcd[1].kfbl.obj:     1 ; 0x960: file=1
kfracdb2.lge[21].bcd[1].kfcn.base: 3018 ; 0x964: 0x00000bca
kfracdb2.lge[21].bcd[1].kfcn.wrap:    0 ; 0x968: 0x00000000
kfracdb2.lge[21].bcd[1].oplen:       20 ; 0x96c: 0x0014
kfracdb2.lge[21].bcd[1].blkIndex:   262 ; 0x96e: 0x0106
kfracdb2.lge[21].bcd[1].flags:       28 ; 0x970: F=0 N=0 F=1 L=1 V=1 A=0 C=0 R=0
kfracdb2.lge[21].bcd[1].opcode:     162 ; 0x972: 0x00a2
kfracdb2.lge[21].bcd[1].kfbtyp:       4 ; 0x974: KFBTYP_FILEDIR
kfracdb2.lge[21].bcd[1].redund:      17 ; 0x975: SCHE=0x1 NUMB=0x1
kfracdb2.lge[21].bcd[1].pad:      63903 ; 0x976: 0xf99f
kfracdb2.lge[21].bcd[1].KFFFD_PEXT.xtntcnt:0 ; 0x978: 0x00000000
kfracdb2.lge[21].bcd[1].KFFFD_PEXT.xtntblk:0 ; 0x97c: 0x0000
kfracdb2.lge[21].bcd[1].KFFFD_PEXT.xnum:0 ; 0x97e: 0x0000
kfracdb2.lge[21].bcd[1].KFFFD_PEXT.xcnt:1 ; 0x980: 0x0001
kfracdb2.lge[21].bcd[1].KFFFD_PEXT.setflg:0 ; 0x982: 0x00
kfracdb2.lge[21].bcd[1].KFFFD_PEXT.flags:0 ; 0x983: O=0 S=0 S=0 D=0 C=0 I=0 R=0 A=0
kfracdb2.lge[21].bcd[1].KFFFD_PEXT.xptr[0].au:297 ; 0x984: 0x00000129
kfracdb2.lge[21].bcd[1].KFFFD_PEXT.xptr[0].disk:0 ; 0x988: 0x0000
kfracdb2.lge[21].bcd[1].KFFFD_PEXT.xptr[0].flags:0 ; 0x98a: L=0 E=0 D=0 S=0 R=0 I=0
kfracdb2.lge[21].bcd[1].KFFFD_PEXT.xptr[0].chk:2 ; 0x98b: 0x02
kfracdb2.lge[21].bcd[1].au[0]:       10 ; 0x98c: 0x0000000a
kfracdb2.lge[21].bcd[1].disks[0]:     0 ; 0x990: 0x0000
kfracdb2.lge[21].bcd[2].kfbl.blk:     2 ; 0x994: blk=2
kfracdb2.lge[21].bcd[2].kfbl.obj:     4 ; 0x998: file=4
kfracdb2.lge[21].bcd[2].kfcn.base: 3019 ; 0x99c: 0x00000bcb
kfracdb2.lge[21].bcd[2].kfcn.wrap:    0 ; 0x9a0: 0x00000000
kfracdb2.lge[21].bcd[2].oplen:        8 ; 0x9a4: 0x0008
kfracdb2.lge[21].bcd[2].blkIndex:     2 ; 0x9a6: 0x0002
kfracdb2.lge[21].bcd[2].flags:       28 ; 0x9a8: F=0 N=0 F=1 L=1 V=1 A=0 C=0 R=0
kfracdb2.lge[21].bcd[2].opcode:     211 ; 0x9aa: 0x00d3
kfracdb2.lge[21].bcd[2].kfbtyp:      16 ; 0x9ac: KFBTYP_COD_DATA
kfracdb2.lge[21].bcd[2].redund:      17 ; 0x9ad: SCHE=0x1 NUMB=0x1
kfracdb2.lge[21].bcd[2].pad:      63903 ; 0x9ae: 0xf99f
kfracdb2.lge[21].bcd[2].KFRCOD_DATA.offset:60 ; 0x9b0: 0x003c
kfracdb2.lge[21].bcd[2].KFRCOD_DATA.length:4 ; 0x9b2: 0x0004
kfracdb2.lge[21].bcd[2].KFRCOD_DATA.data[0]:1 ; 0x9b4: 0x01
kfracdb2.lge[21].bcd[2].KFRCOD_DATA.data[1]:0 ; 0x9b5: 0x00
kfracdb2.lge[21].bcd[2].KFRCOD_DATA.data[2]:0 ; 0x9b6: 0x00
kfracdb2.lge[21].bcd[2].KFRCOD_DATA.data[3]:0 ; 0x9b7: 0x00
kfracdb2.lge[21].bcd[2].au[0]:       16 ; 0x9b8: 0x00000010
kfracdb2.lge[21].bcd[2].disks[0]:     0 ; 0x9bc: 0x0000

对于类似这样的记录,通过汇总处理获得所有的file number对应的au extent分配记录,并且生成dd语句,然后生成文件
20200428213654


dbv检查恢复文件

[oracle@rac18c2 tmp]$ dbv file=262.dbf

DBVERIFY: Release 18.0.0.0.0 - Production on Tue Apr 28 09:45:37 2020

Copyright (c) 1982, 2018, Oracle and/or its affiliates.  All rights reserved.

DBVERIFY - Verification starting : FILE = /tmp/262.dbf


DBVERIFY - Verification complete

Total Pages Examined         : 131072
Total Pages Processed (Data) : 123400
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 0
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 631
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 7041
Total Pages Marked Corrupt   : 0
Total Pages Influx           : 0
Total Pages Encrypted        : 0
Highest block SCN            : 10001146011 (2.1411211419)

[oracle@rac18c2 tmp]$ dbv file=263.dbf

DBVERIFY: Release 18.0.0.0.0 - Production on Tue Apr 28 09:51:05 2020

Copyright (c) 1982, 2018, Oracle and/or its affiliates.  All rights reserved.

DBVERIFY - Verification starting : FILE = /tmp/263.dbf



DBVERIFY - Verification complete

Total Pages Examined         : 643584
Total Pages Processed (Data) : 595146
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 0
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 821
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 36865
Total Pages Marked Corrupt   : 0
Total Pages Influx           : 0
Total Pages Encrypted        : 0
Highest block SCN            : 10001153042 (2.1411218450)
[oracle@rac18c2 tmp]$ 

dul确认恢复文件中数据

DUL>  scan database
start scan database in parallel 1...
scan database completed.
DUL>  sample all segment 
start get segment info: data_obj#: 74635
finish get segment info: data_obj#: 74635
guess col def: 22
write segment info: 74635, 1, 8, 22
write sample rows: 10000
DUL>  unload 74635
 2020-04-24 22:32:11 unloading table segment 74635...
 2020-04-24 22:35:36 unloaded 37382656 rows.
DUL>

通过dbv和实际数据条数对比,此种恢复恢复的数据完全正常,不用使用底层碎片扫描,亦可恢复asm中被删除数据文件数据.在某些特殊情况下,此类方法配合底层碎片恢复,可以实现更加完美的恢复效果.对于比较典型的oracle pdb被删除(因为有多个数据文件的文件号是一样的,无法直接通过底层碎片扫描恢复),通过此类方法可以非常好的恢复出来.
类似文章参考:
asm disk header 彻底损坏恢复
ASM未正常启动,使用dd找回数据文件
asm磁盘组操作不当导致数据文件丢失恢复
如果你不幸遭遇asm 数据文件被删除/丢失,或者误删除pdb等相关事宜,如果需要恢复可以联系我们,提供专业数据库恢复服务
Phone:17813235971    Q Q:107644445QQ咨询惜分飞    E-Mail:dba@xifenfei.com

asm disk被加入vg恢复

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:asm disk被加入vg恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

接到客户恢复请求:把oracle asm datagroup中的一个磁盘增加到vg中,现在磁盘组无法mount,数据库无法正常启动.远程登录现场进行分析发现情况如下:
操作系统层面分析
history操作记录
add_asm_disk_to_vg


这里比较明显把一个磁盘做成pv,并且加入到vg中,然后再分配199G给lv_home,系统层面分析lvm信息

--查看pv信息
[root@xff1 ~]# pvdisplay 
  --- Physical volume ---
  PV Name               /dev/sda2
  VG Name               VolGroup
  PV Size               277.98 GiB / not usable 3.00 MiB
  Allocatable           yes (but full)
  PE Size               4.00 MiB
  Total PE              71161
  Free PE               0
  Allocated PE          71161
  PV UUID               F6QO3f-065n-mwTW-Xbq2-Xx2y-c8HD-Tkr7V7
   
  --- Physical volume ---
  PV Name               /dev/sdg    <----新加入的磁盘
  VG Name               VolGroup
  PV Size               200.00 GiB / not usable 4.00 MiB
  Allocatable           yes 
  PE Size               4.00 MiB
  Total PE              51199
  Free PE               255
  Allocated PE          50944
  PV UUID               i69vUG-nCIK-dtxL-FvpD-2WZd-bvLv-n7lwrb

[root@xff1 ~]# lvdisplay 
  --- Logical volume ---
  LV Path                /dev/VolGroup/lv_root
  LV Name                lv_root
  VG Name                VolGroup
  LV UUID                JUNnkN-m4zq-D0gh-h42b-cUM1-Wh1q-ZMtQE4
  LV Write Access        read/write
  LV Creation host, time localhost.localdomain, 2017-07-19 20:08:47 +0800
  LV Status              available
  # open                 1
  LV Size                50.00 GiB
  Current LE             12800
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:0
   
  --- Logical volume ---
  LV Path                /dev/VolGroup/lv_home
  LV Name                lv_home
  VG Name                VolGroup
  LV UUID                eZTkLt-cNGX-371i-m8Bd-VdD9-q6Hz-wYDRIJ
  LV Write Access        read/write
  LV Creation host, time localhost.localdomain, 2017-07-19 20:08:54 +0800
  LV Status              available
  # open                 1
  LV Size                422.97 GiB      <-----lv大小编程422G,应该是被扩了199G后结果
  Current LE             108281
  Segments               2
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:2
   
  --- Logical volume ---
  LV Path                /dev/VolGroup/lv_swap
  LV Name                lv_swap
  VG Name                VolGroup
  LV UUID                54P9ok-VpwO-zM68-hvwY-9GBf-89yb-8xQAMn
  LV Write Access        read/write
  LV Creation host, time localhost.localdomain, 2017-07-19 20:09:23 +0800
  LV Status              available
  # open                 1
  LV Size                4.00 GiB
  Current LE             1024
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:1


[root@xff1 ~]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/VolGroup-lv_root
                       50G  3.9G   43G   9% /
tmpfs                  63G  509M   63G   1% /dev/shm
/dev/sda1             477M   44M  408M  10% /boot
/dev/mapper/VolGroup-lv_home
                      417G  226G  170G  58% /home  <----增加了199g空间,剩余只剩170G,证明增加空间之后最少使用了30G以上  

基于这样的情况,基本上可以确定sdg盘加入VolGroup中并且被分配给 lv_home中,而且还写入了数据(/home空闲空间只剩余170G,lv_home当时扩了199G).
asm层面分析
asm磁盘组无法mount,提示缺少一块磁盘

SQL> ALTER DISKGROUP DATA MOUNT  /* asm agent *//* {1:12056:279} */ 
NOTE: cache registered group DATA number=1 incarn=0xa1dbff16
NOTE: cache began mount (first) of group DATA number=1 incarn=0xa1dbff16
NOTE: Assigning number (1,2) to disk (/dev/asmdisk3)
NOTE: Assigning number (1,1) to disk (/dev/asmdisk2)
Sat Apr 25 13:04:58 2020
ERROR: no read quorum in group: required 1, found 0 disks
NOTE: cache dismounting (clean) group 1/0xA1DBFF16 (DATA) 
NOTE: messaging CKPT to quiesce pins Unix process pid: 81552, image: oracle@rac2db1 (TNS V1-V3)
NOTE: dbwr not being msg'd to dismount
NOTE: lgwr not being msg'd to dismount
NOTE: cache dismounted group 1/0xA1DBFF16 (DATA) 
NOTE: cache ending mount (fail) of group DATA number=1 incarn=0xa1dbff16
NOTE: cache deleting context for group DATA 1/0xa1dbff16
GMON dismounting group 1 at 19 for pid 30, osid 81552
NOTE: Disk DATA_0001 in mode 0x9 marked for de-assignment
NOTE: Disk DATA_0002 in mode 0x9 marked for de-assignment
ERROR: diskgroup DATA was not mounted
ORA-15032: not all alterations performed
ORA-15017: diskgroup "DATA" cannot be mounted
ORA-15040: diskgroup is incomplete
ERROR: ALTER DISKGROUP DATA MOUNT  /* asm agent *//* {1:12056:279} */

kfed分析磁盘信息
kfed_lvm


报错比较明显asm disk磁盘头被lvm的信息取代(因为asm disk 被加入到vg中),根据前面的分析,该磁盘被写入数据很可能超过30G,使用kfed分析一个随意au,确认被破坏,证明开始判断基本正确

root@xff1:/home/oracle11g$kfed read /dev/asmdisk1 aun=10000
kfbh.endian:                         51 ; 0x000: 0x33
kfbh.hard:                           55 ; 0x001: 0x37
kfbh.type:                           32 ; 0x002: *** Unknown Enum ***
kfbh.datfmt:                         42 ; 0x003: 0x2a
kfbh.block.blk:              1329801248 ; 0x004: blk=1329801248
kfbh.block.obj:              1128615502 ; 0x008: file=347726
kfbh.check:                  1094999892 ; 0x00c: 0x41445f54
kfbh.fcn.base:                675103060 ; 0x010: 0x283d4154
kfbh.fcn.wrap:               1448232275 ; 0x014: 0x56524553
kfbh.spare1:                 1598374729 ; 0x018: 0x5f454349
kfbh.spare2:                 1162690894 ; 0x01c: 0x454d414e
7F7843EAD400 2A203733 4F432820 43454E4E 41445F54  [37 * (CONNECT_DA]
7F7843EAD410 283D4154 56524553 5F454349 454D414E  [TA=(SERVICE_NAME]
7F7843EAD420 6361723D 29626432 44494328 5250283D  [=rac2db)(CID=(PR]
7F7843EAD430 4152474F 3A443D4D 4341505C DFCF3153  [OGRAM=D:\PACS1..]
7F7843EAD440 B3BEB7BB 6369445C 65536D6F 72657672  [....\DicomServer]
7F7843EAD450 445C524D 6D6F6369 76726553 524D7265  [MR\DicomServerMR]
7F7843EAD460 6578652E 4F482829 573D5453 362D4E49  [.exe)(HOST=WIN-6]
7F7843EAD470 51414C38 54553645 28294A30 52455355  [8LAQE6UT0J)(USER]
7F7843EAD480 6D64413D 73696E69 74617274 2929726F  [=Administrator))]
7F7843EAD490 202A2029 44444128 53534552 5250283D  [) * (ADDRESS=(PR]
7F7843EAD4A0 434F544F 743D4C4F 28297063 54534F48  [OTOCOL=tcp)(HOST]
7F7843EAD4B0 2E30313D 2E303831 30332E31 4F502829  [=10.180.1.30)(PO]
7F7843EAD4C0 343D5452 37333539 2A202929 74736520  [RT=49537)) * est]
7F7843EAD4D0 696C6261 2A206873 63617220 20626432  [ablish * rac2db ]
7F7843EAD4E0 3231202A 0A343135 2D534E54 31353231  [* 12514.TNS-1251]
7F7843EAD4F0 54203A34 6C3A534E 65747369 2072656E  [4: TNS:listener ]
7F7843EAD500 73656F64 746F6E20 72756320 746E6572  [does not current]
7F7843EAD510 6B20796C 20776F6E 7320666F 69767265  [ly know of servi]
7F7843EAD520 72206563 65757165 64657473 206E6920  [ce requested in ]
7F7843EAD530 6E6E6F63 20746365 63736564 74706972  [connect descript]
………………
7F7843EAE300 6F636944 7265536D 4D726576 69445C52  [DicomServerMR\Di]
7F7843EAE310 536D6F63 65767265 2E524D72 29657865  [comServerMR.exe)]
7F7843EAE320 534F4828 49573D54 4F302D4E 314B304A  [(HOST=WIN-0OJ0K1]
7F7843EAE330 4955304E 55282954 3D524553 696D6441  [N0UIT)(USER=Admi]
7F7843EAE340 7473696E 6F746172 29292972 28202A20  [nistrator))) * (]
7F7843EAE350 52444441 3D535345 4F525028 4F434F54  [ADDRESS=(PROTOCO]
7F7843EAE360 63743D4C 48282970 3D54534F 312E3031  [L=tcp)(HOST=10.1]
7F7843EAE370 312E3038 2930332E 524F5028 35353D54  [80.1.30)(PORT=55]
7F7843EAE380 29383632 202A2029 61747365 73696C62  [268)) * establis]
7F7843EAE390 202A2068 32636172 2A206264 35323120  [h * rac2db * 125]
7F7843EAE3A0 540A3431 312D534E 34313532 4E54203A  [14.TNS-12514: TN]
7F7843EAE3B0 696C3A53 6E657473 64207265 2073656F  [S:listener does ]
7F7843EAE3C0 20746F6E 72727563 6C746E65 6E6B2079  [not currently kn]
7F7843EAE3D0 6F20776F 65732066 63697672 65722065  [ow of service re]
7F7843EAE3E0 73657571 20646574 63206E69 656E6E6F  [quested in conne]
7F7843EAE3F0 64207463 72637365 6F747069 34320A72  [ct descriptor.24]
KFED-00322: Invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][32]

通过上述kfed可以看到第10000 au的位置被写入的是数据库异常之后listener.log的信息(该数据库安装在/home目录中),进一步证明覆盖,通过以下信息证明sdg就是asmdisk1

[root@xff1 dev]# ls -l sdg
brw-rw---- 1 root disk 8,  96 Apr 25 00:05 sdg
[root@xff1 dev]# ls -l asmdisk1
brw-rw---- 1 grid asmadmin 8,  96 Apr 25 00:05 asmdisk1

基于现在的情况,data磁盘组是由三块 200G的磁盘组成,第一块磁盘被意外加入vg,并且写入数据大于30G,无法从asm层面直接通过kfed修复磁盘组,然后直接mount,只能通过oracle asm磁盘数据块重组技术(asm disk header 彻底损坏恢复)实现没有覆盖数据的恢复.
20200426202310


该客户运气还不错,通过仅剩的2019年12月份几天的不成功备份找出来所有的数据文件(无归档),然后强制拉库成功.通过碎片恢复的最新的数据文件数据结合2019年12月份备份,实现绝大部分业务数据恢复,最大限度减少客户损失.对于oracle rac数据库服务器磁盘操作需要谨慎.
如果不幸有类似oracle asm disk被破坏(格式化,dd部分,做成lv等),需要进行恢复支持,可以联系我们,做专业的恢复评估,最大限度,最快速度抢救数据,减少损失
Phone:17813235971    Q Q:107644445QQ咨询惜分飞    E-Mail:dba@xifenfei.com
恢复过部分asm被格式化案例:
又一例asm格式化文件系统恢复
一次完美的asm disk被格式化ntfs恢复
oracle asm disk格式化恢复—格式化为ext4文件系统
oracle asm disk格式化恢复—格式化为ntfs文件系统

aix平台tab$被删除可能出现ORA-600 [16703], [1403], [28]错误

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:aix平台tab$被删除可能出现ORA-600 [16703], [1403], [28]错误

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

以前在恢复过程中遇到过ORA-00600: internal error code, arguments: [16703], [1403], [28]错误(10g数据库遭遇ORA-600 16703)以为是因为10g版本的tab$记录被删除的原因导致报错和最常见的ORA-00600: internal error code, arguments: [16703], [1403], [20]不完全一致(警告:互联网中有oracle介质被注入恶意程序导致—ORA-600 16703),最近又遇到一个ORA-600 16703 1403 28错误的case,而且数据库版本是11.2.0.4 for aix平台,进一步说明该问题不是由于10g和11g的tab$被删除的区别导致,更多可能是由于操作系统不一样,数据库启动基表访问顺序不一致导致,特此进行说明.
数据库启动成功后报错

Wed Apr 01 22:36:19 2020
Completed: ALTER DATABASE OPEN /* db agent *//* {2:54387:2} */
Wed Apr 01 22:36:19 2020
Starting background process CJQ0
Wed Apr 01 22:36:19 2020
CJQ0 started with pid=53, OS id=7078224 
Wed Apr 01 22:36:21 2020
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl2/trace/orcl2_ora_7668220.trc  (incident=40337):
ORA-00600: internal error code, arguments: [kzrini:!uprofile], [], [], [], [], [], [], [], [], [], [], []
Incident details in: /u01/app/oracle/diag/rdbms/orcl/orcl2/incident/incdir_40337/orcl2_ora_7668220_i40337.trc
Wed Apr 01 22:36:21 2020
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl2/trace/orcl2_ora_6881718.trc  (incident=40369):
ORA-00600: internal error code, arguments: [kzrini:!uprofile], [], [], [], [], [], [], [], [], [], [], []
Incident details in: /u01/app/oracle/diag/rdbms/orcl/orcl2/incident/incdir_40369/orcl2_ora_6881718_i40369.trc
Setting Resource Manager plan SCHEDULER[0x32DB]:DEFAULT_MAINTENANCE_PLAN via scheduler window
Setting Resource Manager plan DEFAULT_MAINTENANCE_PLAN via parameter

Wed Apr 01 22:51:16 2020
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl2/trace/orcl2_smon_7078802.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-00957: duplicate column name
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl2/trace/orcl2_smon_7078802.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-00957: duplicate column name
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl2/trace/orcl2_smon_7078802.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-00957: duplicate column name

Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl2/trace/orcl2_m000_4850312.trc  (incident=48045):
ORA-00600: internal error code, arguments: [kdfReserveSingle_1], [0], [65280], [], [], [], [], [], [], [], [], []
Incident details in: /u01/app/oracle/diag/rdbms/orcl/orcl2/incident/incdir_48045/orcl2_m000_4850312_i48045.trc
Thu Apr 02 00:59:35 2020
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl2/trace/orcl2_smon_7078802.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-00957: duplicate column name
Thu Apr 02 00:59:36 2020
DDE: Problem Key 'ORA 600 [kzrini:!uprofile]' was completely flood controlled (0x6)
Further messages for this problem key will be suppressed for up to 10 minutes
Thu Apr 02 00:59:37 2020
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl2/trace/orcl2_m000_4850312.trc  (incident=48046):
ORA-00600: internal error code, arguments: [kewrose_1], [600], 
[ORA-00600: internal error code, arguments: [kdfReserveSingle_1], [0], [65280], [], [], [], []

数据库再次重启报错

Completed: ALTER DATABASE MOUNT /* db agent *//* {1:25340:2} */
ALTER DATABASE OPEN /* db agent *//* {1:25340:2} */
This instance was first to open
Picked broadcast on commit scheme to generate SCNs
Thu Apr 02 02:17:59 2020
Thread 2 opened at log sequence 13485
  Current log# 3 seq# 13485 mem# 0: +DATA/orcl/onlinelog/group_3.265.1003137665
  Current log# 3 seq# 13485 mem# 1: +FLASH/orcl/onlinelog/group_3.259.1003137677
Successful open of redo thread 2
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Thu Apr 02 02:17:59 2020
SMON: enabling cache recovery
Instance recovery: looking for dead threads
Instance recovery: lock domain invalid but no dead threads
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl2/trace/orcl2_ora_7799020.trc  (incident=48395):
ORA-00600: internal error code, arguments: [16703], [1403], [28], [], [], [], [], [], [], [], [], []
Incident details in: /u01/app/oracle/diag/rdbms/orcl/orcl2/incident/incdir_48395/orcl2_ora_7799020_i48395.trc
Thu Apr 02 02:18:01 2020
Thu Apr 02 02:18:01 2020
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl2/trace/orcl2_ora_7799020.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00600: internal error code, arguments: [16703], [1403], [28], [], [], [], [], [], [], [], [], []
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl2/trace/orcl2_ora_7799020.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00600: internal error code, arguments: [16703], [1403], [28], [], [], [], [], [], [], [], [], []
Error 704 happened during db open, shutting down database
USER (ospid: 7799020): terminating the instance due to error 704
Instance terminated by USER, pid = 7799020
ORA-1092 signalled during: ALTER DATABASE OPEN /* db agent *//* {1:25340:2} */...

对于此类问题,通过分析,确定也是由于DBMS_SUPPORT_DBMONITORP恶意脚本导致tab$记录被删除,导致数据库启动异常,处理方法基本上就是对tab$进行恢复,然后open数据库.

expdp dmp被加密破坏恢复

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:expdp dmp被加密破坏恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

有朋友oracle数据库dmp备份被加密,后缀为:.DMP.voyager,通过分析发现文件加密2M左右
20200320134933


这里可以看出来dmp文件为expdp方式导出(expdp本质上xml方式存储,exp使用直接二进制方式存储),通过工具分析可恢复表情况.
通过工具对该dmp文件进行分析

CPFL> OPEN F:\BaiduNetdisk\KINGDEE85GH_2020-03-17.DMP.voyager
TABLE_NAME                                         START_POS       DATA_BYTE     
-------------------------------------------------- --------------- --------------- 
KINGDEE85GH.T_WFD_PROCESSDEF                       116300288       648396035       
KINGDEE85GH.T_DYN_DYNAMICCONFIGURE                 864710656       181453794       
KINGDEE85GH.T_RPTS_STORAGEFILEDATA                 1078767616      21548951        
KINGDEE85GH.T_BOT_RULESEGMENT                      1100324864      10372516        
KINGDEE85GH.T_LOG_APP                              1110712320      12603573        
KINGDEE85GH.T_PM_PERMITEM                          1123336192      7282412         
KINGDEE85GH.T_PM_USERORGPERM                       1130635264      6692320         
KINGDEE85GH.T_DYN_APPSOLUTION                      1137336320      801697          
KINGDEE85GH.T_PM_MAINMENUITEM                      1138155520      3573943         
KINGDEE85GH.T_PM_PERMUIGROUP                       1141751808      2159245         
KINGDEE85GH.T_SYS_ENTITYREF                        1143922688      4183869         
KINGDEE85GH.T_PM_ROLEPERM                          1148116992      2758960         
KINGDEE85GH.T_BAS_SYSMENUITEM                      1150885888      3304627         
KINGDEE85GH.T_JP_PAGE                              1154211840      3019174  
…………      
KINGDEE85GH.T_XT_CHECKTIME                         1212776448      41              
KINGDEE85GH.T_XT_SYNCHTIME                         1212784640      41              
SYSTEM.SYS_EXPORT_SCHEMA_02                        1212792832      215423380    
-------------------------------------------------- --------------- --------------- 
Scanned Find 895 segments. 

通过这个基本上可以确定丢失了100多M数据,其他数据理论上可以恢复.
创建用户

SQL> create user KINGDEE85GHidentified by oracle;

User created.

SQL> grant dba to KINGDEE85GH;

Grant succeeded.

unexpdp数据(自动创建表和导入数据)

CPFL> unexpdp table KINGDEE85GH.T_WFD_PROCESSDEF

unexpdp table: KINGDEE85GH.T_WFD_PROCESSDEF storage(START_POSITION:116300288 DATA_BYTE:748396035)
824 rows unexpdp

查看恢复结果
20200320142445
20200320142034


如果你有oracle expdp dmp被加密或者破坏,无法正常导入数据库,可以联系我们对其进行恢复处理:提供(ORACLE数据库恢复技术支持):
Phone:17813235971    Q Q:107644445    E-Mail:dba@xifenfei.com
如果你的oracle dmp是exp方式导出,也可以联系我们对其进行处理,参见:
exp dmp文件损坏恢复
oracle dmp被加密恢复