asm disk被加入到另外一个磁盘组故障恢复

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:asm disk被加入到另外一个磁盘组故障恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

有朋友在aix环境对其中一个rac的asm磁盘组进行扩容
add_disk


之后另外一套rac的磁盘组直接dismount

Wed Aug 23 12:44:02 2023
NOTE: SMON starting instance recovery for group DATA domain 2 (mounted)
NOTE: F1X0 found on disk 0 au 2 fcn 0.128808679
NOTE: SMON skipping disk 7 - no header
NOTE: cache initiating offline of disk 7 group DATA
NOTE: process _smon_+asm1 (1770932) initiating offline of disk 7.3422955792 (DATA_0007) with mask 0x7e in group 2
NOTE: initiating PST update: grp = 2, dsk = 7/0xcc062910, mask = 0x6a, op = clear
Wed Aug 23 12:44:02 2023
GMON updating disk modes for group 2 at 7 for pid 17, osid 1770932
ERROR: Disk 7 cannot be offlined, since diskgroup has external redundancy.
ERROR: too many offline disks in PST (grp 2)
Wed Aug 23 12:44:02 2023
NOTE: cache dismounting (not clean) group 2/0x7FE6D808 (DATA) 
WARNING: Offline for disk DATA_0007 in mode 0x7f failed.
Wed Aug 23 12:44:02 2023
NOTE: halting all I/Os to diskgroup 2 (DATA)
ERROR: No disks with F1X0 found on disk group DATA
NOTE: aborting instance recovery of domain 2 due to diskgroup dismount
NOTE: SMON skipping lock domain (2) validation because diskgroup being dismounted
Abort recovery for domain 2
Wed Aug 23 12:44:02 2023
ERROR: ORA-15130 in COD recovery for diskgroup 2/0x7fe6d808 (DATA)
ERROR: ORA-15130 thrown in RBAL for group number 2
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_rbal_2360526.trc:
ORA-15130: diskgroup "DATA" is being dismounted
[

再次尝试mount该磁盘组,报ORA-15042和ORA-15038错误

SQL> alter diskgroup data mount 
NOTE: cache registered group DATA number=2 incarn=0x79e6d861
NOTE: cache began mount (first) of group DATA number=2 incarn=0x79e6d861
NOTE: Assigning number (2,0) to disk (/dev/rhdisk31)
NOTE: Assigning number (2,3) to disk (/dev/rhdisk33)
NOTE: Assigning number (2,4) to disk (/dev/rhdisk34)
NOTE: Assigning number (2,5) to disk (/dev/rhdisk35)
NOTE: Assigning number (2,6) to disk (/dev/rhdisk36)
NOTE: Assigning number (2,9) to disk (/dev/rhdisk39)
NOTE: Assigning number (2,1) to disk (/dev/rhdisk8)
NOTE: Assigning number (2,2) to disk (/dev/rhdisk9)
Wed Aug 23 12:58:46 2023
NOTE: GMON heartbeating for grp 2
GMON querying group 2 at 11 for pid 27, osid 3736034
NOTE: Assigning number (2,7) to disk ()
NOTE: Assigning number (2,8) to disk ()
GMON querying group 2 at 12 for pid 27, osid 3736034
NOTE: cache dismounting (clean) group 2/0x79E6D861 (DATA) 
NOTE: messaging CKPT to quiesce pins Unix process pid: 3736034, image: oracle@hbbz01 (TNS V1-V3)
NOTE: dbwr not being msg'd to dismount
NOTE: lgwr not being msg'd to dismount
NOTE: cache dismounted group 2/0x79E6D861 (DATA) 
NOTE: cache ending mount (fail) of group DATA number=2 incarn=0x79e6d861
NOTE: cache deleting context for group DATA 2/0x79e6d861
GMON dismounting group 2 at 13 for pid 27, osid 3736034
NOTE: Disk DATA_0000 in mode 0x7f marked for de-assignment
NOTE: Disk DATA_0001 in mode 0x7f marked for de-assignment
NOTE: Disk DATA_0002 in mode 0x7f marked for de-assignment
NOTE: Disk DATA_0003 in mode 0x7f marked for de-assignment
NOTE: Disk DATA_0004 in mode 0x7f marked for de-assignment
NOTE: Disk DATA_0005 in mode 0x7f marked for de-assignment
NOTE: Disk DATA_0006 in mode 0x7f marked for de-assignment
NOTE: Disk  in mode 0x7f marked for de-assignment
NOTE: Disk  in mode 0x7f marked for de-assignment
NOTE: Disk DATA_0009 in mode 0x7f marked for de-assignment
ERROR: diskgroup DATA was not mounted
ORA-15032: not all alterations performed
ORA-15040: diskgroup is incomplete
ORA-15042: ASM disk "8" is missing from group number "2" 
ORA-15042: ASM disk "7" is missing from group number "2" 
ORA-15038: disk '/dev/rhdisk37' mismatch on 'Time Stamp' with target disk group [2129689239] [2062898314]
ERROR: alter diskgroup data mount

怀疑把报错这个磁盘组的rhdisk37加入到另外一套rac的asm中了(也就是说两套asm使用了同一块磁盘),aix操作系统层面分析确认

---对asm扩容的机器上
# lscfg -vpl hdisk15
  hdisk15          U78C5.001.DQD076A-P2-C4-T1-W200C00A098BC9A83-L0  MPIO NetApp FCP Default PCM Disk

        Manufacturer................NETAPP  
        Machine Type and Model......LUN C-Mode      
        ROS Level and ID............9000
        Serial Number...............80DYz]L/OpCA
        Device Specific.(Z0)........FAS8020         


  PLATFORM SPECIFIC

  Name:  disk
    Node:  disk
    Device Type:  block

---磁盘组dismount的机器上
# lscfg -vpl hdisk37      
  hdisk37          U5802.001.9K87776-P1-C1-T1-W200500A098BC9A83-L0  MPIO NetApp FCP Default PCM Disk

        Manufacturer................NETAPP  
        Machine Type and Model......LUN C-Mode      
        ROS Level and ID............9000
        Serial Number...............80DYz]L/OpCA
        Device Specific.(Z0)........FAS8020         


  PLATFORM SPECIFIC

  Name:  disk
    Node:  disk
    Device Type:  block

通过lscfg 命令确认两套rac使用了同一块盘导致一个磁盘组异常,在新加的机器上查询确认新盘被破坏情况(新加入的磁盘由于reblance操作,已经被写入了380G左右数据[也就意味着这个磁盘在老磁盘组中最少会丢失380G数据]
20230905140603


对于这种情况,dismount磁盘组是外部冗余不可能直接mount起来,只能通过以前处理的类似方法:
asm disk header 彻底损坏恢复
asm磁盘加入vg恢复
asm磁盘dd破坏恢复
asm disk 磁盘部分被清空恢复
再一例asm disk被误加入vg并且扩容lv恢复
fdisk分区导致asm disk破坏数据库恢复
再一起asm disk被格式化成ext3文件系统故障恢复
oracle asm disk格式化恢复—格式化为ext4文件系统
oracle asm disk格式化恢复—格式化为ntfs文件系统
ORA-15063: ASM discovered an insufficient number of disks for diskgroup 恢复
通过底层处理恢复出来没有覆盖的数据块中数据
20230827200941

再使用dul恢复出来其中数据,完成这次故障的核心数据恢复

ORA-600 ksuloget2 恢复

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:ORA-600 ksuloget2 恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

客户在win 32位的操作系统上调至sga超过2G,数据库运行过程中报ORA-600 ksuloget2错误

Thread 1 cannot allocate new log, sequence 43586
Checkpoint not complete
  Current log# 1 seq# 43585 mem# 0: D:\ORACLE\ORADATA\ORCL\REDO01.LOG
Fri Aug 04 14:57:02 2023
Errors in file d:\oracle\diag\rdbms\orcl\orcl\trace\orcl_ora_42996.trc  (incident=67481):
ORA-00600: 内部错误代码, 参数: [ksuloget2], [0xFEBA6208], [0xFEBA3B08], [500], [0xFEBA622C], [], [], [], [], []
Thread 1 advanced to log sequence 43586 (LGWR switch)
  Current log# 2 seq# 43586 mem# 0: D:\ORACLE\ORADATA\ORCL\REDO02.LOG

重启数据库,进行尝试恢复继续报ORA-600 ksuloget2

Thu Aug 17 17:38:27 2023
ALTER DATABASE RECOVER  database using backup controlfile  
Media Recovery Start
 started logmerger process
Parallel Media Recovery started with 24 slaves
ORA-279 signalled during: ALTER DATABASE RECOVER  database using backup controlfile  ...
Thu Aug 17 17:39:01 2023
ALTER DATABASE RECOVER LOGFILE 'D:\oracle\flash_recovery_area\orcl\ARCHIVELOG\2023_08_04\REDO03.LOG'  
Media Recovery Log D:\oracle\flash_recovery_area\orcl\ARCHIVELOG\2023_08_04\REDO03.LOG
Thu Aug 17 17:39:01 2023
Errors in file d:\oracle\diag\rdbms\orcl\orcl\trace\orcl_pr00_5528.trc  (incident=110724):
ORA-00600: internal error code, arguments: [ksuloget2], [0xFEBA6E38], [0xFEBA3B08], [500], [0xFEBA6E5C], [], [], [], []
ORA-00342: archived log does not have expected resetlogs SCN 685171428
ORA-00334: archived log: 'D:\ORACLE\FLASH_RECOVERY_AREA\ORCL\ARCHIVELOG\2023_08_04\REDO03.LOG'
Errors in file d:\oracle\diag\rdbms\orcl\orcl\trace\orcl_pr00_5528.trc:
ORA-00600: internal error code, arguments: [ksuloget2], [0xFEBA6E38], [0xFEBA3B08], [500], [0xFEBA6E5C], [], [], [], []
ORA-00342: archived log does not have expected resetlogs SCN 685171428
ORA-00334: archived log: 'D:\ORACLE\FLASH_RECOVERY_AREA\ORCL\ARCHIVELOG\2023_08_04\REDO03.LOG'
Errors in file d:\oracle\diag\rdbms\orcl\orcl\trace\orcl_ora_5604.trc  (incident=110709):
ORA-00600: internal error code, arguments: [ksuloget2], [0xFEBA6E38], [0xFEBA3B08], [500], [0xFEBA6E5C], [], [], [], []
ORA-00342: archived log does not have expected resetlogs SCN 685171428
ORA-00334: archived log: 'D:\ORACLE\FLASH_RECOVERY_AREA\ORCL\ARCHIVELOG\2023_08_04\REDO03.LOG'
Incident details in: d:\oracle\diag\rdbms\orcl\orcl\incident\incdir_110709\orcl_ora_5604_i110709.trc
ORA-600 signalled during:ALTER DATABASE RECOVER LOGFILE 'D:\oracle\flash_recovery_area\orcl\2023_08_04\REDO03.LOG'
ALTER DATABASE RECOVER CANCEL 
Errors in file d:\oracle\diag\rdbms\orcl\orcl\trace\orcl_pr00_5528.trc  (incident=110725):
ORA-00600: internal error code, arguments: [ksuloget2], [0xFEBA6E38], [0xFEBA3B08], [500], [0xFEBA6E5C], [], [], [], []
Errors in file d:\oracle\diag\rdbms\orcl\orcl\trace\orcl_pr00_5528.trc  (incident=110726):
ORA-00600: internal error code, arguments: [ksuloget2], [0xFEBA6E38], [0xFEBA3B08], [500], [0xFEBA6E5C], [], [], [], []
ORA-00600: internal error code, arguments: [ksuloget2], [0xFEBA6E38], [0xFEBA3B08], [500], [0xFEBA6E5C], [], [], [], []
Errors in file d:\oracle\diag\rdbms\orcl\orcl\trace\orcl_pr00_5528.trc:
ORA-00600: internal error code, arguments: [ksuloget2], [0xFEBA6E38], [0xFEBA3B08], [500], [0xFEBA6E5C], [], [], [], []
ORA-00600: internal error code, arguments: [ksuloget2], [0xFEBA6E38], [0xFEBA3B08], [500], [0xFEBA6E5C], [], [], [], []
Errors in file d:\oracle\diag\rdbms\orcl\orcl\trace\orcl_pr00_5528.trc  (incident=110727):
ORA-00600: internal error code, arguments: [ksuloget2], [0xFEBA6E38], [0xFEBA3B08], [500], [0xFEBA6E5C], [], [], [], []
ORA-00600: internal error code, arguments: [ksuloget2], [0xFEBA6E38], [0xFEBA3B08], [500], [0xFEBA6E5C], [], [], [], []
ORA-00600: internal error code, arguments: [ksuloget2], [0xFEBA6E38], [0xFEBA3B08], [500], [0xFEBA6E5C], [], [], [], []
Errors in file d:\oracle\diag\rdbms\orcl\orcl\trace\orcl_pr00_5528.trc  (incident=110728):
ORA-00600: internal error code, arguments: [ksuloget2], [0xFEBA6E38], [0xFEBA3B08], [500], [0xFEBA6E5C], [], [], [], []
ORA-00600: internal error code, arguments: [ksuloget2], [0xFEBA6E38], [0xFEBA3B08], [500], [0xFEBA6E5C], [], [], [], []
ORA-00600: internal error code, arguments: [ksuloget2], [0xFEBA6E38], [0xFEBA3B08], [500], [0xFEBA6E5C], [], [], [], []
ORA-00600: internal error code, arguments: [ksuloget2], [0xFEBA6E38], [0xFEBA3B08], [500], [0xFEBA6E5C], [], [], [], []
Errors in file d:\oracle\diag\rdbms\orcl\orcl\trace\orcl_ora_5604.trc  (incident=110710):
ORA-00600: internal error code, arguments: [ksuloget2], [0xFEBA6E38], [0xFEBA3B08], [500], [0xFEBA6E5C], [], [], [], []
ORA-00600: internal error code, arguments: [ksuloget2], [0xFEBA6E38], [0xFEBA3B08], [500], [0xFEBA6E5C], [], [], [], []
ORA-00600: internal error code, arguments: [ksuloget2], [0xFEBA6E38], [0xFEBA3B08], [500], [0xFEBA6E5C], [], [], [], []
ORA-00600: internal error code, arguments: [ksuloget2], [0xFEBA6E38], [0xFEBA3B08], [500], [0xFEBA6E5C], [], [], [
Incident details in: d:\oracle\diag\rdbms\orcl\orcl\incident\incdir_110710\orcl_ora_5604_i110710.trc

由于是应用日志失败,屏蔽日志一致性,强制打开数据库,检查数据ok,业务可以直接使用,对于这类问题,官方建议:ORA-600: [Ksuloget2] Hit on Windows When SGA Greater Than 1G (Doc ID 836109.1)
20230819105750


ORA-10485: Real-Time Query cannot be enabled while applying migration redo.

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:ORA-10485: Real-Time Query cannot be enabled while applying migration redo.

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

对于数据库打psu和jvm patch之后,dg备库同步会出现类似ORA-10485: Real-Time Query cannot be enabled while applying migration redo.异常

Tue Aug 15 18:48:18 2023
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT  LOGFILE DISCONNECT FROM SESSION
Attempt to start background Managed Standby Recovery process (orcl)
Tue Aug 15 18:48:18 2023
MRP0 started with pid=33, OS id=15486
MRP0: Background Managed Standby Recovery process started (orcl)
 started logmerger process
Tue Aug 15 18:48:23 2023
Managed Standby Recovery starting Real Time Apply
Parallel Media Recovery started with 80 slaves
Waiting for all non-current ORLs to be archived...
All non-current ORLs have been archived.
Recovery of Online Redo Log: Thread 1 Group 10 Seq 106115 Reading mem 0
  Mem# 0: /u01/app/oracle/oradata/orcl/std_redo10.log
Completed: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT  LOGFILE DISCONNECT FROM SESSION
Tue Aug 15 18:49:50 2023
RFS[1]: Assigned to RFS process 16049
RFS[1]: Opened log for thread 1 sequence 106117 dbid 1490144467 branch 962363734
Tue Aug 15 18:49:50 2023
RFS[2]: Assigned to RFS process 16051
RFS[2]: Selected log 10 for thread 1 sequence 106115 dbid 1490144467 branch 962363734
Tue Aug 15 18:49:50 2023
RFS[3]: Assigned to RFS process 16053
RFS[3]: Opened log for thread 1 sequence 106116 dbid 1490144467 branch 962363734
Archived Log entry 106102 added for thread 1 sequence 106116 rlc 962363734 ID 0x58d223d3 dest 2:
RFS[3]: Opened log for thread 1 sequence 106118 dbid 1490144467 branch 962363734
RFS[2]: Opened log for thread 1 sequence 106119 dbid 1490144467 branch 962363734
Tue Aug 15 18:49:50 2023
Archived Log entry 106103 added for thread 1 sequence 106115 ID 0x58d223d3 dest 1:
Tue Aug 15 18:49:50 2023
Media Recovery Log /u01/app/oracle/fast_recovery_area/ORCLDG/archivelog/2023_08_15/o1_mf_1_106116_lfpp2ghc_.arc
Errors with log /u01/app/oracle/fast_recovery_area/ORCLDG/archivelog/2023_08_15/o1_mf_1_106116_lfpp2ghc_.arc
MRP0: Background Media Recovery terminated with error 10485
Errors in file /u01/app/oracle/diag/rdbms/orcldg/orcl/trace/orcl_pr00_15488.trc:
ORA-10485: Real-Time Query cannot be enabled while applying migration redo.
Managed Standby Recovery not using Real Time Apply
Recovery interrupted!
Tue Aug 15 18:49:50 2023
MRP0: Background Media Recovery process shutdown (orcl)

这种情况,是由于数据库在upgrade模式下产生日志无法正常被adg实时应用,我一般是临时rman应用备库日志解决,官方解决方案:MRP process getting terminated with error ORA-10485 (Doc ID 1618485.1)

11g CASE

---------

1> Stop DG broker (if used)
   i.e., on primary and standby

SQL> alter system set dg_broker_start = false scope = both sid = '*' ;
2> Stop managed recovery in the standby, shutdown the standby and startup mount.
    start managed recovery without real time apply.

SQL> alter database recover managed standby database disconnect ;
3> Wait until all the redo is applied to the standby and the standby is in sync.
     Do a couple of log switches on the primary, all instances if RAC, and let them apply to the standby.

4> Shutdown the standby and startup mount
   start managed recovery with real time apply.

SQL> alter database recover managed standby database using current logfile disconnect ; 
5> Restart broker(if used).

on primary and standby

SQL> alter system set dg_broker_start = true scope = both sid = '*' ;
 

12c CASE

----------

In 12c and later, start Archived log apply using below command with ARCHIVED LOGFILE option:

SQL> alter database recover managed standby database using archived logfile disconnect;
Wait until all the redo is applied to the standby and the standby is in sync. 
  Do a couple of log switches on the primary, all instances if RAC, and let them apply to the standby.

To stop Redo Apply, Issue the following SQL statement:
SQL>  alter database recover managed standby database cancel; 

start managed recovery with real time apply, Issue the following SQL statement

SQL> alter database recover managed standby database disconnect;

Patch SCN一键解决ORA-600 2662故障

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:Patch SCN一键解决ORA-600 2662故障

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

客户强制重启库之后,数据库启动报ORA-600 2037,ORA-745 kcbs_reset_pool/kcbzre1等错误

Wed Aug 09 13:25:38 2023
alter database mount exclusive
Successful mount of redo thread 1, with mount id 1672229586
Database mounted in Exclusive Mode
Lost write protection disabled
Completed: alter database mount exclusive
alter database open
Beginning crash recovery of 1 threads
 parallel recovery started with 15 processes
Started redo scan
Completed redo scan
 read 84074 KB redo, 37037 data blocks need recovery
Wed Aug 09 13:25:49 2023
Started redo application at
 Thread 1: logseq 998542, block 61480
Recovery of Online Redo Log: Thread 1 Group 1 Seq 998542 Reading mem 0
  Mem# 0: E:\APP\ADMINISTRATOR\ORADATA\ORCL\REDO01.LOG
Recovery of Online Redo Log: Thread 1 Group 2 Seq 998543 Reading mem 0
  Mem# 0: E:\APP\ADMINISTRATOR\ORADATA\ORCL\REDO02.LOG
Wed Aug 09 13:25:49 2023
Exception [type: ACCESS_VIOLATION, UNABLE_TO_READ] [ADDR:0xFFFFFFFFFFFFFFFF] [PC:0x1350B61, kcbs_reset_pool()+3135]
Wed Aug 09 13:25:49 2023
Errors in file e:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_p011_2508.trc  (incident=1203440):
ORA-00600: internal error code, arguments: [2037], [1981399269], [3236233216], [37], [193], [11], [2842064409], [100728832], [], [], [], 
Wed Aug 09 13:25:49 2023
Errors in file e:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_p002_2472.trc  (incident=1203368):
ORA-00600: internal error code, arguments: [2037], [1882432643], [2760048640], [77], [31], [11], [2915397683], [100731648], [], [], [], 
ERROR: Unable to normalize symbol name for the following short stack (at offset 199):
dbgexProcessError()+193<-dbgeExecuteForError()+65<-dbgePostErrorKGE()+1726<-dbkePostKGE_kgsf()+75<-kgeade()+560<-kgerev()+125
<-kgerec5()+60<-sss_xcpt_EvalFilterEx()+1869<-sss_xcpt_EvalFilter()+174<-.1.4_5+59<-00007FF85F46C92F<-00007FF85F47D82D
<-00007FF85F42916B<-00007FF85F47C9EE<-kcbs_reset_pool()+3135<-kcbs_resize_pool()+105<-kcbw_deferred_complete()+87
<-kmgs_deferred_initialize()+674<-ksbabs()+1086<-ksbrdp()+1286<-opirip()+853<-opidrv()+909<-sou2o()+98
<-opimai_real()+299<-opimai()+191<-BackgroundThreadStart()+693<-00007FF85DBA16AD<-00007FF85F454629
Wed Aug 09 13:25:49 2023
Errors in file e:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_mman_1192.trc  (incident=1203256):
ORA-07445: exception encountered:core dump [kcbs_reset_pool()+3135] [ACCESS_VIOLATION] [ADDR:0xFFFFFFFFFFFFFFFF] [PC:0x1350B61] [UNABLE_TO_READ]
Incident details in: e:\app\administrator\diag\rdbms\orcl\orcl\incident\incdir_1203256\orcl_mman_1192_i1203256.trc
Exception [type: ACCESS_VIOLATION, UNABLE_TO_READ] [ADDR:0xFFFFFFFFFFFFFFFF] [PC:0xECE0BF, kcbzre1()+2811]
Wed Aug 09 13:25:49 2023
Errors in file e:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_p004_2480.trc  (incident=1203384):
ORA-00600: internal error code, arguments: [2037], [12693108], [2926837760], [72], [217], [11], [2837053633], [33622528], [], [], [], []
Exception [type: ACCESS_VIOLATION, UNABLE_TO_READ] [ADDR:0xFFFFFFFFFFFFFFFF] [PC:0xECE0BF, kcbzre1()+2811]
Errors in file e:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_p007_2492.trc  (incident=1203409):
ORA-07445: exception encountered:core dump [kcbzre1()+2811] [ACCESS_VIOLATION] [ADDR:0xFFFFFFFFFFFFFFFF] [PC:0xECE0BF] [UNABLE_TO_READ]
ORA-00600: internal error code, arguments: [2037], [12693108], [2926837760], [72], [217], [11], [2837053633], [33622528], [], [], [], []
Incident details in: e:\app\administrator\diag\rdbms\orcl\orcl\incident\incdir_1203385\orcl_p004_2480_i1203385.trc
Exception [type: ACCESS_VIOLATION, UNABLE_TO_READ] [ADDR:0xFFFFFFFFFFFFFFFF] [PC:0xECE0BF, kcbzre1()+2811]
Exception [type: ACCESS_VIOLATION, UNABLE_TO_READ] [ADDR:0xFFFFFFFFFFFFFFFF] [PC:0xECE0BF, kcbzre1()+2811]
Errors in file e:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_p001_2468.trc  (incident=1203361):
ORA-07445: exception encountered: core dump [kcbzre1()+2811] [ACCESS_VIOLATION] [ADDR:0xFFFFFFFFFFFFFFFF] [PC:0xECE0BF] [UNABLE_TO_READ] 
ORA-00600: internal error code, arguments: [2037], [130229293], [606928896], [227], [99], [11], [4149544899], [100740356], [], [], [], 

屏蔽数据库一致性,强制拉库报ORA-600 2662错误

Successful open of redo thread 1
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
SMON: enabling cache recovery
Errors in file e:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_ora_2076.trc  (incident=1315352):
ORA-00600: 内部错误代码, 参数: [2662], [11], [1088067894], [11], [1088145382], [12583040], [], [], [], [], [], []
Incident details in: e:\app\administrator\diag\rdbms\orcl\orcl\incident\incdir_1315352\orcl_ora_2076_i1315352.trc
Trace dumping is performing id=[cdmp_20230809170852]
Errors in file e:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_ora_2076.trc:
ORA-00600: 内部错误代码, 参数: [2662], [11], [1088067894], [11], [1088145382], [12583040], [], [], [], [], [], []
Errors in file e:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_ora_2076.trc:
ORA-00600: 内部错误代码, 参数: [2662], [11], [1088067894], [11], [1088145382], [12583040], [], [], [], [], [], []
Error 600 happened during db open, shutting down database
USER (ospid: 2076): terminating the instance due to error 600
Instance terminated by USER, pid = 2076
ORA-1092 signalled during: ALTER DATABASE OPEN...

这种ORA-600 2662的错误比较常见,通过Patch SCN工具一键式解决,参考以前恢复案例:
Patch SCN工具快速解决ORA-600 2662问题
patch_scn


实现数据库open成功,并顺利导出数据

amdu参数详解

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:amdu参数详解

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

最近发现amdu命令比以前认知中的强大,记录下相关参数

[oracle@xifenfei ~]$ amdu help=y
a/usize         AU size for corrupt disks
-ausize <bytes>: This option must be set when -baddisks is set. It
    must be a power of 2. This size is required to scan a disk looking
    for metadata, and it is normally read from the disk header. The
    value applies to all disks that do not have a valid header. The
    value from the disk header will be used if a valid header is
    found.

ba/ddisks               Include disks with bad headers
-baddisks <diskgroup>:  Normally disks with bad disk headers, or that
    look like they were never part of a disk group, will not be
    scanned. This option forces them to be scanned anyway and to be
    considered part of the given diskgroup. This is most useful when
    a disk header has been damaged. The disk will still need to have
    a valid allocation table to drive the scan unless -fullscan is
    used. In any case at least one block in the first two AUs must be
    valid so that the disk number can be determined. The options
    -ausize and -blksize are required since these values are normally
    fetched from the disk header. If the diskgroup uses external
    redundancy then -external should be specified. These values will
    be compared against any valid disks found in the diskgroup and
    they must be the same.

bl/ksize                ASM block size for corrupt disks
-blksize <bytes>: This option must be set when -baddisks is set. It
    must be a power of 2. This size is required to scan a disk looking
    for metadata, and it is normally read from the disk header. The
    value applies to all disks that do not have a valid header. The
    value from the disk header will be used if a valid header is
    found.

c/ompare                Compare file mirrors
-compare: This option only applies to file extraction from a normal or
    high redundancy disk group. Every extent that is mirrored on more
    than one discovered disk will have all sides of its mirror
    compared. If they are not identical a message will be reported
    on standard error and the report file. The message will indicate
    which copy was extracted. A count of the blocks that are not
    identical will be in the report file.

dir/ectory              Directory from previous dump
-directory <string>: This option completely eliminates the discovery
    phase of operation. It specifies the name of a dump directory from
    a previous run of AMDU. The report file and map files are read
    instead of doing a discovery and scan. The parsing of these ASCII
    files is very dependent on them being exactly as written by AMDU.
    AMDU is unlikely to work properly if they have been modified by
    a text editor, or if some of the files are missing or truncated.
    Note that the directory may be a copy FTPed from another
    machine. The other machine may even be a different platform
    with a different endianess.

dis/kstring             Diskstring for discovery
 -diskstring <string>: By default the null string is used for
    discovery. The null string should discover all disks the user has
    access to. Many installations specify an asm_diskstring parameter
    for their ASM instance. If so that parameter value should be given
    here. Multiple discovery strings can be specified by multiple
    occurrences of -diskstring <string>. Beware of shell syntax
    conflicts with discovery strings. Diskstrings are usually the same
    syntax the shell uses for expanding path names on command lines so
    they will most likely need to be enclosed in single quotes.

du/mp           Diskgroups to dump
-dump <diskgroup>: This option specifies the name of a diskgroup to
    have its metadata dumped. This option may be specified multiple
    times to dump multiple diskgroups. If the diskgroup name is ALL
    then all diskgroups encountered will be dumped. The diskgroup name
    is not case sensitive, but will be converted to uppercase for all
    reports. If this option is not specified then no map or image
    files will be created, but -extract and -print may still work.

exc/lude                Disks to exclude
-exclude <string>: Multiple exclude options may be specified. These
    strings are used for discovery just like the values for diskstring.
    Only shallow discovery is done on these diskstrings. Any disks
    found in the exclude discovery will not be accessed. If they are
    also discovered using the -diskstring strings, then the report will
    include the information from shallow discovery along with a message
    indicating the disk was excluded.

exte/rnal               Assume external redundancy
-external: Normally AMDU determines the diskgroup redundancy from the
    disk headers. However this is not possible with the -baddisks
    option. It is assumed that the redundancy of the -baddisks
    diskgroup is normal or high unless this option is given to specify
    external redundancy.

extr/act                Files to extract
-extract <diskgroup>.<file_number>: This extracts the numbered file
    from the named diskgroup, case insensitive. This option may be
    specified multiple times to extract multiple files. The extracted
    file is placed in the dump directory under the name
    <diskgroup>_<number>.f  where <diskgroup> is the diskgroup name
    in uppercase, and <number> is the file number. The -output option
    may be used to write the file to any location. The extracted file
    will appear to have the same contents it would have if accessed
    through the database. If some portion of the file is unavailable
    then that portion of the output file will be filled with
    0xBADFDA7A, and a message will appear on stderr.

fi/ledump               Dump files rather than extract
-filedump: This option causes the file objects in the command line to
    have their blocks dumped to the image files rather than extracted.
    This can be combined with the -novirtual option to selectively
    dump only some of the metadata files. It may also be used to dump
    user files (number >= 256) so that all mirrored copies can be
    examined.

fo/rmer         Include dropped disks
-former: Normally disks marked as former are not scanned, but this
    option will scan them and include their contents in the output.
    This is useful when it is necessary to look at the contents of a
    disk that was dropped. Note that dropped normal disks will not have
    any entries in their allocation tables and thus only the physically
    addressed extents will be dumped. Force dropped disks will not have
    status former in their disk headers and are not affected by this
    option. However if DROP DISKGROUP is used, the disks will have the
    contents as of the time of the drop, and will be in status former.
    Thus this option is useful for extracting files from a dropped
    diskgroup.

fu/llscan               Scan entire disk
-fullscan: This option reads every AU on the disk and looks at the
    contents of the AU rather than limiting the AU's read based on the
    allocation table. This is useful when the allocation table is
    corrupt or needs recovery. An AU will be written to the image file
    if it starts with a block that contains a valid ASM block header.
    The file and extent information for the map will be extracted from
    the block header. Physically addressed metadata will be dumped
    regardless of its contents. This option is incompatible with
    extracting a file. It is an error to specify -extract with this
    option. Note that this option is likely to find old garbage
    metadata in unallocated AU's since there is no means of
    determining what is allocated. Thus there may be many different
    copies of the same block, possibly of different versions.

h/ex            Always print block contents in hex
-hex: This prints the block contents in hex without attempting to print
    them as ASM metadata. This is useful when the block is known to not
    be ASM metadata. It avoids the ASM block header dump and ensures
    the block is not accidentally interpreted as ASM metadata. This
    option requires at least one -print option.

noa/cd          Do not dump ACD
-noacd: This option limits the dumping of the Active Change Directory
    to just the control blocks that contain the checkpoint. There is
    126 MB of ACD per ASM instance (42 MB for external redundancy). It
    is normally of no interest if there has been a clean shutdown or
    no updates for a while. This option avoids dumping a lot of
    unimportant data. The blocks will still be read and checked for
    corruption. The map file will still contain entries for the ACD
    extents, but the block counts will be zero.

nod/ir          Do not create a dump directory
-nodir: No dump directory is created, and no files are created in it.
    The directory name is not written to standard out. The report file
    is written to standard out before any block printouts from any
    -print options.  This option conflicts with -filedump. It is an
    error to specify this and extract a file to the dump directory.

noe/xtract              Do not create extracted file
-noextract: This prevents files from being extracted to an output
    file, but the file will be read and any errors in selecting the
    correct output will be reported. This is most useful in
    combination with the -compare option.

noh/eart                Do not check for heartbeat
-noheart: Normally the heartbeat block will be saved at discovery time
    and checked when the disk is scanned. A sleep is added between
    discovery and scanning to ensure there is time for the heartbeat
    to be written. If the heartbeat block changes then it is most
    likely that the diskgroup containing this disk is mounted by an
    active ASM instance. An error and warning is generated but
    operation proceeds normally. This option suppresses this check
    and avoids the sleep.

noi/mage                Do not create image files
-noimage: No image files will be created n the dump directory. All
    the reads specified by the read options will still be done. The
    map files may be used to find blocks on the disks themselves. In
    the map file, the count of blocks dumped, the image file sequence
    number, and the byte offset in the image file will all always be
    zero (C00000 S0000 B0000000000)

nom/ap          Do not create map or image files
-nomap: No map file is created and no image file is created. The only
    output is the report file. The -noimage option is assumed if this
    is set since an image file without a map is useless. The options
    -noscan and -noread also result in no map or image files, but
    -nomap still reads the metadata to check for I/O errors and corrupt
    blocks.

nop/rint                Do no print block contents
-noprint: This suppresses the printout of the block contents for
    blocks printed with the -print option. It is useful for getting
    just the block reports without a lot of data. This option requires
    at least one -print option.

norea/d         Shallow discovery only
-noread: This eliminates any reading of any disks at all. Only shallow
    discovery will be done. The report will end after the discovery
    section. It is an error to specify this option and specify a file
    to extract or blocks to print. It is an error to specify this
    and -fullscan.

norep/ort               Do not generate a report
-noreport: This suppresses the generation of the report file. It is
    most useful in combination with -nodir and -print to get block
    printouts without a lot of clutter. It is unnecessary to include
    this with -directory since no report is generated then anyway.

nosc/an         Deep discovery only
-noscan: This eliminates any reading of any disks after deep
    discovery. This results in just doing a deep discovery using the
    disksting parameter. The report will end after the discovery
    section. It is an error to specify this option and specify a file
    to extract. It is an error to specify this and -fullscan.

nosu/bdir               Do not create a dump directory
-nosubdir: No dump directory is created, but files are still created.
    The directory name is not written to standard out. The report file
    and any other dump or extract  files are written to the current
    directory or to the directory indicated by -parentdir. This means
    that if multiple AMDU dumps are requested using this option, the
    report file will always correspond to the last dump requested.

nov/irtual              Do not dump virtual metadata
-novirtual: This option eliminates reading of any virtual metadata.
    Only the physically addressed metadata will be read. This
    implicitly eliminates the ACD and extent maps so -noacd and
    -noxmap will be assumed.

nox/map         Do not dump extent maps
-noxmap: This option eliminates reading of the indirect extents
    containing the file extent maps. This is the bulk of the metadata
    in most diskgroups. Even the entries in the map file will be
    eliminated.

o/utput         Files to create for extract
-output <file_name>: This option specifies a different file for
    writing an extracted file. The file will be overwritten if it
    already exists. This option requires that exactly one file is
    extracted via the -extract option.

pa/rent         Parent for dump directory
-parent <path_name>: By default the dump directory is created in the
    current directory, but another directory can be specified using
    this option. The parent directory for the dump directory must
    already exist.

pr/int          Block to print
-print <block_spec>: This option prints one or more blocks to standard
    out. This option may be specified multiple times to print multiple
    <block_spec>s. The printout contains information about how each
    block was read as well as a formatted printout. Multiple blocks
    matching the same <block_spec> may be found when scanning the
    disks. For example there may be multiple disks that have headers
    for the same diskgroup and disk number. If the block is from a
    mirrored file then multiple copies should exist on different disks.
    If multiple copies of the same block have identical contents then
    only one formatted printout of the contents will be generated, but
    a header will be printed for each copy. A <block_spec> may include
    a count of sequential blocks to print. A <block_spec> may specify
    a block either by disk or file.
   <block_spec> ::= <single_block> | <single_block>.C<count>
   <single_block> ::= <report_disk_block> | <group_disk_block> |
        <extent_file_block> | <virtual_file_block> | <xmap_file_block>
   <report_disk_block> ::=
        <group_name>.N<report_number>.A<au_number>.B<block_number>
   <group_disk_block> ::=
         <group_name>.D<disk_number>.A<au_number>.B<block_number>
   <extent_file_block> ::=
         <group_name>.F<file_number>.X<physical_extent>.B<block_number>
   <virtual_file_block> ::= 
         <group_name>.F<file_number>.V<virtual_block_number>
   <xmap_file_block> ::=
         <group_name>.F<file_number>.M<extent_map_block_number>

r/egistry               Dump registry files
-registry: The ASM registries will be read and dumped to the image
    file. There will be no block consistency checks since these files
    do not have ASM cache headers. To dump one specific registry
    specify -filedump and include the file object for the registry
    (e.g. DATA.255).

s/pfile         Extract usable spfile
-spfile: This causes extract to render the resulting file in a form   
    that is directly usable by startup. Without this option, AMDU   
    will extract the file as a regular ASM file including all ASM   
    specific headers and such

ORA-600 kghstack_underflow_internal_2

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:ORA-600 kghstack_underflow_internal_2

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

aix平台运行11.2.0.4 rac,突然一个节点crash,lms2进程报ORA-600 kghstack_underflow_internal_2错误

Thu Aug 03 18:43:16 2023
Errors in file /u01/oracle/app/oracle/diag/rdbms/xff/xff2/trace/xff2_lms2_2884404.trc  (incident=761244):
ORA-00600: internal error code, arguments: [kghstack_underflow_internal_2], [0x11074D658], [], [], [], [], [], [], [], [], [], []
Incident details in: /u01/oracle/app/oracle/diag/rdbms/xff/xff2/incident/incdir_761244/xff2_lms2_2884404_i761244.trc
Errors in file /u01/oracle/app/oracle/diag/rdbms/xff/xff2/trace/xff2_lms2_2884404.trc  (incident=761245):
ORA-00600: internal error code, arguments: [kghstack_underflow_internal_2], [0x11AB5BBF0], [], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [kghstack_underflow_internal_2], [0x11074D658], [], [], [], [], [], [], [], [], [], []
Incident details in: /u01/oracle/app/oracle/diag/rdbms/xff/xff2/incident/incdir_761245/xff2_lms2_2884404_i761245.trc
Thu Aug 03 18:43:19 2023
Dumping diagnostic data in directory=[cdmp_20230803184319], requested by (instance=2, osid=2884404 (LMS2)), summary=[incident=761245].
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Thu Aug 03 18:43:23 2023
Sweep [inc][761245]: completed
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file /u01/oracle/app/oracle/diag/rdbms/xff/xff2/trace/xff2_lms2_2884404.trc:
ORA-00600: internal error code, arguments: [kghstack_underflow_internal_2], [0x11074D658], [], [], [], [], [], [], [], [], [], []
Sweep [inc][761244]: completed
Sweep [inc2][761245]: completed
Sweep [inc2][761244]: completed
Thu Aug 03 18:43:29 2023
Errors in file /u01/oracle/app/oracle/diag/rdbms/xff/xff2/trace/xff2_lms2_2884404.trc:
ORA-00600: internal error code, arguments: [kghstack_underflow_internal_2], [0x11074D658], [], [], [], [], [], [], [], [], [], []
LMS2 (ospid: 2884404): terminating the instance due to error 484

分析trace文件中的Call Stack Trace信息

----- Call Stack Trace -----
calling              call     entry                argument values in hex      
location             type     point                (? means dubious value)     
-------------------- -------- -------------------- ----------------------------
skdstdst()+40        bl       0000000109B3EE38     000000000 ? 000000001 ?
                                                   000000003 ? 000000000 ?
                                                   000000000 ? 000000001 ?
                                                   000000003 ? 000000000 ?
ksedst1()+112        call     skdstdst()           1777D9901C4FD34D ?
                                                   4840284100000000 ?
                                                   FFFFFFFFFFECE20 ?
                                                   2A501377F67A7 ? 10A742204 ?
                                                   000000000 ? 1107486C0 ?
                                                   2050033FFFECE28 ?
ksedst()+40          call     ksedst1()            FFFFFFFFFFFE0002 ?
                                                   0000060F1 ? 000000001 ?
                                                   10A46AD18 ? 000000000 ?
                                                   000000000 ? 000002004 ?
                                                   000000001 ?
dbkedDefDump()+1516  call     ksedst()             000000000 ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   000000000 ? 300000003 ?
ksedmp()+72          call     dbkedDefDump()       3107486C0 ? 110000A28 ?
                                                   FFFFFFFFFFED630 ? 1106ABC70 ?
                                                   100125778 ? FFFFFFFFFFED5B0 ?
                                                   FFFFFFFFFFEDA30 ? 1106ABC70 ?
ksfdmp()+100         call     ksedmp()             000000002 ? 000000000 ?
                                                   000000002 ? 10AF71A68 ?
                                                   10A0720F8 ? 000000000 ?
                                                   1108EC608 ? 1107486C0 ?
dbgexPhaseII()+1904  call     ksfdmp()             FFFFFFFFFFFE0002 ?
                                                   0000060F1 ? 000000002 ?
                                                   000000000 ? 000000002 ?
                                                   10A0720F0 ? 000000000 ?
                                                   001050005 ?
dbgexProcessError()  call     dbgexPhaseII()       1107486C0 ? 1108EFB28 ?
+1556                                              0000B9D9D ? 200000000 ?
                                                   FFFFFFFFFFEE548 ? 000000104 ?
                                                   FFFFFFFFFFEDBB0 ?
                                                   FB400000000 ?
dbgeExecuteForError  call     dbgexProcessError()  1107486C0 ? 1108EC608 ?
()+72                                              100000000 ? 000000000 ?
                                                   FFFFFFFFFFF29E0 ?
                                                   2840288000000012 ?
                                                   10013DA4C ? 1108EE350 ?
dbgePostErrorKGE()+  call     dbgeExecuteForError  000000002 ? 000000128 ?
2044                          ()                   FFFFFFFFFFFE0002 ?
                                                   215265335E5162 ?
                                                   3726000000000001 ?
                                                   10A46AD18 ? 10A46CB00 ?
                                                   FFFFFFFFFFF1D30 ?
dbkePostKGE_kgsf()+  call     dbgePostErrorKGE()   000000001 ? 10A46AD18 ?
68                                                 25800000000 ? 109E7A740 ?
                                                   000000000 ? 000000038 ?
                                                   FFFFFFFFFFF2800 ? 11AB1AC50 ?
kgeadse()+380        call     dbkePostKGE_kgsf()   900000000512C74 ?
                                                   9001000A008DAD0 ? 000000000 ?
                                                   9001000A008DAD0 ?
                                                   8000000FFFF2C40 ?
                                                   7000147E8F28C98 ? 400000008 ?
                                                   1100054A0 ?
kgerinv_internal()+  call     kgeadse()            7FFFFFFFFFFFFFFF ?
48                                                 FFFFFFFFFFFEF8FF ?
                                                   000000019 ? 110476528 ?
                                                   000000001 ? 000000017 ?
                                                   00000000B ? 000000000 ?
kgerinv()+48         call     kgerinv_internal()   FFFFFFFFFFFEF8FF ?
                                                   FFFFFFFFFFFFFFFF ?
                                                   FFFFFFFFFFFFFFFF ?
                                                   7FFFFFFFFFFFFFFF ?
                                                   1001648E0 ? FFFFFFFFFFF25E0 ?
                                                   1106ABC70 ? 11073B3C0 ?
kgeasnmierr()+72     call     kgerinv()            000000000 ? 215265335E5162 ?
                                                   372600383A0F5000 ?
                                                   000000004 ? 10A328F7C ?
                                                   FFFFFFFFFFF2898 ? 000000002 ?
                                                   0FFFFFFFF ?
kghstack_underflow_  call     kgeasnmierr()        11AB967A0 ? 000000000 ?
internal()+280                                     FFFFFFFFFFF2860 ? 100000001 ?
                                                   000000002 ? 11AB5BBF0 ?
                                                   000000000 ? 11AB96778 ?
kghstack_free()+716  call     kghstack_underflow_  10A328F7C ? 110A2FEC0 ?
                              internal()           000000004 ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   000000080 ? 80000000000000 ?
ktudda()+912         call     kghstack_free()      11AB5BBF0 ? 7215265335E5162 ?
                                                   3726000000000008 ?
                                                   000000102 ? 109E747E0 ?
                                                   FFFFFFFFFFF2A90 ? 000000048 ?
                                                   28408880FFFFFFFF ?
kcbtdu()+1636        call     ktudda()             70001383A0F4014 ? 000000000 ?
                                                   1FE800000000 ? 07F7F7F7F ?
                                                   FFFFFFFF80808080 ?
                                                   000000000 ? 000000030 ?
                                                   FFFFFFFFFFF2B30 ?
kcbzdh()+3200        call     kcbtdu()             35900000359 ? 100000001 ?
                                                   000000001 ? 200000001 ?
                                                   000000001 ? 00000005D ?
                                                   200066665D20 ? 000000000 ?
kcbzpnd()+504        call     kcbzdh()             70001383F6D64B8 ? 000002004 ?
                                                   2107486C0 ? 10A74269E ?
                                                   1107486C0 ? FFFFFFFFFFF3B30 ?
                                                   FFFFFFFFFFF38E0 ? 000000000 ?
kcbdnb()+724         call     kcbzpnd()            10A74267C ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   000000000 ? 0001CE860 ?
                                                   000000000 ? 000000000 ?
dbkedDefDump()+5528  call     kcbdnb()             200000000 ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   1100224D0 ? 000000018 ?
                                                   110001366 ? 000000000 ?
ksedmp()+72          call     dbkedDefDump()       3107486C0 ? 110000A28 ?
                                                   FFFFFFFFFFF3FC0 ? 1106ABC70 ?
                                                   100125778 ? 000000000 ?
                                                   FFFFFFFFFFF3FB0 ? 1106ABC70 ?
ksfdmp()+100         call     ksedmp()             000000002 ? 000000000 ?
                                                   000000002 ? 10AF71A68 ?
                                                   10A0720F8 ? 000000000 ?
                                                   1109DE650 ? 1107486C0 ?
dbgexPhaseII()+1904  call     ksfdmp()             11074B65C ? 000000001 ?
                                                   000000002 ? 000000000 ?
                                                   000000002 ? 10A0720F0 ?
                                                   000000000 ? 001050005 ?
dbgexProcessError()  call     dbgexPhaseII()       1107486C0 ? 1109DC860 ?
+1556                                              0000B9D9C ? 200000000 ?
                                                   FFFFFFFFFFF4ED8 ? 000000082 ?
                                                   FFFFFFFFFFF4560 ?
                                                   88A4422A00000000 ?
dbgeExecuteForError  call     dbgexProcessError()  1107486C0 ? 1109DE650 ?
()+72                                              100000000 ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   0DFFFFFFF ? 1109E0398 ?
dbgePostErrorKGE()+  call     dbgeExecuteForError  00000000A ? 000000000 ?
2044                          ()                   000000001 ? 000000001 ?
                                                   000000000 ? 000000000 ?
                                                   FFFFFFFFFFFB4E0 ? 000000000 ?
dbkePostKGE_kgsf()+  call     dbgePostErrorKGE()   000000000 ? FFFFFFFFFFF96B0 ?
68                                                 2580000000A ? 109E7A740 ?
                                                   000000000 ? 000000000 ?
                                                   FFFFFFFFFFF9190 ? 11AB1AC50 ?
kgeadse()+380        call     dbkePostKGE_kgsf()   000000001 ? 000000008 ?
                                                   000000000 ? 10A30EA38 ?
                                                   110000C20 ? 700014771160D68 ?
                                                   700014772ADB3A8 ? 000000001 ?
kgerinv_internal()+  call     kgeadse()            000000003 ? 000000000 ?
48                                                 11074B65C ? 000000001 ?
                                                   000000000 ? FFFFFFFFFFF96B0 ?
                                                   00000000A ? 000000001 ?
kgerinv()+48         call     kgerinv_internal()   000000000 ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   000000000 ? 000000000 ?
kgeasnmierr()+72     call     kgerinv()            000000000 ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   FFFFFFFFFFF92B0 ?
                                                   48102840FFFFA5B0 ?
                                                   11AB5BBB8 ? 11074D658 ?
kghstack_underflow_  call     kgeasnmierr()        022028200 ? 022202820 ?
internal()+280                                     11AB5BBB8 ? 100000001 ?
                                                   000000002 ? 11074D658 ?
                                                   0442C2394 ? 000002000 ?
kghstack_free()+716  call     kghstack_underflow_  FFFFFFFFFFF92B0 ?
                              internal()           FFFFFFFFFFF95B8 ?
                                                   FFFFFFFFFFF92B0 ? 000000001 ?
                                                   FFFFFFFFFFF92B0 ?
                                                   FFFFFFFFFFF95E8 ?
                                                   FFFFFFFFFFF95B8 ? 11074B650 ?
ktundo()+924         call     kghstack_free()      0DEADBEEF ? 11074D668 ?
                                                   11074B654 ? 300000000 ?
                                                   1FFFFB4E0 ? FFFFFFFFFFFB4E0 ?
                                                   FFFFFFFFFFF94C0 ?
                                                   FFFFFFFFFFF9470 ?
kturCRBackoutOneChg  call     ktundo()             19FFFFB5E0 ?
()+848                                             494CEDB3FFFF9E50 ?
                                                   FFFFFFFFFFF9E48 ? 000000000 ?
                                                   000000000 ? FFFFFFFFFFFA5B0 ?
                                                   100000000 ? FFFFFFFFFFFB4E0 ?
ktrgcm()+5816        call     kturCRBackoutOneChg  FFFFFFFFFFFA5B0 ?
                              ()                   19FFFFA440 ?
                                                   FFFFFFFFFFFA5B8 ? 000000000 ?
                                                   1FFFFA478 ? FFFFFFFFFFFB4E0 ?
                                                   000000000 ? 000000000 ?
ktrget3()+832        call     ktrgcm()             FFFFFFFFFFFAC80 ? 000000000 ?
                                                   000000000 ? 000000003 ?
                                                   058F7501F ? 000000001 ?
                                                   000000004 ? 000000003 ?
ktrget2()+104        call     ktrget3()            000000002 ? 700000000014488 ?
                                                   7000147E9C41A50 ? 000000022 ?
                                                   110A123A0 ? 000000000 ?
                                                   FFFFFFFFFFFB080 ? 110A123B8 ?
kclgeneratecr()+654  call     ktrget2()            FFFFFFFFFFFB4D0 ? 110AA1610 ?
0                                                  14F11E4E00 ? 0F11E4E00 ?
                                                   357FED028 ? 000030000 ?
                                                   7000147E9C41A50 ?
                                                   700000000014488 ?
kclgcr()+812         call     kclgeneratecr()      11A209508 ? FFFFFFFFFFFBFC0 ?
                                                   FFFFFFFFFFFBC18 ? 000000000 ?
                                                   0FFFFBB10 ? 01A275AC8 ?
                                                   1761D7F302ED25AC ?
                                                   20000011A275AC8 ?
kclcrrf()+536        call     kclgcr()             FFFFFFFFFFFBC20 ?
                                                   FFFFFFFFFFFBD00 ? 101F5080C ?
                                                   000000000 ? 0000003E8 ?
                                                   000000028 ? 0000000C8 ?
                                                   FFFFFFFFFFFBF88 ?
kjblcrcbk()+896      call     kclcrrf()            000000001 ? 000000000 ?
                                                   7000147EB0F07B8 ?
                                                   7000147576C4471 ?
                                                   401472C30C7F0 ?
                                                   7000147576C4408 ?
                                                   7000147576C3190 ?
                                                   7000147576C7170 ?
kjblpcr()+304        call     kjblcrcbk()          FFFFFFFFFFFBDA8 ? 000000038 ?
                                                   7000147FABBDB48 ? 600000006 ?
                                                   000000016 ? 11A209468 ?
                                                   000000013 ? 0001C2153 ?
kjbmpbast()+1792     call     kjblpcr()            000000012 ? 000000168 ?
                                                   000000002 ? 70001109FDB8148 ?
                                                   357000000000357 ?
                                                   7000144F31F7750 ?
                                                   895000000000895 ? 000000000 ?
kjmxmpm()+760        call     kjbmpbast()          1000000000000 ? 80000001E ?
                                                   000000000 ? 11A2951C8 ?
                                                   C000000000 ? 000000000 ?
                                                   1000000000000 ? 000000000 ?
kjmpbmsg()+3508      call     kjmxmpm()            000000000 ? 11A3769E0 ?
                                                   FFFFFFFFFFFC380 ? 06DBFBAEF ?
                                                   101E13820 ? 11A3769E0 ?
                                                   7000147E339AE08 ?
                                                   FFFFFFFFFFFC210 ?
kjmsm()+13416        call     kjmpbmsg()           11A209448 ? 7000147E339AE08 ?
                                                   100000019 ? 100000000 ?
                                                   000000000 ? 000000000 ?
                                                   000000000 ? 7000000000168FD ?
ksbrdp()+2216        call     kjmsm()              7000000000168E0 ?
                                                   7000000000168FC ? 048244028 ?
                                                   000000E00 ? 1108B69F0 ?
                                                   100637768 ? 000000001 ?
                                                   700000007 ?
opirip()+1620        call     ksbrdp()             FFFFFFFFFFFFE22 ? 10AFA5FC8 ?
                                                   FFFFFFFFFFFDC10 ? 000000000 ?
                                                   000000001 ? 000000000 ?
                                                   01380038F ? 000000001 ?
opidrv()+608         call     opirip()             10AFA23B0 ? 410134118 ?
                                                   FFFFFFFFFFFED80 ?
                                                   2F7530312F ? 108A7E8C4 ?
                                                   1106ABC70 ?
                                                   652F70726F647563 ?
                                                   1106ABC70 ?
sou2o()+136          call     opidrv()             3208A885B0 ? 400000000 ?
                                                   FFFFFFFFFFFED80 ?
                                                   23001801CD0000 ? 000000010 ?
                                                   1106ABC70 ? 000000000 ?
                                                   000000000 ?
opimai_real()+188    call     sou2o()              FFFFFFFFFFFEDF0 ?
                                                   4424444B00000001 ?
                                                   9000000000D73CC ?
                                                   BADC0FFEE0DDF00D ?
                                                   000000003 ? 9001000A008DAD0 ?
                                                   A0000000A000000 ? 10B6A8F30 ?
ssthrdmain()+276     call     opimai_real()        9001000A0011A60 ?
                                                   FFFFFFFFFFFF148 ?
                                                   FFFFFFFFFFFEEF0 ? 10B6E9280 ?
                                                   90000000008582C ?
                                                   9001000A008DAD0 ?
                                                   FFFFFFFFFFFEED0 ?
                                                   9001000A008DAD0 ?
main()+204           call     ssthrdmain()         3F0003660 ? FFFFFFFFFFFF238 ?
                                                   FFFFFFFFFFFF2A0 ?
                                                   9FFFFFFF000D658 ?
                                                   9FFFFFFF00009A0 ? 000000000 ?
                                                   000000000 ? 9FFFFFFF000D658 ?
__start()+112        call     main()               000000000 ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   000000000 ? 000000000 ?
 

--------------------- Binary Stack Dump ---------------------

查询mos对比相关信息,参考: LMON or LMS Process Crashes Instance With ORA-600 [kghstack_underflow_internal_2] (Doc ID 2003278.1)信息

The LMON or LMS process crash the instance with an error like:

ORA-00600: internal error code, arguments: [kghstack_underflow_internal_2], [0x110A10838], [], [], [], [], [], [], [], [], [], []

ORA-1092 : opitsk aborting process
Instance terminated by LMS1, pid = 14024818
 

Review of the generated tracefiles reveals a call stack similar to:

... kghstack_underflow_internal kghstack_free kccgrd kjxgrf_rr_read kjxgrDD_rr_read kjxgrimember kjxggpoll kjfmact kjfdact kjfcln ksbrdp ...
  
- OR -
  
... kghstack_underflow_internal kghstack_free ktundo kturcrbackoutonechg ktrgcm ktrget3 ktrget2 kclgcr ...

确认为Bug 18687067 – ORA-600 [KGHSTACK_UNDERFLOW_INTERNAL_2] closed as duplicate of Bug 20675347 – ORA-07445 [KGHSTACK_OVERFLOW_INTERNAL()+644](The bug is caused by an AIX compiler issue causing volatile variables in the Oracle kernel not to be handled properly.),解决方案升级数据库到12.1及其以上版本或者打上patch 20675347

WRH$_LATCH, WRH$_SYSSTAT, WRH$_PARAMETER对象较大

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:WRH$_LATCH, WRH$_SYSSTAT, WRH$_PARAMETER对象较大

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

通过awrinfo查看发现sysaux中以下对象大小属于top N

**********************************
(3b) Space usage within AWR Components (> 500K)
**********************************

COMPONENT        MB SEGMENT_NAME - % SPACE_USED                                           SEGMENT_TYPE
--------- --------- --------------------------------------------------------------------- ---------------
FIXED         136.0 WRH$_PARAMETER_PK.WRH$_PARAME_1600597976_0                    -  68%  INDEX PARTITION
FIXED         128.0 WRH$_LATCH.WRH$_LATCH_1600597976_0                            -  98%  TABLE PARTITION
FIXED         104.0 WRH$_PARAMETER.WRH$_PARAME_1600597976_0                       -  97%  TABLE PARTITION
FIXED          88.0 WRH$_SYSSTAT_PK.WRH$_SYSSTA_1600597976_0                      -  99%  INDEX PARTITION
FIXED          88.0 WRH$_SYSSTAT.WRH$_SYSSTA_1600597976_0                         -  90%  TABLE PARTITION
FIXED          80.0 WRH$_LATCH_PK.WRH$_LATCH_1600597976_0                         -  99%  INDEX PARTITION

查新mos发现类似文档:WRH$_LATCH, WRH$_SYSSTAT, and WRH$_PARAMETER Consume the Majority of Space within SYSAUX (Doc ID 2099998.1)
对应的bug为:Bug 14084247 – ORA-1555 or ORA-12571 Failed AWR purge can lead to continued SYSAUX space use (Doc ID 14084247.8)
处理操作

SQL> SELECT COUNT(1) HOW_MANY
  2  FROM sys.WRH$_PARAMETER a
  3  WHERE NOT EXISTS
  4  (SELECT 1
  5  FROM sys.wrm$_snapshot
  6  WHERE snap_id = a.snap_id
  7  AND dbid = a.dbid
  8  AND instance_number = a.instance_number
  9  );

  HOW_MANY
----------
   2406788

SQL> DELETE FROM sys.WRH$_LATCH a
  2  WHERE NOT EXISTS
  3  (SELECT 1
  4  FROM sys.wrm$_snapshot b
  5  WHERE b.snap_id = a.snap_id
  6  AND dbid=(SELECT dbid FROM v$database)
  7  AND b.dbid = a.dbid
  8  AND b.instance_number = a.instance_number);

已删除2411808行。

SQL>
SQL> DELETE FROM sys.WRH$_SYSSTAT a
  2  WHERE NOT EXISTS
  3  (SELECT 1
  4  FROM sys.wrm$_snapshot b
  5  WHERE b.snap_id = a.snap_id
  6  AND dbid=(SELECT dbid FROM v$database)
  7  AND b.dbid = a.dbid
  8  AND b.instance_number = a.instance_number);

已删除2747472行。

SQL>
SQL> DELETE FROM sys.WRH$_PARAMETER a
  2  WHERE NOT EXISTS
  3  (SELECT 1
  4  FROM sys.wrm$_snapshot b
  5  WHERE b.snap_id = a.snap_id
  6  AND dbid=(SELECT dbid FROM v$database)
  7  AND b.dbid = a.dbid
  8  AND b.instance_number = a.instance_number);

已删除2406788行。

SQL>
SQL> COMMIT;

提交完成。

SQL> ALTER TABLE WRH$_LATCH ENABLE ROW MOVEMENT;

表已更改。

SQL> ALTER TABLE WRH$_LATCH SHRINK SPACE COMPACT;

表已更改。

SQL> ALTER TABLE WRH$_LATCH SHRINK SPACE;

表已更改。

SQL> ALTER TABLE WRH$_LATCH SHRINK SPACE CASCADE;

表已更改。

SQL>
SQL> ALTER TABLE WRH$_PARAMETER ENABLE ROW MOVEMENT;

表已更改。

SQL> ALTER TABLE WRH$_PARAMETER SHRINK SPACE COMPACT;

表已更改。

SQL> ALTER TABLE WRH$_PARAMETER SHRINK SPACE;

表已更改。

SQL> ALTER TABLE WRH$_PARAMETER SHRINK SPACE CASCADE;

表已更改。

SQL> ALTER TABLE WRH$_SYSSTAT ENABLE ROW MOVEMENT;

表已更改。

SQL> ALTER TABLE WRH$_SYSSTAT SHRINK SPACE COMPACT;

表已更改。

SQL> ALTER TABLE WRH$_SYSSTAT SHRINK SPACE;

表已更改。

SQL> ALTER TABLE WRH$_SYSSTAT SHRINK SPACE CASCADE;

表已更改。

SQL> ALTER TABLE WRH$_SYSSTAT disable ROW MOVEMENT;

表已更改。

SQL> ALTER TABLE WRH$_PARAMETER disable ROW MOVEMENT;

表已更改。

SQL> ALTER TABLE WRH$_LATCH disable ROW MOVEMENT;

表已更改。

再次查看这些TOP对象消失

**********************************
(3b) Space usage within AWR Components (> 500K)
**********************************

COMPONENT        MB SEGMENT_NAME - % SPACE_USED                                           SEGMENT_TYPE
--------- --------- --------------------------------------------------------------------- ---------------
FIXED          56.0 WRH$_SERVICE_STAT_PK.WRH$_SERVIC_1600597976_0                 -  64%  INDEX PARTITION
FIXED          29.0 WRH$_SERVICE_STAT.WRH$_SERVIC_1600597976_0                    -  95%  TABLE PARTITION
FIXED          26.0 WRH$_ROWCACHE_SUMMARY.WRH$_ROWCAC_1600597976_0                -  96%  TABLE PARTITION
FIXED          21.0 WRH$_MVPARAMETER.WRH$_MVPARA_1600597976_0                     -  95%  TABLE PARTITION
FIXED          17.0 WRH$_ROWCACHE_SUMMARY_PK.WRH$_ROWCAC_1600597976_0             -  98%  INDEX PARTITION
FIXED          17.0 WRH$_MVPARAMETER_PK.WRH$_MVPARA_1600597976_0                  -  97%  INDEX PARTITION
FIXED          12.0 WRH$_SYSMETRIC_HISTORY                                        -  45%  TABLE

Oracle 19C 报ORA-704 ORA-01555故障处理

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:Oracle 19C 报ORA-704 ORA-01555故障处理

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

异常断电导致数据库无法启动,尝试对数据文件进行recover操作,报ORA-00283 ORA-00742 ORA-00312错误,由于redo写丢失无法正常应用

D:\check_db>sqlplus / as sysdba

SQL*Plus: Release 19.0.0.0.0 - Production on 星期日 7月 30 07:49:19 2023
Version 19.3.0.0.0

Copyright (c) 1982, 2019, Oracle.  All rights reserved.


连接到:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.3.0.0.0

SQL> recover datafile 1;
ORA-00283: 恢复会话因错误而取消
ORA-00742: 日志读取在线程 1 序列 9274 块 18057 中检测到写入丢失情况
ORA-00312: 联机日志 1 线程 1: 'D:\APP\ADMINISTRATOR\ORADATA\XFF\REDO01.LOG'

屏蔽数据一致性,尝试强制打开库,报ORA-00604,ORA-00704,ORA-01555错误

SQL> alter database open resetlogs;
alter database open resetlogs
*
第 1 行出现错误:
ORA-00603: ORACLE server session terminated by fatal error
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 1
ORA-01555: snapshot too old: rollback segment number 9 with name
"_SYSSMU9_4165470211$" too small
进程 ID: 4036
会话 ID: 2277 序列号: 40707

alert日志对应错误

2023-07-30T06:54:43.457383+08:00
.... (PID:5836): Clearing online redo logfile 1 complete
.... (PID:5836): Clearing online redo logfile 2 complete
.... (PID:5836): Clearing online redo logfile 3 complete
Resetting resetlogs activation ID 3572089731 (0xd4e9c383)
Online log D:\APP\ADMINISTRATOR\ORADATA\XFF\REDO01.LOG: Thread 1 Group 1 was previously cleared
Online log D:\APP\ADMINISTRATOR\ORADATA\XFF\REDO02.LOG: Thread 1 Group 2 was previously cleared
Online log D:\APP\ADMINISTRATOR\ORADATA\XFF\REDO03.LOG: Thread 1 Group 3 was previously cleared
2023-07-30T06:54:43.863676+08:00
Setting recovery target incarnation to 2
2023-07-30T06:54:44.816771+08:00
Ping without log force is disabled:
  instance mounted in exclusive mode.
Endian type of dictionary set to little
2023-07-30T06:54:44.957395+08:00
Assigning activation ID 3664275149 (0xda6866cd)
2023-07-30T06:54:44.957395+08:00
TT00 (PID:4640): Gap Manager starting
2023-07-30T06:54:45.004305+08:00
Redo log for group 1, sequence 1 is not located on DAX storage
2023-07-30T06:54:46.176153+08:00
Thread 1 opened at log sequence 1
  Current log# 1 seq# 1 mem# 0: D:\APP\ADMINISTRATOR\ORADATA\XFF\REDO01.LOG
Successful open of redo thread 1
2023-07-30T06:54:46.191771+08:00
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
stopping change tracking
2023-07-30T06:54:46.223036+08:00
TT03 (PID:1816): Sleep 5 seconds and then try to clear SRLs in 2 time(s)
2023-07-30T06:54:46.332398+08:00
ORA-01555 caused by SQL statement below (SQL ID: 4krwuz0ctqxdt, SCN: 
0x0000000017b852a7
):
2023-07-30T06:54:46.332398+08:00
select ctime, mtime, stime from obj$ where obj# = :1
2023-07-30T06:54:46.332398+08:00
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\xff\xff\trace\xff_ora_5836.trc:
ORA-00704: 引导程序进程失败
ORA-00604: 递归 SQL 级别 1 出现错误
ORA-01555: 快照过旧: 回退段号 9 (名称为 "_SYSSMU9_4165470211$") 过小
2023-07-30T06:54:46.332398+08:00
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\xff\xff\trace\xff_ora_5836.trc:
ORA-00704: 引导程序进程失败
ORA-00704: 引导程序进程失败
ORA-00604: 递归 SQL 级别 1 出现错误
ORA-01555: 快照过旧: 回退段号 9 (名称为 "_SYSSMU9_4165470211$") 过小
2023-07-30T06:54:46.348028+08:00
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\xff\xff\trace\xff_ora_5836.trc:
ORA-00704: 引导程序进程失败
ORA-00704: 引导程序进程失败
ORA-00604: 递归 SQL 级别 1 出现错误
ORA-01555: 快照过旧: 回退段号 9 (名称为 "_SYSSMU9_4165470211$") 过小
Error 704 happened during db open, shutting down database
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\xff\xff\trace\xff_ora_5836.trc  (incident=474502):
ORA-00603: ORACLE 服务器会话因致命错误而终止
ORA-01092: ORACLE 实例终止。强制断开连接
ORA-00704: 引导程序进程失败
ORA-00704: 引导程序进程失败
ORA-00604: 递归 SQL 级别 1 出现错误
ORA-01555: 快照过旧: 回退段号 9 (名称为 "_SYSSMU9_4165470211$") 过小
Incident details in: D:\APP\ADMINISTRATOR\diag\rdbms\xff\xff\incident\incdir_474502\xff_ora_5836_i474502.trc
2023-07-30T06:54:47.785549+08:00
opiodr aborting process unknown ospid (5836) as a result of ORA-603
2023-07-30T06:54:47.816792+08:00
ORA-603 : opitsk aborting process
License high water mark = 6
USER (ospid: (prelim)): terminating the instance due to ORA error 

这类错误比较常见,参考以前类似恢复:
在数据库open过程中常遇到ORA-01555汇总
数据库open过程遭遇ORA-1555对应sql语句补充
Oracle Recovery Tools恢复—ORA-00704 ORA-01555故障
使用_allow_resetlogs_corruption导致ORA-00704/ORA-01555故障
对于本次故障,通过Oracle Recovery Tools工具快速处理
patch


open数据库成功

SQL> alter database open;

数据库已更改。

SQL>
SQL>
SQL> select status,count(1) from v$datafile group by status;

STATUS           COUNT(1)
-------------- ----------
SYSTEM                  1
ONLINE                 61

Exadata磁盘损坏导致磁盘组无法mount恢复(oracle一体机磁盘组异常恢复)

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:Exadata磁盘损坏导致磁盘组无法mount恢复(oracle一体机磁盘组异常恢复)

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

Oracle Exadata客户,在换盘过程中,cell节点又一块磁盘损坏,导致datac1磁盘组(该磁盘组是normal方式冗余)无法mount

Thu Jul 20 22:01:21 2023
SQL> alter diskgroup datac1 mount force 
NOTE: cache registered group DATAC1 number=1 incarn=0x0728ad12
NOTE: cache began mount (first) of group DATAC1 number=1 incarn=0x0728ad12
NOTE: Assigning number (1,35) to disk (o/192.168.10.9;192.168.10.10/DATAC1_CD_11_dm01celadm03)
NOTE: Assigning number (1,31) to disk (o/192.168.10.9;192.168.10.10/DATAC1_CD_07_dm01celadm03)
NOTE: Assigning number (1,24) to disk (o/192.168.10.9;192.168.10.10/DATAC1_CD_00_dm01celadm03)
NOTE: Assigning number (1,25) to disk (o/192.168.10.9;192.168.10.10/DATAC1_CD_01_dm01celadm03)
NOTE: Assigning number (1,27) to disk (o/192.168.10.9;192.168.10.10/DATAC1_CD_03_dm01celadm03)
NOTE: Assigning number (1,33) to disk (o/192.168.10.9;192.168.10.10/DATAC1_CD_09_dm01celadm03)
NOTE: Assigning number (1,30) to disk (o/192.168.10.9;192.168.10.10/DATAC1_CD_06_dm01celadm03)
NOTE: Assigning number (1,28) to disk (o/192.168.10.9;192.168.10.10/DATAC1_CD_04_dm01celadm03)
NOTE: Assigning number (1,26) to disk (o/192.168.10.9;192.168.10.10/DATAC1_CD_02_dm01celadm03)
NOTE: Assigning number (1,1) to disk (o/192.168.10.9;192.168.10.10/DATAC1_CD_08_dm01celadm03)
NOTE: Assigning number (1,34) to disk (o/192.168.10.9;192.168.10.10/DATAC1_CD_10_dm01celadm03)
NOTE: Assigning number (1,29) to disk (o/192.168.10.9;192.168.10.10/DATAC1_CD_05_dm01celadm03)
NOTE: Assigning number (1,3) to disk (o/192.168.10.7;192.168.10.8/DATAC1_CD_07_dm01celadm02)
NOTE: Assigning number (1,4) to disk (o/192.168.10.7;192.168.10.8/DATAC1_CD_06_dm01celadm02)
NOTE: Assigning number (1,5) to disk (o/192.168.10.7;192.168.10.8/DATAC1_CD_00_dm01celadm02)
NOTE: Assigning number (1,6) to disk (o/192.168.10.7;192.168.10.8/DATAC1_CD_10_dm01celadm02)
NOTE: Assigning number (1,7) to disk (o/192.168.10.7;192.168.10.8/DATAC1_CD_08_dm01celadm02)
NOTE: Assigning number (1,8) to disk (o/192.168.10.7;192.168.10.8/DATAC1_CD_03_dm01celadm02)
NOTE: Assigning number (1,9) to disk (o/192.168.10.7;192.168.10.8/DATAC1_CD_11_dm01celadm02)
NOTE: Assigning number (1,10) to disk (o/192.168.10.7;192.168.10.8/DATAC1_CD_01_dm01celadm02)
NOTE: Assigning number (1,11) to disk (o/192.168.10.7;192.168.10.8/DATAC1_CD_04_dm01celadm02)
NOTE: Assigning number (1,21) to disk (o/192.168.10.7;192.168.10.8/DATAC1_CD_05_dm01celadm02)
NOTE: Assigning number (1,43) to disk (o/192.168.10.7;192.168.10.8/DATAC1_CD_02_dm01celadm02)
NOTE: Assigning number (1,36) to disk (o/192.168.10.5;192.168.10.6/DATAC1_CD_07_dm01celadm01)
NOTE: Assigning number (1,37) to disk (o/192.168.10.5;192.168.10.6/DATAC1_CD_09_dm01celadm01)
NOTE: Assigning number (1,38) to disk (o/192.168.10.5;192.168.10.6/DATAC1_CD_11_dm01celadm01)
NOTE: Assigning number (1,0) to disk (o/192.168.10.5;192.168.10.6/DATAC1_CD_08_dm01celadm01)
NOTE: Assigning number (1,40) to disk (o/192.168.10.5;192.168.10.6/DATAC1_CD_00_dm01celadm01)
NOTE: Assigning number (1,41) to disk (o/192.168.10.5;192.168.10.6/DATAC1_CD_03_dm01celadm01)
NOTE: Assigning number (1,42) to disk (o/192.168.10.5;192.168.10.6/DATAC1_CD_06_dm01celadm01)
NOTE: Assigning number (1,44) to disk (o/192.168.10.5;192.168.10.6/DATAC1_CD_05_dm01celadm01)
NOTE: Assigning number (1,45) to disk (o/192.168.10.5;192.168.10.6/DATAC1_CD_01_dm01celadm01)
NOTE: Assigning number (1,46) to disk (o/192.168.10.5;192.168.10.6/DATAC1_CD_02_dm01celadm01)
NOTE: Assigning number (1,47) to disk (o/192.168.10.5;192.168.10.6/DATAC1_CD_10_dm01celadm01)
NOTE: Assigning number (1,2) to disk (o/192.168.10.5;192.168.10.6/DATAC1_CD_04_dm01celadm01)
Thu Jul 20 22:01:28 2023
NOTE: GMON heartbeating for grp 1
GMON querying group 1 at 450 for pid 30, osid 171838
NOTE: Assigning number (1,32) to disk ()
NOTE: Assigning number (1,39) to disk ()
GMON querying group 1 at 451 for pid 30, osid 171838
NOTE: cache closing disk 32 of grp 1: (not open) 
NOTE: process _user171838_+asm1 (171838) 
     initiating offline of disk 39.3915945266 () with mask 0x7e[0x7f] in group 1
NOTE: initiating PST update: grp = 1, dsk = 39/0xe9689532, mask = 0x6a, op = clear
GMON updating disk modes for group 1 at 452 for pid 30, osid 171838
NOTE: cache closing disk 32 of grp 1: (not open) 
ERROR: Disk 39 cannot be offlined, since all the disks [39, 32] with mirrored data would be offline.
ERROR: too many offline disks in PST (grp 1)
WARNING: Offline for disk  in mode 0x7f failed.
NOTE: cache dismounting (not clean) group 1/0x0728AD12 (DATAC1) 
NOTE: messaging CKPT to quiesce pins Unix process pid: 171838, image: oracle@dm01dbadm01.gyzq.cn (TNS V1-V3)
NOTE: dbwr not being msg'd to dismount
NOTE: lgwr not being msg'd to dismount
NOTE: cache dismounted group 1/0x0728AD12 (DATAC1) 
NOTE: cache ending mount (fail) of group DATAC1 number=1 incarn=0x0728ad12
NOTE: cache deleting context for group DATAC1 1/0x0728ad12
NOTE: cache closing disk 32 of grp 1: (not open) 
GMON dismounting group 1 at 453 for pid 30, osid 171838
NOTE: Disk DATAC1_CD_08_DM01CELADM01 in mode 0x7f marked for de-assignment
NOTE: Disk DATAC1_CD_08_DM01CELADM03 in mode 0x7f marked for de-assignment
NOTE: Disk DATAC1_CD_04_DM01CELADM01 in mode 0x7f marked for de-assignment
NOTE: Disk DATAC1_CD_07_DM01CELADM02 in mode 0x7f marked for de-assignment
NOTE: Disk DATAC1_CD_06_DM01CELADM02 in mode 0x7f marked for de-assignment
NOTE: Disk DATAC1_CD_00_DM01CELADM02 in mode 0x7f marked for de-assignment
NOTE: Disk DATAC1_CD_10_DM01CELADM02 in mode 0x7f marked for de-assignment
NOTE: Disk DATAC1_CD_08_DM01CELADM02 in mode 0x7f marked for de-assignment
NOTE: Disk DATAC1_CD_03_DM01CELADM02 in mode 0x7f marked for de-assignment
NOTE: Disk DATAC1_CD_11_DM01CELADM02 in mode 0x7f marked for de-assignment
NOTE: Disk DATAC1_CD_01_DM01CELADM02 in mode 0x7f marked for de-assignment
NOTE: Disk DATAC1_CD_04_DM01CELADM02 in mode 0x7f marked for de-assignment
NOTE: Disk DATAC1_CD_05_DM01CELADM02 in mode 0x7f marked for de-assignment
NOTE: Disk DATAC1_CD_00_DM01CELADM03 in mode 0x7f marked for de-assignment
NOTE: Disk DATAC1_CD_01_DM01CELADM03 in mode 0x7f marked for de-assignment
NOTE: Disk DATAC1_CD_02_DM01CELADM03 in mode 0x7f marked for de-assignment
NOTE: Disk DATAC1_CD_03_DM01CELADM03 in mode 0x7f marked for de-assignment
NOTE: Disk DATAC1_CD_04_DM01CELADM03 in mode 0x7f marked for de-assignment
NOTE: Disk DATAC1_CD_05_DM01CELADM03 in mode 0x7f marked for de-assignment
NOTE: Disk DATAC1_CD_06_DM01CELADM03 in mode 0x7f marked for de-assignment
NOTE: Disk DATAC1_CD_07_DM01CELADM03 in mode 0x7f marked for de-assignment
NOTE: Disk  in mode 0x1 marked for de-assignment
NOTE: Disk DATAC1_CD_09_DM01CELADM03 in mode 0x7f marked for de-assignment
NOTE: Disk DATAC1_CD_10_DM01CELADM03 in mode 0x7f marked for de-assignment
NOTE: Disk DATAC1_CD_11_DM01CELADM03 in mode 0x7f marked for de-assignment
NOTE: Disk DATAC1_CD_07_DM01CELADM01 in mode 0x7f marked for de-assignment
NOTE: Disk DATAC1_CD_09_DM01CELADM01 in mode 0x7f marked for de-assignment
NOTE: Disk DATAC1_CD_11_DM01CELADM01 in mode 0x7f marked for de-assignment
NOTE: Disk  in mode 0x7f marked for de-assignment
NOTE: Disk DATAC1_CD_00_DM01CELADM01 in mode 0x7f marked for de-assignment
NOTE: Disk DATAC1_CD_03_DM01CELADM01 in mode 0x7f marked for de-assignment
NOTE: Disk DATAC1_CD_06_DM01CELADM01 in mode 0x7f marked for de-assignment
NOTE: Disk DATAC1_CD_02_DM01CELADM02 in mode 0x7f marked for de-assignment
NOTE: Disk DATAC1_CD_05_DM01CELADM01 in mode 0x7f marked for de-assignment
NOTE: Disk DATAC1_CD_01_DM01CELADM01 in mode 0x7f marked for de-assignment
NOTE: Disk DATAC1_CD_02_DM01CELADM01 in mode 0x7f marked for de-assignment
NOTE: Disk DATAC1_CD_10_DM01CELADM01 in mode 0x7f marked for de-assignment
ERROR: diskgroup DATAC1 was not mounted
ORA-15032: not all alterations performed
ORA-15040: diskgroup is incomplete
ORA-15066: offlining disk "39" in group "DATAC1" may result in a data loss
ORA-15042: ASM disk "39" is missing from group number "1" 
ORA-15042: ASM disk "32" is missing from group number "1" 
ERROR: alter diskgroup datac1 mount force

故障原因是由于asm disk 32还已经损坏在换盘过程中(数据没有reblance完成),又损坏了asm disk 39,而这两份磁盘中有数据互为镜像,因此磁盘组无法正常mount起来.

检查cell节点celldisk和griddisk情况,确认底层磁盘损坏
cellcli


对于这种情况,因为normal冗余的两份数据都有部分丢失,无法直接恢复数据,通过底层磁盘级别恢复(参考以前一次的Oracle exadata故障恢复:Oracle Exadata坏盘导致磁盘组无法mount恢复),然后比较顺利恢复数据,实现业务数据0丢失

SQL> alter datac1 mount;

Diskgroup altered.

SQL> alter diskgroup datac1 check all;

Diskgroup altered.

多套库顺利open成功
20230728124241


在实际恢复过程中由于客户进行了各种尝试,直接新镜像盘然后插入新盘,强制拉磁盘组drop异常disk操作等,导致第一现场发生一些破坏,增加了恢复难道,但是最终通过各种方法弥补,实现了预期的恢复效果(业务数据0丢失)

.faust加密勒索数据库恢复

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:.faust加密勒索数据库恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

有客户的win服务器被勒索病毒加密,里面运行有用友系统的Oracle数据库,加密提示为(camry2020@aol.com):
20230725154649


加密的数据文件类似(.DBF.id[0E564ACA-3493].[camry2020@aol.com].faust):
20230725165741

通过工具检测发现少量block被加密破坏
20230725214725

对于这种级别的损坏,可以通过我开发的Oracle数据文件勒索加密恢复工具直接重构文件头
20230725214929

然后直接open数据库,并且导出数据,实现数据库非常完美的恢复(这个是目前除直接解密之外最好的恢复效果,没有之一)
20230725215102

对于类似这种被加密的勒索的数据文件,我们可以实现比较好的恢复效果,如果此类的数据库(oracle,mysql,sql server)等被加密,需要专业恢复技术支持,请联系我们:
电话/微信:17813235971    Q Q:107644445QQ咨询惜分飞    E-Mail:dba@xifenfei.com
系统安全防护措施建议:
1.多台机器,不要使用相同的账号和口令
2.登录口令要有足够的长度和复杂性,并定期更换登录口令
3.重要资料的共享文件夹应设置访问权限控制,并进行定期备份
4.定期检测系统和软件中的安全漏洞,及时打上补丁。
5.定期到服务器检查是否存在异常。
6.安装安全防护软件,并确保其正常运行。
7.从正规渠道下载安装软件。
8.对不熟悉的软件,如果已经被杀毒软件拦截查杀,不要添加信任继续运行。
9.保存良好的备份习惯,尽量做到每日备份,异地备份。