又一例asm disk 加入vg故障

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:又一例asm disk 加入vg故障

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

又一客户把asm disk加入到vg,并且扩容到lv中
asm-disk-vg


通过asm层面查看
20220914191351
20220914191418

ASMCMD> lsdsk
Path
/dev/asmdisk/asm-arch01
/dev/asmdisk/asm-data01
/dev/asmdisk/asm-ocr01
/dev/asmdisk/asm-ocr02
/dev/asmdisk/asm-ocr03
dbsrv2-> ls -l /dev/asm*
/dev/asm:
total 0

/dev/asmdisk:
total 0
lrwxrwxrwx 1 root root 6 Sep 14 17:39 asm-arch01 -> ../sdb
lrwxrwxrwx 1 root root 6 Sep 14 17:35 asm-data01 -> ../sda
lrwxrwxrwx 1 root root 6 Sep 11 09:11 asm-ocr01 -> ../sdc
lrwxrwxrwx 1 root root 6 Sep 11 09:11 asm-ocr02 -> ../sdd
lrwxrwxrwx 1 root root 6 Sep 11 09:11 asm-ocr03 -> ../sde

对于这类情况,由于客户的系统是ext4,根据这个文件系统特性,每隔2G会有一点破坏,最终数据恢复效果看运气,运气好直接通过元数据恢复出来所有数据文件,然后open库,然后不好可能需要底层碎片等,参见类似恢复:
asm disk被加入vg恢复
asm disk 磁盘部分被清空恢复
文件系统重新分区oracle恢复
删除分区 oracle asm disk 恢复
pvcreate asm disk导致asm磁盘组异常恢复
对于使用asm的客户,在对文件系统进行操作时,一定要注意asm disk,别弄错磁盘(把asm disk磁盘给误操作掉了),适当情况下linux平台可以考虑AFD(ASM FILTER DRIVER)

oracle启动报ORA-600 kdBlkCheckError故障解决

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:oracle启动报ORA-600 kdBlkCheckError故障解决

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

数据库启动报ORA-600 kdBlkCheckError错误

SQL> alter database open ;
alter database open 
*
第 1 行出现错误:
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00607: Internal error occurred while making a change to a data block
ORA-00600: internal error code, arguments: [kdBlkCheckError], [3], [144], [38504]
进程 ID: 17516
会话 ID: 14 序列号: 5

根据ORA-600 kdBlkCheckError的经验,这个错误是3号文件的144号block逻辑不一致导致.通过dbv检查该文件

Microsoft Windows [版本 10.0.19044.1949]
(c) Microsoft Corporation。保留所有权利。

C:\Users\XFF>dbv file=H:\BaiduNetdisk\oradata\XFF\UNDOTBS1.DBF

DBVERIFY: Release 11.2.0.4.0 - Production on 星期六 9月 17 10:51:32 2022

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

DBVERIFY - 开始验证: FILE = H:\BAIDUNETDISK\ORADATA\SPECTRA\UNDOTBS1.DBF
页 112 失败, 校验代码为 18018
Block Checking: DBA = 12583056, Block Type = System Managed Segment Header Block
ERROR: SMU Segment Header Corrupted.  Error Code = 38504
ktu4smck: SCN commited txn list is not sorted.
  previous txn slot=23, scn=0x0000.ee917d05
  offending txn slot=18, scn=0x0000.ee916272
  TRN CTL:: seq: 0x0c3f chd: 0x0017 ctl: 0x0018 inc: 0x00000000 nfb: 0x0001
            mgc: 0xb000 xts: 0x0068 flg: 0x0001 opt: 2147483646 (0x7ffffffe)
            uba: 0x00c087c8.0c3f.05 scn: 0x0000.ee9160d2
            Version: 0x01
  FREE BLOCK POOL::
    uba: 0x00c087c8.0c3f.05  ext: 0xe  spc: 0x1594
    uba: 0x00000000.0bfe.03  ext: 0x12 spc: 0x1eb8
    uba: 0x00000000.0b20.04  ext: 0x4  spc: 0x1d2e
    uba: 0x00000000.c6e5.01  ext: 0x2  spc: 0x1f84
    uba: 0x00000000.0000.00  ext: 0x0  spc: 0x0
  TRN TBL::
  index  state cflags  wrap#    uel         scn            dba            parent-xid    nub       bcl     cmt
  -----------------------------------------------------------------------------------------
   0x00  9 0x00  0x62878b  0xffe4  0x0000.ee917f13  0x00c087d7  0x0000.000.00000000  0x00000001 0x00000000  1663031369
   0x01 10 0x80  0x62887e  0x001f  0x0000.ee917efe  0x00c087d6  0x0000.000.00000000  0x00000001 0x00000000  38
   0x02  9 0x00  0x628871  0xffde  0x0000.ee917cce  0x00c087d7  0x0000.000.00000000  0x00000001 0x00000000  1663031373
   0x03  9 0x00  0x62865f  0x0000  0x0000.ee916279  0x00c087d7  0x0000.000.00000000  0x00000001 0x00000000  1663031370
   0x04  9 0x00  0x628823  0x0011  0x0000.ee916083  0x00c087d4  0x0000.000.00000000  0x00000001 0x00000000  1663031371
   0x05  9 0x00  0x6287b3  0x0000  0x0000.ee917e56  0x00c087d7  0x0000.000.00000000  0x00000001 0x00000000  1663031370
   0x06  9 0x00  0x628893  0x001c  0x0000.ee916288  0x00002338  0x0000.000.00000000  0x00000000 0x00000000  1663031371
   0x07  9 0x00  0x628820  0x0011  0x0000.ee917f66  0x00c087d7  0x0000.000.00000000  0x00000001 0x00000000  1663031373
   0x08  9 0x00  0x628833  0x000c  0x0000.ee917eaf  0x00c087d4  0x0000.000.00000000  0x00000001 0x00000000  1663031371
   0x09  9 0x00  0x628815  0x0000  0x0000.ee917e72  0x00c087d7  0x0000.000.00000000  0x00000001 0x00000000  1663031370
   0x0a  9 0x00  0x628863  0x0002  0x0000.ee917cc1  0x00c087d7  0x0000.000.00000000  0x00000001 0x00000000  1663031373
   0x0b  9 0x00  0x62870c  0x0008  0x0000.ee916085  0x00c087d4  0x0000.000.00000000  0x00000001 0x00000000  1663031372
   0x0c  9 0x00  0x62881a  0x0005  0x0000.ee917ff2  0x00c087d7  0x0000.000.00000000  0x00000001 0x00000000  1663031370
   0x0d  9 0x00  0x6289fb  0x000f  0x0000.ee917d1d  0x00c087d4  0x0000.000.00000000  0x00000001 0x00000000  1663031375
   0x0e  9 0x00  0x6287d8  0x000a  0x0000.ee91638a  0x00c087d6  0x0000.000.00000000  0x00000001 0x00000000  1663031371
   0x0f  9 0x00  0x62880c  0x001b  0x0000.ee91619e  0x00003003  0x0000.000.00000000  0x00000000 0x00000000  1663031370
   0x10  9 0x00  0x6287e6  0x0013  0x0000.ee9161fc  0x00c087d5  0x0000.000.00000000  0x00000001 0x00000000  1663031370
   0x11  9 0x00  0x62863b  0x0019  0x0000.ee916354  0x00c087d5  0x0000.000.00000000  0x00000001 0x00000000  1663031370
   0x12  9 0x00  0x6287d4  0x0010  0x0000.ee916272  0x00c087d6  0x0000.000.00000000  0x00000001 0x00000000  1663031371
   0x13  9 0x00  0x628470  0x0007  0x0000.ee9160b2  0x00c087d5  0x0000.000.00000000  0x00000001 0x00000000  1663031370
   0x14  9 0x00  0x6287a4  0x001e  0x0000.ee91627f  0x00c087d6  0x0000.000.00000000  0x00000001 0x00000000  1663031372
   0x15  9 0x00  0x628797  0x000a  0x0000.ee9162bb  0x00c087c9  0x0000.000.00000000  0x00000001 0x00000000  1663031368
   0x16  9 0x00  0x6287ad  0x0005  0x0000.ee917f6c  0x00c087d4  0x0000.000.00000000  0x00000001 0x00000000  1663031371
   0x17  9 0x00  0x6287b5  0x0012  0x0000.ee917d05  0x00c087d7  0x0000.000.00000000  0x00000001 0x00000000  1663031373
   0x18  9 0x00  0x628719  0x000b  0x0000.ee916136  0x00c087d4  0x0000.000.00000000  0x00000001 0x00000000  1663031373
   0x19  9 0x00  0x628783  0x0006  0x0000.ee916363  0x00c087d5  0x0000.000.00000000  0x00000001 0x00000000  1663031370
   0x1a  9 0x00  0x6287d8  0xffff  0x0000.ee917d97  0x00c087cb  0x0000.000.00000000  0x00000001 0x00000000  1663031375
   0x1b  9 0x00  0x6287d7  0x0022  0x0000.ee916043  0x00c087d5  0x0000.000.00000000  0x00000001 0x00000000  1663031373
   0x1c  9 0x00  0x62880e  0x0005  0x0000.ee917db7  0x00002338  0x0000.000.00000000  0x00000000 0x00000000  1663031373
   0x1d  9 0x00  0x6287b7  0x0003  0x0000.ee9161e1  0x00c087d4  0x0000.000.00000000  0x00000001 0x00000000  1663031373
   0x1e  9 0x00  0x6287f6  0x0015  0x0000.ee9162e6  0x00002338  0x0000.000.00000000  0x00000000 0x00000000  1663031368
   0x1f  9 0x00  0x6287ad  0x0003  0x0000.ee917eae  0x00003003  0x0000.000.00000000  0x00000000 0x00000000  1663031372
   0x20  9 0x00  0x6287b0  0x0003  0x0000.ee9163a5  0x0000133b  0x0000.000.00000000  0x00000000 0x00000000  1663031368
   0x21  9 0x00  0x62886a  0x0001  0x0000.ee916056  0x00c087d5  0x0000.000.00000000  0x00000001 0x00000000  1663031373
  EXT TRN CTL::
  usn: 2
  sp1:0x00000000 sp2:0x00000000 sp3:0x00000000 sp4:0x00000000
  sp5:0x00000000 sp6:0x00000000 sp7:0x00000000 sp8:0x00000000
  EXT TRN TBL::
index extflag  extHash  extSpare1  extSpare2
---------------------------------------------
   0x00  0x00000000 0x00000000 0x00000000  0x00000000
   0x01  0x00000000 0x00000000 0x00000000  0x00000000
   0x02  0x00000000 0x00000000 0x00000000  0x00000000
   0x03  0x00000000 0x00000000 0x00000000  0x00000000
   0x04  0x00000000 0x00000000 0x00000000  0x00000000
   0x05  0x00000000 0x00000000 0x00000000  0x00000000
   0x06  0x00000000 0x00000000 0x00000000  0x00000000
   0x07  0x00000000 0x00000000 0x00000000  0x00000000
   0x08  0x00000000 0x00000000 0x00000000  0x00000000
   0x09  0x00000000 0x00000000 0x00000000  0x00000000
   0x0a  0x00000000 0x00000000 0x00000000  0x00000000
   0x0b  0x00000000 0x00000000 0x00000000  0x00000000
   0x0c  0x00000000 0x00000000 0x00000000  0x00000000
   0x0d  0x00000000 0x00000000 0x00000000  0x00000000
   0x0e  0x00000000 0x00000000 0x00000000  0x00000000
   0x0f  0x00000000 0x00000000 0x00000000  0x00000000
   0x10  0x00000000 0x00000000 0x00000000  0x00000000
   0x11  0x00000000 0x00000000 0x00000000  0x00000000
   0x12  0x00000000 0x00000000 0x00000000  0x00000000
   0x13  0x00000000 0x00000000 0x00000000  0x00000000
   0x14  0x00000000 0x00000000 0x00000000  0x00000000
   0x15  0x00000000 0x00000000 0x00000000  0x00000000
   0x16  0x00000000 0x00000000 0x00000000  0x00000000
   0x17  0x00000000 0x00000000 0x00000000  0x00000000
   0x18  0x00000000 0x00000000 0x00000000  0x00000000
   0x19  0x00000000 0x00000000 0x00000000  0x00000000
   0x1a  0x00000000 0x00000000 0x00000000  0x00000000
   0x1b  0x00000000 0x00000000 0x00000000  0x00000000
   0x1c  0x00000000 0x00000000 0x00000000  0x00000000
   0x1d  0x00000000 0x00000000 0x00000000  0x00000000
   0x1e  0x00000000 0x00000000 0x00000000  0x00000000
   0x1f  0x00000000 0x00000000 0x00000000  0x00000000
   0x20  0x00000000 0x00000000 0x00000000  0x00000000
   0x21  0x00000000 0x00000000 0x00000000  0x00000000
页 144 失败, 校验代码为 38504

…………

DBVERIFY - 验证完成

检查的页总数: 161280
处理的页总数 (数据): 0
失败的页总数 (数据): 0
处理的页总数 (索引): 0
失败的页总数 (索引): 0
处理的页总数 (其他): 161277
处理的总页数 (段)  : 9
失败的总页数 (段)  : 0
空的页总数: 1
标记为损坏的总页数: 4
流入的页总数: 2
加密的总页数        : 0
最高块 SCN            : 4002695098 (0.4002695098)

C:\Users\XFF>

可以确认是由于SMU Segment Header异常,导致数据库无法正常启动,通过数据库层面设置,规避数据库启动访问该block,数据库正常启动正常,并顺利导出数据

Thu Sep 15 11:02:23 2022
ALTER DATABASE   MOUNT
Successful mount of redo thread 1, with mount id 2863639551
Database mounted in Exclusive Mode
Lost write protection disabled
Completed: ALTER DATABASE   MOUNT
Thu Sep 15 11:02:31 2022
alter database open upgrade
Thread 1 opened at log sequence 660107
  Current log# 2 seq# 660107 mem# 0: H:\BAIDUNETDISK\ORADATA\XFF\REDO02.LOG
Successful open of redo thread 1
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
SMON: enabling cache recovery
Undo initialization finished serial:0 start:74439375 end:74439375 diff:0 (0 seconds)
Verifying file header compatibility for 11g tablespace encryption..
Verifying 11g file header compatibility for tablespace encryption completed
SMON: enabling tx recovery
Database Characterset is ZHS16GBK
Completed: alter database open 

Oracle Recovery Tools 解决ORA-600 3020故障

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:Oracle Recovery Tools 解决ORA-600 3020故障

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

尝试recover datafile,部分文件报ORA-600 3020,其他文件recover成功

ALTER DATABASE RECOVER  datafile 1  
Media Recovery Start
Serial Media Recovery started
Recovery of Online Redo Log: Thread 1 Group 3 Seq 24972 Reading mem 0
  Mem# 0: D:\APP\ADMINISTRATOR\ORADATA\ORCL\REDO03.LOG
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_ora_72232.trc  (incident=749532):
ORA-00600: 内部错误代码, 参数: [3020], [1], [272255], [4466559], [], [], [], [], [], [], [], []
ORA-10567: Redo is inconsistent with data block (file# 1, block# 272255, file offset is 2230312960 bytes)
ORA-10564: tablespace SYSTEM
ORA-01110: 数据文件 1: 'D:\APP\ADMINISTRATOR\ORADATA\ORCL\SYSTEM01.DBF'
ORA-10561: block type 'TRANSACTION MANAGED DATA BLOCK', data object# 383
Media Recovery failed with error 600
ORA-283 signalled during: ALTER DATABASE RECOVER  datafile 1  ...
Tue Aug 02 10:28:24 2022
Trace dumping is performing id=[cdmp_20220802102824]
Tue Aug 02 10:28:31 2022
ALTER DATABASE RECOVER  datafile 2  
Media Recovery Start
Serial Media Recovery started
Recovery of Online Redo Log: Thread 1 Group 3 Seq 24972 Reading mem 0
  Mem# 0: D:\APP\ADMINISTRATOR\ORADATA\ORCL\REDO03.LOG
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_ora_72232.trc  (incident=749533):
ORA-00600: 内部错误代码, 参数: [3020], [2], [92323], [8480931], [], [], [], [], [], [], [], []
ORA-10567: Redo is inconsistent with data block (file# 2, block# 92323, file offset is 756310016 bytes)
ORA-10564: tablespace SYSAUX
ORA-01110: 数据文件 2: 'D:\APP\ADMINISTRATOR\ORADATA\ORCL\SYSAUX01.DBF'
ORA-10561: block type 'TRANSACTION MANAGED INDEX BLOCK', data object# 12330
Media Recovery failed with error 600
ORA-283 signalled during: ALTER DATABASE RECOVER  datafile 2  ...

利用Oracle数据库异常恢复检查脚本(Oracle Database Recovery Check)检查文件头相关信息,发现recover 失败的两个文件异常
20220802163502


通过Oracle Recovery Tools工具进行修复
20220802105543

数据库recover 成功,并顺利open
20220802105622

Tue Aug 02 10:56:13 2022
ALTER DATABASE RECOVER  datafile 1  
Media Recovery Start
Serial Media Recovery started
Recovery of Online Redo Log: Thread 1 Group 3 Seq 24972 Reading mem 0
  Mem# 0: D:\APP\ADMINISTRATOR\ORADATA\ORCL\REDO03.LOG
Completed: ALTER DATABASE RECOVER  datafile 1  
ALTER DATABASE RECOVER  datafile 2  
Media Recovery Start
Serial Media Recovery started
Recovery of Online Redo Log: Thread 1 Group 3 Seq 24972 Reading mem 0
  Mem# 0: D:\APP\ADMINISTRATOR\ORADATA\ORCL\REDO03.LOG
Completed: ALTER DATABASE RECOVER  datafile 2  
Tue Aug 02 10:56:34 2022
alter database open 
Beginning crash recovery of 1 threads
 parallel recovery started with 7 processes
Started redo scan
Completed redo scan
 read 8504 KB redo, 0 data blocks need recovery
Started redo application at
 Thread 1: logseq 24972, block 2, scn 177712270
Recovery of Online Redo Log: Thread 1 Group 3 Seq 24972 Reading mem 0
  Mem# 0: D:\APP\ADMINISTRATOR\ORADATA\ORCL\REDO03.LOG
Completed redo application of 0.00MB
Completed crash recovery at
 Thread 1: logseq 24972, block 17011, scn 177734679
 0 data blocks read, 0 data blocks written, 8504 redo k-bytes read
Tue Aug 02 10:56:35 2022
Thread 1 advanced to log sequence 24973 (thread open)
Thread 1 opened at log sequence 24973
  Current log# 1 seq# 24973 mem# 0: D:\APP\ADMINISTRATOR\ORADATA\ORCL\REDO01.LOG
Successful open of redo thread 1
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Tue Aug 02 10:56:35 2022
SMON: enabling cache recovery
Successfully onlined Undo Tablespace 2.
Dictionary check beginning
Tablespace 'TEMP' #3 found in data dictionary,
but not in the controlfile. Adding to controlfile.
Dictionary check complete
Verifying file header compatibility for 11g tablespace encryption..
Verifying 11g file header compatibility for tablespace encryption completed
SMON: enabling tx recovery
*********************************************************************
WARNING: The following temporary tablespaces contain no files.
         This condition can occur when a backup controlfile has
         been restored.  It may be necessary to add files to these
         tablespaces.  That can be done using the SQL statement:
 
         ALTER TABLESPACE <tablespace_name> ADD TEMPFILE
 
         Alternatively, if these temporary tablespaces are no longer
         needed, then they can be dropped.
           Empty temporary tablespace: TEMP
**********************************************************
WARNING: Files may exists in db_recovery_file_dest
that are not known to the database. Use the RMAN command
CATALOG RECOVERY AREA to re-catalog any such files.
If files cannot be cataloged, then manually delete them
using OS command.
One of the following events caused this:
1. A backup controlfile was restored.
2. A standby controlfile was restored.
3. The controlfile was re-created.
4. db_recovery_file_dest had previously been enabled and
   then disabled.
**********************************************************
replication_dependency_tracking turned off (no async multimaster replication found)
LOGSTDBY: Validating controlfile with logical metadata
LOGSTDBY: Validation complete
Completed: alter database open 

增加tempfile,导出数据该库恢复完成

又一例ORA-600 kcratr_nab_less_than_odr

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:又一例ORA-600 kcratr_nab_less_than_odr

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

数据库启动报错ORA-600 kcratr_nab_less_than_odr

alter database open
Sat Jul 23 21:38:32 2022
Beginning crash recovery of 1 threads
 parallel recovery started with 19 processes
Sat Jul 23 21:38:33 2022
Started redo scan
Sat Jul 23 21:38:33 2022
Completed redo scan
 read 244 KB redo, 64 data blocks need recovery
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl12c\orcl12c\trace\orcl12c_ora_5748.trc  (incident=309845):
ORA-00600: ??????, ??: [kcratr_nab_less_than_odr], [1], [10343], [67442], [67454], [], [], [], [], [], [], []
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Sat Jul 23 21:38:34 2022
Slave encountered ORA-10388 exception during crash recovery
Sat Jul 23 21:38:34 2022
Slave encountered ORA-10388 exception during crash recovery
Sat Jul 23 21:38:34 2022
Slave encountered ORA-10388 exception during crash recovery
Sat Jul 23 21:38:34 2022
Slave encountered ORA-10388 exception during crash recovery
Sat Jul 23 21:38:34 2022
Slave encountered ORA-10388 exception during crash recovery
Sat Jul 23 21:38:34 2022
Slave encountered ORA-10388 exception during crash recovery
Sat Jul 23 21:38:34 2022
Slave encountered ORA-10388 exception during crash recovery
Sat Jul 23 21:38:34 2022
Slave encountered ORA-10388 exception during crash recovery
Sat Jul 23 21:38:34 2022
Slave encountered ORA-10388 exception during crash recovery
Sat Jul 23 21:38:34 2022
Slave encountered ORA-10388 exception during crash recovery
Sat Jul 23 21:38:34 2022
Slave encountered ORA-10388 exception during crash recovery
Sat Jul 23 21:38:34 2022
Slave encountered ORA-10388 exception during crash recovery
Sat Jul 23 21:38:34 2022
Slave encountered ORA-10388 exception during crash recovery
Sat Jul 23 21:38:34 2022
Slave encountered ORA-10388 exception during crash recovery
Sat Jul 23 21:38:34 2022
Slave encountered ORA-10388 exception during crash recovery
Sat Jul 23 21:38:34 2022
Slave encountered ORA-10388 exception during crash recovery
Sat Jul 23 21:38:34 2022
Slave encountered ORA-10388 exception during crash recovery
Sat Jul 23 21:38:34 2022
Slave encountered ORA-10388 exception during crash recovery
Sat Jul 23 21:38:34 2022
Slave encountered ORA-10388 exception during crash recovery
Sat Jul 23 21:38:38 2022
Aborting crash recovery due to error 600
Sat Jul 23 21:38:38 2022
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl12c\orcl12c\trace\orcl12c_ora_5748.trc:
ORA-00600: ??????, ??: [kcratr_nab_less_than_odr], [1], [10343], [67442], [67454], [], [], [], [], [], [], []
Sat Jul 23 21:38:39 2022
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl12c\orcl12c\trace\orcl12c_ora_5748.trc:
ORA-00600: ??????, ??: [kcratr_nab_less_than_odr], [1], [10343], [67442], [67454], [], [], [], [], [], [], []
ORA-600 signalled during: alter database open...

这个错误比较简单,参考:
12c启动报kcratr_nab_less_than_odr
ORA-600 kcratr_nab_less_than_odr故障解决
解决该问题之后,数据库启动报ORA-600 4194错误

Mon Jul 25 12:18:04 2022
SMON: enabling tx recovery
Starting background process SMCO
Mon Jul 25 12:18:05 2022
SMCO started with pid=26, OS id=8164 
Mon Jul 25 12:18:06 2022
Database Characterset is ZHS16GBK
ORA-00600: ??????, ??: [4194], [46], [44], [], [], [], [], [], [], [], [], []
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Mon Jul 25 12:18:14 2022
Doing block recovery for file 5 block 1267
Mon Jul 25 12:18:14 2022
Resuming block recovery (PMON) for file 5 block 1267
Block recovery from logseq 1, block 67 to scn 217083444
Mon Jul 25 12:18:15 2022
Recovery of Online Redo Log: Thread 1 Group 1 Seq 1 Reading mem 0
  Mem# 0: D:\APP\ADMINISTRATOR\ORADATA\ORCL12C\REDO01.LOG
Block recovery stopped at EOT rba 1.68.16
Block recovery completed at rba 1.68.16, scn 0.217083444
Doing block recovery for file 5 block 272
Resuming block recovery (PMON) for file 5 block 272
Block recovery from logseq 1, block 67 to scn 217083443
Mon Jul 25 12:18:18 2022
Recovery of Online Redo Log: Thread 1 Group 1 Seq 1 Reading mem 0
  Mem# 0: D:\APP\ADMINISTRATOR\ORADATA\ORCL12C\REDO01.LOG
Block recovery completed at rba 1.68.16, scn 0.217083444
Mon Jul 25 12:18:19 2022
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl12c\orcl12c\trace\orcl12c_smon_7100.trc:
ORA-01595: 释放区 (5) 回退段 (10) 时出错
ORA-00600: ??????, ??: [4194], [46], [44], [], [], [], [], [], [], [], [], []
Mon Jul 25 12:18:19 2022
No Resource Manager plan active
Mon Jul 25 12:18:23 2022
Sweep [inc][317000]: completed
Sweep [inc2][317000]: completed
Starting background process FBDA
Mon Jul 25 12:18:40 2022
FBDA started with pid=28, OS id=7828 
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl12c\orcl12c\trace\orcl12c_ora_5804.trc  (incident=317056):
ORA-00600: 内部错误代码, 参数: [4194], [46], [44], [], [], [], [], [], [], [], [], []
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Mon Jul 25 12:18:53 2022
Doing block recovery for file 5 block 1267
Resuming block recovery (PMON) for file 5 block 1267
Block recovery from logseq 1, block 67 to scn 217083444
Mon Jul 25 12:18:53 2022
Recovery of Online Redo Log: Thread 1 Group 1 Seq 1 Reading mem 0
  Mem# 0: D:\APP\ADMINISTRATOR\ORADATA\ORCL12C\REDO01.LOG
Block recovery completed at rba 1.68.16, scn 0.217083454
Doing block recovery for file 5 block 272
Resuming block recovery (PMON) for file 5 block 272
Block recovery from logseq 1, block 67 to scn 217083454
Mon Jul 25 12:18:54 2022
Recovery of Online Redo Log: Thread 1 Group 1 Seq 1 Reading mem 0
  Mem# 0: D:\APP\ADMINISTRATOR\ORADATA\ORCL12C\REDO01.LOG
Block recovery completed at rba 1.69.16, scn 0.217083455
Mon Jul 25 12:18:55 2022
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl12c\orcl12c\trace\orcl12c_ora_5804.trc:
ORA-00600: 内部错误代码, 参数: [4194], [46], [44], [], [], [], [], [], [], [], [], []
Mon Jul 25 12:18:55 2022
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl12c\orcl12c\trace\orcl12c_ora_5804.trc:
ORA-00600: 内部错误代码, 参数: [4194], [46], [44], [], [], [], [], [], [], [], [], []
Error 600 happened during db open, shutting down database
USER (ospid: 5804): terminating the instance due to error 600
Mon Jul 25 12:19:07 2022
Instance terminated by USER, pid = 5804
ORA-1092 signalled during: alter database open resetlogs...

该错误也比较简单,对异常undo段进行处理即可,参考类似操作:How to resolve ORA-600 [4194] errors

存储重启,oracle无法启动故障处理

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:存储重启,oracle无法启动故障处理

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

有客户由于机房要停电,正常关闭两个节点数据库,通过数据库alert日志均可看到类似如下记录,证明数据库确实是正常shutdown immediate
1
2
然后关闭存储,启动存储之后发现数据库无法正常启动(数据scn不一致).相关信息如下:
20220707175525


最初报ORA-214错

that ORACLE_BASE be set in the environment
Wed Jul 06 00:50:02 2022
ALTER SYSTEM SET local_listener=' (ADDRESS=(PROTOCOL=TCP)(HOST=10.10.10.10)(PORT=1521))' SCOPE=MEMORY SID='xffdb2';
ALTER DATABASE MOUNT /* db agent *//* {1:42392:203} */
This instance was first to mount
NOTE: Loaded library: System 
SUCCESS: diskgroup DATA1 was mounted
SUCCESS: diskgroup DATA2 was mounted
ORA-214 signalled during: ALTER DATABASE MOUNT /* db agent *//* {1:42392:203} */...
NOTE: dependency between database xffdb and diskgroup resource ora.DATA1.dg is established
NOTE: dependency between database xffdb and diskgroup resource ora.DATA2.dg is established

提示ctl不存在,通过处理之后报ORA-600 2131错误

Wed Jul 06 01:55:45 2022
ALTER SYSTEM SET local_listener=' (ADDRESS=(PROTOCOL=TCP)(HOST=10.10.10.10)(PORT=1521))' SCOPE=MEMORY SID='xffdb2';
ALTER DATABASE MOUNT /* db agent *//* {1:42392:663} */
This instance was first to mount
NOTE: Loaded library: System 
SUCCESS: diskgroup DATA1 was mounted
SUCCESS: diskgroup DATA2 was mounted
NOTE: dependency between database xffdb and diskgroup resource ora.DATA1.dg is established
NOTE: dependency between database xffdb and diskgroup resource ora.DATA2.dg is established
Errors in file /u01/app/oracle/diag/rdbms/xffdb/xffdb2/trace/xffdb2_ora_47746.trc  (incident=576488):
ORA-00600: internal error code, arguments: [2131], [33], [32], [], [], [], [], [], [], [], [], []
Incident details in: /u01/app/oracle/diag/rdbms/xffdb/xffdb2/incident/incdir_576488/xffdb2_ora_47746_i576488.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
ORA-600 signalled during: ALTER DATABASE MOUNT /* db agent *//* {1:42392:663} */...

重建控制文件后恢复报错

Parallel Media Recovery started with 127 slaves
ORA-279 signalled during: ALTER DATABASE RECOVER  database using BACKUP CONTROLFILE  ...
Wed Jul 06 02:41:04 2022
ALTER DATABASE RECOVER    LOGFILE '+DATA3/xffdb/archivelog/2022_07_05/thread_2_seq_40889.18030.1109269215'  
Media Recovery Log +DATA3/xffdb/archivelog/2022_07_05/thread_2_seq_40889.18030.1109269215
Wed Jul 06 02:41:04 2022
Errors with log +DATA3/xffdb/archivelog/2022_07_05/thread_2_seq_40889.18030.1109269215
Errors in file /u01/app/oracle/diag/rdbms/xffdb/xffdb1/trace/xffdb1_pr00_96503.trc:
ORA-00325: archived log for thread 1, wrong thread # 2 in header
ORA-00334: archived log: '+DATA3/xffdb/archivelog/2022_07_05/thread_2_seq_40889.18030.1109269215'
ORA-325 signalled during: ALTER DATABASE RECOVER    LOGFILE '+DATA3/thread_2_seq_40889.18030.1109269215'  ...
ALTER DATABASE RECOVER CANCEL 
Media Recovery Canceled
Completed: ALTER DATABASE RECOVER CANCEL 
…………
Wed Jul 06 02:22:25 2022
ALTER DATABASE RECOVER  DATABASE  
Media Recovery Start
 started logmerger process
Only allocated 127 recovery slaves (requested 128)
Parallel Media Recovery started with 127 slaves
Wed Jul 06 02:22:28 2022
Errors in file /u01/app/oracle/diag/rdbms/xffdb/xffdb1/trace/xffdb1_pr00_77044.trc:
ORA-00313: open failed for members of log group 7 of thread 1
Media Recovery failed with error 313
Errors in file /u01/app/oracle/diag/rdbms/xffdb/xffdb1/trace/xffdb1_pr00_77044.trc:
ORA-00283: recovery session canceled due to errors
ORA-00313: open failed for members of log group 7 of thread 1
Wed Jul 06 02:22:28 2022
Errors in file /u01/app/oracle/diag/rdbms/xffdb/xffdb1/trace/xffdb1_m000_77318.trc:
ORA-00322: log 4 of thread 2 is not current copy
ORA-00312: online log 4 thread 2: '+DATA3/xffdb/onlinelog/group_4.16148.1107795635'
Errors in file /u01/app/oracle/diag/rdbms/xffdb/xffdb1/trace/xffdb1_m000_77318.trc:
ORA-00322: log 7 of thread 1 is not current copy
ORA-00312: online log 7 thread 1: '+DATA3/xffdb/onlinelog/group_7.18959.1107796013'
Errors in file /u01/app/oracle/diag/rdbms/xffdb/xffdb1/trace/xffdb1_m000_77318.trc:
ORA-00314: log 9 of thread 1, expected sequence# 133495 doesn't match 133490
ORA-00312: online log 9 thread 1: '+DATA3/xffdb/onlinelog/group_9.3142.1107796071'
Checker run found 208 new persistent data failures
ORA-10877 signalled during: ALTER DATABASE RECOVER  DATABASE  ...
…………
Only allocated 127 recovery slaves (requested 128)
Parallel Media Recovery started with 127 slaves
ORA-279 signalled during: ALTER DATABASE RECOVER  database using backup controlfile  ...
Wed Jul 06 06:15:26 2022
ALTER DATABASE RECOVER    LOGFILE '+DATA3/xffdb/onlinelog/group_4.16442.1107795653'  
Media Recovery Log +DATA3/xffdb/onlinelog/group_4.16442.1107795653
ORA-279 signalled during: ALTER DATABASE RECOVER    LOGFILE '+DATA3/xffdb/onlinelog/group_4.16442.1107795653'  ...
Wed Jul 06 06:15:43 2022
ALTER DATABASE RECOVER    LOGFILE '+DATA3/xffdb/onlinelog/group_7.18959.1107796013'  
Media Recovery Log +DATA3/xffdb/onlinelog/group_7.18959.1107796013
Wed Jul 06 06:15:50 2022
Errors with log +DATA3/xffdb/onlinelog/group_7.18959.1107796013
Wed Jul 06 06:15:50 2022
Errors in file /u01/app/oracle/diag/rdbms/xffdb/xffdb1/trace/xffdb1_pr29_306479.trc  (incident=961030):
ORA-00600: internal error code, arguments: [6102], [13], [17], [], [], [], [], [], [], [], [], []
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Wed Jul 06 06:15:54 2022
Sweep [inc][961030]: completed
Sweep [inc2][961030]: completed
Slave exiting with ORA-10562 exception
Errors in file /u01/app/oracle/diag/rdbms/xffdb/xffdb1/trace/xffdb1_pr29_306479.trc:
ORA-10562: Error occurred while applying redo to data block (file# 159, block# 3591756)
ORA-10564: tablespace LIS
ORA-01110: data file 159: '+DATA1/xffdb/datafile/lis.379.1080445903'
ORA-10561: block type 'TRANSACTION MANAGED INDEX BLOCK', data object# 138875
ORA-00600: internal error code, arguments: [6102], [13], [17], [], [], [], [], [], [], [], [], []
Wed Jul 06 06:15:59 2022
Recovery Slave PR29 previously exited with exception 10562

基于上述情况,很可能是由于存储重启之后,cache或者某些数据没有写入到数据文件和redo中,数据库重启之后redo不是最新的[ORA-00322错误可以证明,],数据文件也需要进行恢复(不是数据库正常关闭之后该有的情况),而且redo和数据文件还不一致[ORA-00600 6102可以证明],对于类似这样的情况,只能尝试强制打开数据库,报ORA-600 2663

SQL> alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [2663], [1393], [4159455578],
[1393], [4160374753], [], [], [], [], [], [], []
Process ID: 357910
Session ID: 1585 Serial number: 7
Wed Jul 06 06:57:25 2022
SMON: enabling cache recovery
Errors in file /u01/app/oracle/diag/rdbms/xffdb/xffdb1/trace/xffdb1_ora_357910.trc  (incident=1056360):
ORA-00600: internal error code, arguments: [2663], [1393],[4159455578],[1393],[4160374753],[], [], [], []
Redo thread 2 internally disabled at seq 1 (CKPT)
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file /u01/app/oracle/diag/rdbms/xffdb/xffdb1/trace/xffdb1_arc3_360348.trc:
ORA-00600: internal error code, arguments: [ORA_NPI_ERROR],[600], 
  [ORA-00600: internal error code, arguments: [kffbAddBlk04]
Unable to create archive log file '+DATA3'
ARC3: Error 19504 Creating archive log file to '+DATA3'
ARCH: Archival error occurred on a closed thread. Archiver continuing
ORACLE Instance xffdb1 - Archival Error. Archiver continuing.
ARCH: Archival error occurred on a closed thread. Archiver continuing
ORACLE Instance xffdb1 - Archival Error. Archiver continuing.
Wed Jul 06 06:57:34 2022
Errors in file /u01/app/oracle/diag/rdbms/xffdb/xffdb1/incident/incdir_1056360/xffdb1_ora_357910_i1056360.trc:
ORA-00339: archived log does not contain any redo
ORA-00334: archived log: '+DATA1/xffdb/onlinelog/group_4.424.1109314453'
ORA-00600: internal error code, arguments: [2663], [1393], [4159455578], [1393], [4160374753], [], [],
Wed Jul 06 06:57:34 2022
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file /u01/app/oracle/diag/rdbms/xffdb/xffdb1/trace/xffdb1_ora_357910.trc:
ORA-00600: internal error code, arguments: [2663], [1393], [4159455578], [1393], [4160374753], [], [], []
Errors in file /u01/app/oracle/diag/rdbms/xffdb/xffdb1/trace/xffdb1_ora_357910.trc:
ORA-00600: internal error code, arguments: [2663], [1393], [4159455578], [1393], [4160374753], [], [], []
Error 600 happened during db open, shutting down database
USER (ospid: 357910): terminating the instance due to error 600
Instance terminated by USER, pid = 357910
ORA-1092 signalled during: alter database open resetlogs...
opiodr aborting process unknown ospid (357910) as a result of ORA-1092
Wed Jul 06 06:57:35 2022
ORA-1092 : opitsk aborting process

该错误比较常见,参考:ORA-600 2663,也可以利用我的Patch_SCN小工具快速解决,后续数据库报ORA-03113错

SQL> alter database open ;
alter database open 
*
ERROR at line 1:
ORA-03113: end-of-file on communication channel
Process ID: 369324
Session ID: 1585 Serial number: 1

查看alert日志,确认具体报错为kgegpa

Successful open of redo thread 1
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Wed Jul 06 07:17:08 2022
SMON: enabling cache recovery
ARC1: Archiving disabled thread 2 sequence 1
Archived Log entry 1 added for thread 1 sequence 1 ID 0x36317f52 dest 1:
Archived Log entry 2 added for thread 1 sequence 2 ID 0x36317f52 dest 1:
Archived Log entry 3 added for thread 2 sequence 1 ID 0x0 dest 1:
Exception [type:SIGSEGV, Address not mapped to object][ADDR:0x4D562123][PC:0x983CDD6,kgegpa()+40][flags: 0x0,count:1]
Exception [type:SIGSEGV, Address not mapped to object][ADDR:0x4D562123][PC:0x983B84A, kgebse()+776][flags: 0x2,count:2]
Exception [type:SIGSEGV, Address not mapped to object][ADDR:0x4D562123][PC:0x983B84A, kgebse()+776][flags: 0x2,count:2]
Wed Jul 06 07:17:11 2022
PMON (ospid: 377647): terminating the instance due to error 397

该问题有过类似的案例通过处理数据库open成功:
在数据库恢复遭遇ORA-07445 kgegpa错误
Exception [type: SIGSEGV, Address not mapped to object] [] [ kgegpa()+36]

误删除分区oracle数据库恢复

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:误删除分区oracle数据库恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

很多时候祸福相依,又一个用户发生类似的事情,数据库故障
ora-600-3020


他们公司内部折腾,然后把数据库open成功,并且也顺利导出来dmp在d盘.然后重新安装系统,结果悲剧发生了,他们在操作过程中给c盘扩容,把d盘删除了,然后以前d盘的部分空间分配给c盘了,但是d盘数据全部消失(以前的数据库文件,最新备份出来的dmp文件).用户给我反馈给一系列操作之后,提醒客户尽可能不要对该磁盘进行任何操作(已经分区200G的c盘和800G未分区的空间),然后通过恢复工具进行分析
20220706202513

运气不错,相关的文件没有被覆盖,并且顺利恢复出来
dmp
20220706215448
20220706215611

运气不错,顺利完成相关恢复,将误操作数据恢复恢复来,再次提醒各位,操作谨慎,切莫因为一时疏忽酿成打错.

ORA-00333 ORA-01595 恢复

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:ORA-00333 ORA-01595 恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

客户反馈数据库异常,查看日志发现asm和db均发生hang住情况(由于环境原因部分日志没有拷贝出来),基于现有情况,无法直接恢复,通过一些工具把asm磁盘组中的数据文件拷贝到文件系统,经过检测无坏块
20220705233329


修改相关路径,尝试recover库

Tue Jul 05 15:05:54 2022
ALTER DATABASE RECOVER  datafile 1  
Media Recovery Start
Serial Media Recovery started
Recovery of Online Redo Log: Thread 2 Group 4 Seq 29973 Reading mem 0
  Mem# 0: E:\ORADATA\GROUP_4.266.822672441
Recovery of Online Redo Log: Thread 1 Group 2 Seq 38422 Reading mem 0
  Mem# 0: E:\ORADATA\GROUP_2.262.822672137
Incomplete read from log member 'E:\ORADATA\GROUP_2.262.822672137'. Trying next member.
Media Recovery failed with error 333
ORA-283 signalled during: ALTER DATABASE RECOVER  datafile 1  ...

ORA-00333


错误信息比较明显,在读入redo进行恢复的时候遭遇“ORA-00333: 重做日志读取块 11557 计数 731 出错”错误,从而无法继续恢复.这次故障运气比较好,通过分析v$datafile和v$datafile_header关系
20220705233008

进行一些操作,绕过redo block 11557,顺利recover成功,并且open库

ALTER DATABASE RECOVER  database  
Media Recovery Start
 started logmerger process
Tue Jul 05 15:17:46 2022
Parallel Media Recovery started with 32 slaves
Tue Jul 05 15:17:46 2022
Recovery of Online Redo Log: Thread 2 Group 4 Seq 29973 Reading mem 0
  Mem# 0: E:\ORADATA\GROUP_4.266.822672441
Recovery of Online Redo Log: Thread 1 Group 2 Seq 38422 Reading mem 0
  Mem# 0: E:\ORADATA\GROUP_2.262.822672137
Completed: ALTER DATABASE RECOVER  database  

20220705232246


通过分析alert日志发现有ORA-600 4194错误

QMNC started with pid=58, OS id=15980 
LOGSTDBY: Validating controlfile with logical metadata
LOGSTDBY: Validation complete
Tue Jul 05 15:18:24 2022
Tue Jul 05 15:18:24 2022
Block recovery from logseq 38423, block 152 to scn 16218380250500
Recovery of Online Redo Log: Thread 1 Group 1 Seq 38423 Reading mem 0
  Mem# 0: E:\ORADATA\GROUP_1.261.822672135
Block recovery stopped at EOT rba 38423.154.16
Block recovery completed at rba 38423.154.16, scn 3776.583740804
Block recovery from logseq 38423, block 152 to scn 16218380250497
Recovery of Online Redo Log: Thread 1 Group 1 Seq 38423 Reading mem 0
  Mem# 0: E:\ORADATA\GROUP_1.261.822672135
Block recovery completed at rba 38423.154.16, scn 3776.583740804
Errors in file F:\APP\ADMINISTRATOR\diag\rdbms\xff\xff1\trace\xff1_smon_5660.trc:
ORA-01595: 释放区 (2) 回退段 (8) 时出错
ORA-00600: 内部错误代码, 参数: [4194], [], [
                                      
Completed: alter database open 

这比较简单,对于异常的undo进行处理即可,然后使用hcheck检查字典一致性

SQL> @e:/oradata/txt/11.txt
HCheck Version 07MAY18 on 05-7月 -2022 16:30:18
----------------------------------------------
Catalog Version 11.2.0.3.0 (1102000300)
db_name: xff

                                   Catalog       Fixed
Procedure Name                     Version    Vs Release    Timestamp
Result
------------------------------ ... ---------- -- ---------- --------------
------
.- LobNotInObj                 ... 1102000300 <=  *All Rel* 07/05 16:30:18 PASS
.- MissingOIDOnObjCol          ... 1102000300 <=  *All Rel* 07/05 16:30:19 PASS
.- SourceNotInObj              ... 1102000300 <=  *All Rel* 07/05 16:30:19 PASS
.- OversizedFiles              ... 1102000300 <=  *All Rel* 07/05 16:30:19 PASS
.- PoorDefaultStorage          ... 1102000300 <=  *All Rel* 07/05 16:30:19 PASS
.- PoorStorage                 ... 1102000300 <=  *All Rel* 07/05 16:30:19 PASS
.- TabPartCountMismatch        ... 1102000300 <=  *All Rel* 07/05 16:30:20 PASS
.- OrphanedTabComPart          ... 1102000300 <=  *All Rel* 07/05 16:30:20 PASS
.- MissingSum$                 ... 1102000300 <=  *All Rel* 07/05 16:30:20 PASS
.- MissingDir$                 ... 1102000300 <=  *All Rel* 07/05 16:30:20 PASS
.- DuplicateDataobj            ... 1102000300 <=  *All Rel* 07/05 16:30:20 PASS
.- ObjSynMissing               ... 1102000300 <=  *All Rel* 07/05 16:30:20 PASS
.- ObjSeqMissing               ... 1102000300 <=  *All Rel* 07/05 16:30:21 PASS
.- OrphanedUndo                ... 1102000300 <=  *All Rel* 07/05 16:30:21 PASS
.- OrphanedIndex               ... 1102000300 <=  *All Rel* 07/05 16:30:21 PASS
.- OrphanedIndexPartition      ... 1102000300 <=  *All Rel* 07/05 16:30:21 PASS
.- OrphanedIndexSubPartition   ... 1102000300 <=  *All Rel* 07/05 16:30:21 PASS
.- OrphanedTable               ... 1102000300 <=  *All Rel* 07/05 16:30:21 PASS
.- OrphanedTablePartition      ... 1102000300 <=  *All Rel* 07/05 16:30:21 PASS
.- OrphanedTableSubPartition   ... 1102000300 <=  *All Rel* 07/05 16:30:21 PASS
.- MissingPartCol              ... 1102000300 <=  *All Rel* 07/05 16:30:21 PASS
.- OrphanedSeg$                ... 1102000300 <=  *All Rel* 07/05 16:30:21 PASS
.- OrphanedIndPartObj#         ... 1102000300 <=  *All Rel* 07/05 16:30:21 PASS
.- DuplicateBlockUse           ... 1102000300 <=  *All Rel* 07/05 16:30:21 PASS
.- FetUet                      ... 1102000300 <=  *All Rel* 07/05 16:30:21 PASS
.- Uet0Check                   ... 1102000300 <=  *All Rel* 07/05 16:30:21 PASS
.- SeglessUET                  ... 1102000300 <=  *All Rel* 07/05 16:30:21 PASS
.- BadInd$                     ... 1102000300 <=  *All Rel* 07/05 16:30:21 PASS
.- BadTab$                     ... 1102000300 <=  *All Rel* 07/05 16:30:22 PASS
.- BadIcolDepCnt               ... 1102000300 <=  *All Rel* 07/05 16:30:22 PASS
.- ObjIndDobj                  ... 1102000300 <=  *All Rel* 07/05 16:30:22 PASS
.- TrgAfterUpgrade             ... 1102000300 <=  *All Rel* 07/05 16:30:22 PASS
.- ObjType0                    ... 1102000300 <=  *All Rel* 07/05 16:30:22 PASS
.- BadOwner                    ... 1102000300 <=  *All Rel* 07/05 16:30:22 PASS
.- StmtAuditOnCommit           ... 1102000300 <=  *All Rel* 07/05 16:30:22 PASS
.- BadPublicObjects            ... 1102000300 <=  *All Rel* 07/05 16:30:22 PASS
.- BadSegFreelist              ... 1102000300 <=  *All Rel* 07/05 16:30:22 PASS
.- BadDepends                  ... 1102000300 <=  *All Rel* 07/05 16:30:22 PASS
.- CheckDual                   ... 1102000300 <=  *All Rel* 07/05 16:30:23 PASS
.- ObjectNames                 ... 1102000300 <=  *All Rel* 07/05 16:30:23 WARN

HCKW-0018: OBJECT name clashes with SCHEMA name (Doc ID 2363142.1)
Schema=BSHRP INDEX=XFF.XFF

.- BadCboHiLo                  ... 1102000300 <=  *All Rel* 07/05 16:30:23 PASS
.- ChkIotTs                    ... 1102000300 <=  *All Rel* 07/05 16:30:24 PASS
.- NoSegmentIndex              ... 1102000300 <=  *All Rel* 07/05 16:30:24 PASS
.- BadNextObject               ... 1102000300 <=  *All Rel* 07/05 16:30:24 PASS
.- DroppedROTS                 ... 1102000300 <=  *All Rel* 07/05 16:30:24 PASS
.- FilBlkZero                  ... 1102000300 <=  *All Rel* 07/05 16:30:24 PASS
.- DbmsSchemaCopy              ... 1102000300 <=  *All Rel* 07/05 16:30:24 PASS
.- OrphanedObjError            ... 1102000300 >  1102000000 07/05 16:30:24 PASS
.- ObjNotLob                   ... 1102000300 <=  *All Rel* 07/05 16:30:24 PASS
.- MaxControlfSeq              ... 1102000300 <=  *All Rel* 07/05 16:30:24 PASS
.- SegNotInDeferredStg         ... 1102000300 >  1102000000 07/05 16:30:25 PASS
.- SystemNotRfile1             ... 1102000300 >   902000000 07/05 16:30:25 PASS
.- DictOwnNonDefaultSYSTEM     ... 1102000300 <=  *All Rel* 07/05 16:30:25 PASS
.- OrphanTrigger               ... 1102000300 <=  *All Rel* 07/05 16:30:25 PASS
.- ObjNotTrigger               ... 1102000300 <=  *All Rel* 07/05 16:30:25 PASS
---------------------------------------
05-7月 -2022 16:30:25  Elapsed: 7 secs
---------------------------------------
Found 0 potential problem(s) and 1 warning(s)
Contact Oracle Support with the output and trace file
to check if the above needs attention or not

PL/SQL 过程已成功完成。

有一个SCHEMA和对象名一样,这个不影响属于正常情况(客户创建了一个用户叫做XFF,然后有创建了一个XFF的对象),该数据库恢复至此基本上晚上,业务可以直接运行,不用做逻辑迁移

云主机快照之后Oracle无法正常启动处理

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:云主机快照之后Oracle无法正常启动处理

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

某客户数据库放在x云上面,需要对数据库盘进行扩容,在扩容之前对该盘做了快照,结果没有想到悲剧发生了

[root@xifenfei ~]# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/vda1        99G   64G   31G  68% /
devtmpfs         16G     0   16G   0% /dev
tmpfs            16G     0   16G   0% /dev/shm
tmpfs            16G  720K   16G   1% /run
tmpfs            16G     0   16G   0% /sys/fs/cgroup
/dev/vdb        2.0T  1.2T  910G  56% /www/xifenfei
tmpfs           3.2G     0  3.2G   0% /run/user/1004
tmpfs           3.2G     0  3.2G   0% /run/user/0

如上显示,客户的数据文件都放在/dev/vdb中了,但是很不幸,redo文件放在/data中(也就是vda磁盘组中),没有被做快照,结果客户还原vdb快照之后,发现现象如下

SQL> set pages 10000
SQL> set numw 16
SQL> SELECT status,
  2  checkpoint_change#,
  3  checkpoint_time,last_change#,
  4  count(*) ROW_NUM
  5  FROM v$datafile
  6  GROUP BY status, checkpoint_change#, checkpoint_time,last_change#
  7  ORDER BY status, checkpoint_change#, checkpoint_time;

STATUS         CHECKPOINT_CHANGE# CHECKPOINT_T     LAST_CHANGE#          ROW_NUM
-------------- ------------------ ------------ ---------------- ----------------
ONLINE                69632585947 04-JUL-22                                   38
SYSTEM                69632585947 04-JUL-22                                    2

SQL> set numw 16
SQL> col CHECKPOINT_TIME for a40
SQL> set lines 150
SQL> set pages 1000
SQL> SELECT status,
  2  to_char(checkpoint_time,'yyyy-mm-dd hh24:mi:ss') checkpoint_time,FUZZY,checkpoint_change#,
  3  count(*) ROW_NUM
  4  FROM v$datafile_header
  5  GROUP BY status, checkpoint_change#, to_char(checkpoint_time,'yyyy-mm-dd hh24:mi:ss'),fuzzy
  6  ORDER BY status, checkpoint_change#, checkpoint_time;

STATUS         CHECKPOINT_TIME                          FUZZY  CHECKPOINT_CHANGE#          ROW_NUM
-------------- ---------------------------------------- ------ ------------------ ----------------
ONLINE         2022-07-04 09:03:24                      YES           69631105424               40

20220704230638


通过上述分析,该库相当数据文件和redo文件之间相差了一段时间数据,而且该库为非归档,基于这种情况,该库只能强制打开,在打开过程中遇到ORA-600 ktpridestroy2错误

SMON: enabling tx recovery
Database Characterset is AL32UTF8
No Resource Manager plan active
replication_dependency_tracking turned off (no async multimaster replication found)
SMON: Restarting fast_start parallel rollback
Errors in file /data/oracle/diag/rdbms/orcl/orcl/trace/orcl_smon_7332.trc  (incident=41257):
ORA-00600: internal error code, arguments: [ktpridestroy2], [], [], [], [], [], [], [], [], [], [], []
Incident details in: /data/oracle/diag/rdbms/orcl/orcl/incident/incdir_41257/orcl_smon_7332_i41257.trc
Starting background process QMNC
Mon Jul 04 16:31:44 2022
QMNC started with pid=36, OS id=7454 
LOGSTDBY: Validating controlfile with logical metadata
LOGSTDBY: Validation complete
Fatal internal error happened while SMON was doing active transaction recovery.
Errors in file /data/oracle/diag/rdbms/orcl/orcl/trace/orcl_smon_7332.trc:
ORA-00600: internal error code, arguments: [ktpridestroy2], [], [], [], [], [], [], [], [], [], [], []
SMON (ospid: 7332): terminating the instance due to error 474
Instance terminated by SMON, pid = 7332

对应trace文件

Dump continued from file: /data/oracle/diag/rdbms/orcl/orcl/trace/orcl_smon_7332.trc
ORA-00600: internal error code, arguments: [ktpridestroy2], [], [], [], [], [], [], [], [], [], [], []

========= Dump for incident 41257 (ORA 600 [ktpridestroy2]) ========

*** 2022-07-04 16:31:44.261
dbkedDefDump(): Starting incident default dumps (flags=0x2, level=3, mask=0x0)
----- SQL Statement (None) -----
Current SQL information unavailable - no cursor.

----- Call Stack Trace -----
calling              call     entry                argument values in hex      
location             type     point                (? means dubious value)     
-------------------- -------- -------------------- ----------------------------
skdstdst()+36        call     kgdsdst()            000000000 ? 000000000 ?
                                                   7FFCD123B998 ? 000000001 ?
                                                   7FFCD123FE98 ? 000000000 ?
ksedst1()+98         call     skdstdst()           000000000 ? 000000000 ?
                                                   7FFCD123B998 ? 000000001 ?
                                                   000000000 ? 000000000 ?
ksedst()+34          call     ksedst1()            000000000 ? 000000001 ?
                                                   7FFCD123B998 ? 000000001 ?
                                                   000000000 ? 000000000 ?
dbkedDefDump()+2736  call     ksedst()             000000000 ? 000000001 ?
                                                   7FFCD123B998 ? 000000001 ?
                                                   000000000 ? 000000000 ?
ksedmp()+36          call     dbkedDefDump()       000000003 ? 000000002 ?
                                                   7FFCD123B998 ? 000000001 ?
                                                   000000000 ? 000000000 ?
ksfdmp()+64          call     ksedmp()             000000003 ? 000000002 ?
                                                   7FFCD123B998 ? 000000001 ?
                                                   000000000 ? 000000000 ?
dbgexPhaseII()+1764  call     ksfdmp()             000000003 ? 000000002 ?
                                                   7FFCD123B998 ? 000000001 ?
                                                   000000000 ? 000000000 ?
dbgexProcessError()  call     dbgexPhaseII()       7F3C5D15C6F0 ? 7F3C5A851598 ?
+2279                                              7FFCD1247C88 ? 000000001 ?
                                                   000000000 ? 000000000 ?
dbgeExecuteForError  call     dbgexProcessError()  7F3C5D15C6F0 ? 7F3C5A851598 ?
()+83                                              000000001 ? 000000000 ?
                                                   7FFC00000000 ? 000000000 ?
dbgePostErrorKGE()+  call     dbgeExecuteForError  7F3C5D15C6F0 ? 7F3C5A851598 ?
1615                          ()                   000000001 ? 000000001 ?
                                                   000000000 ? 000000000 ?
dbkePostKGE_kgsf()+  call     dbgePostErrorKGE()   000000000 ? 7F3C5A6C1228 ?
63                                                 000000258 ? 7F3C5A851598 ?
                                                   000000000 ? 000000000 ?
kgeadse()+383        call     dbkePostKGE_kgsf()   00A984C60 ? 7F3C5A6C1228 ?
                                                   000000258 ? 7F3C5A851598 ?
                                                   000000000 ? 000000000 ?
kgerinv_internal()+  call     kgeadse()            00A984C60 ? 7F3C5A6C1228 ?
45                                                 000000258 ? 000000000 ?
                                                   000000000 ? 000000000 ?
kgerinv()+33         call     kgerinv_internal()   00A984C60 ? 7F3C5A6C1228 ?
                                                   D124022000000000 ?
                                                   000000258 ? 000000000 ?
                                                   000000000 ?
kgeasnmierr()+143    call     kgerinv()            00A984C60 ? 7F3C5A6C1228 ?
                                                   D124022000000000 ?
                                                   000000000 ? 000000000 ?
                                                   000000000 ?
ktpridestroy()+912   call     kgeasnmierr()        00A984C60 ? 7F3C5A6C1228 ?
                                                   D124022000000000 ?
                                                   000000000 ? 1E0F02D40 ?
                                                   1EC6DA410 ?
ktprw1s()+527        call     ktpridestroy()       D124022000000000 ?
                                                   000000000 ? 1E7A1C2B0 ?
                                                   000000000 ? 1E0F02D40 ?
                                                   1EC6DA410 ?
ktprsched()+197      call     ktprw1s()            D124022000000000 ?
                                                   000000000 ? 1E7A1C2B0 ?
                                                   000000000 ? 1E0F02D40 ?
                                                   1EC6DA410 ?
kturRecoverUndoSegm  call     ktprsched()          D124022000000000 ?
ent()+1057                                         000000000 ? 1E7A1C2B0 ?
                                                   000000000 ? 1E0F02D40 ?
                                                   1EC6DA410 ?
kturRecoverActiveTx  call     kturRecoverUndoSegm  000000000 ? 000000000 ?
ns()+710                      ent()                000000001 ? 000000000 ?
                                                   0D124FFFF ? 6200000005 ?
ktprbeg()+2506       call     kturRecoverActiveTx  000000004 ? 000000000 ?
                              ns()                 000000027 ? 000000000 ?
                                                   0D124FFFF ? 6200000005 ?
ktmmon()+13588       call     ktprbeg()            000000000 ? 000000000 ?
                                                   000000027 ? 000000000 ?
                                                   0D124FFFF ? 6200000005 ?
ktmSmonMain()+201    call     ktmmon()             06002DEC0 ? 000000000 ?
                                                   000000027 ? 000000000 ?
                                                   0D124FFFF ? 6200000005 ?
ksbrdp()+923         call     ktmSmonMain()        06002DEC0 ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   0D124FFFF ? 6200000005 ?
opirip()+618         call     ksbrdp()             06002DEC0 ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   0D124FFFF ? 6200000005 ?
opidrv()+598         call     opirip()             000000032 ? 000000004 ?
                                                   7FFCD124B658 ? 000000000 ?
                                                   0D124FFFF ? 6200000005 ?
sou2o()+98           call     opidrv()             000000032 ? 000000004 ?
                                                   7FFCD124B658 ? 000000000 ?
                                                   0D124FFFF ? 6200000005 ?
opimai_real()+261    call     sou2o()              7FFCD124B630 ? 000000032 ?
                                                   000000004 ? 7FFCD124B658 ?
                                                   0D124FFFF ? 6200000005 ?
ssthrdmain()+209     call     opimai_real()        000000000 ? 7FFCD124B820 ?
                                                   000000004 ? 7FFCD124B658 ?
                                                   0D124FFFF ? 6200000005 ?
main()+196           call     ssthrdmain()         000000003 ? 7FFCD124B820 ?
                                                   000000001 ? 000000000 ?
                                                   0D124FFFF ? 6200000005 ?
__libc_start_main()  call     main()               000000003 ? 7FFCD124B9C0 ?
+245                                               000000001 ? 000000000 ?
                                                   0D124FFFF ? 6200000005 ?
_start()+36          call     __libc_start_main()  0009C12F0 ? 000000001 ?
                                                   7FFCD124B9B8 ? 000000000 ?
                                                   0D124FFFF ? 6200000005 ?
--------------------- Binary Stack Dump ---------------------

通过分析确认该错误和并行恢复有关系,绕过该错误之后,再次尝试启动库报错为ORA-600 4137

Mon Jul 04 16:33:41 2022
SMON: enabling cache recovery
Verifying file header compatibility for 11g tablespace encryption..
Verifying 11g file header compatibility for tablespace encryption completed
SMON: enabling tx recovery
Database Characterset is AL32UTF8
Errors in file /data/oracle/diag/rdbms/orcl/orcl/trace/orcl_smon_7554.trc  (incident=42457):
ORA-00600: internal error code, arguments: [4137], [6.11.21484016], [0], [0], [], [], [], [], [], [], [], []
Incident details in: /data/oracle/diag/rdbms/orcl/orcl/incident/incdir_42457/orcl_smon_7554_i42457.trc
Stopping background process MMNL
Errors in file /data/oracle/diag/rdbms/orcl/orcl/trace/orcl_smon_7554.trc:
ORA-00339: archived log does not contain any redo
ORA-00334: archived log: '/data/oracle/oradata/orcl/redo03.log'
ORA-00600: internal error code, arguments: [4137], [6.11.21484016], [0], [0], [], [], [], [], [], [], [], []
Errors in file /data/oracle/diag/rdbms/orcl/orcl/trace/orcl_smon_7554.trc:
ORA-00339: archived log does not contain any redo
ORA-00334: archived log: '/data/oracle/oradata/orcl/redo03.log'
ORA-00600: internal error code, arguments: [4137], [6.11.21484016], [0], [0], [], [], [], [], [], [], [], []
ORACLE Instance orcl (pid = 13) - Error 600 encountered while recovering transaction (6, 11).
Errors in file /data/oracle/diag/rdbms/orcl/orcl/trace/orcl_smon_7554.trc:
ORA-00600: internal error code, arguments: [4137], [6.11.21484016], [0], [0], [], [], [], [], [], [], [], []

该错误比较常见,一般是由于undo中有异常事务,对异常事务进行处理,数据库open成功,并顺利导入数据到新库中,完成本次数据恢复

ORA-600 2032故障处理

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:ORA-600 2032故障处理

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

有客户数据库,异常断电之后,数据库运行不稳定(经常性的重启),通过分析发现

Wed Jun 29 01:04:39 2022
Completed: alter database open
Wed Jun 29 01:04:42 2022
Errors in file e:\oracle\product\10.2.0\admin\orcl\bdump\orcl_j000_3284.trc:
ORA-12012: error on auto execute of job 1
ORA-01578: ORACLE data block corrupted (file # 2, block # 552)
ORA-01110: data file 2: 'E:\ORACLE\PRODUCT\10.2.0\ORADATA_2\ORCL\UNDOTBS01.DBF'
…………
Wed Jun 29 01:13:28 2022
Non-fatal internal error happenned while SMON was doing logging scn->time mapping.
SMON encountered 1 out of maximum 100 non-fatal internal errors.
Wed Jun 29 01:15:34 2022
Errors in file e:\oracle\product\10.2.0\admin\orcl\bdump\orcl_m000_5488.trc:
ORA-00600: internal error code, arguments: [6002], [6], [48], [5], [0], [], [], []
Wed Jun 29 01:20:54 2022
Non-fatal internal error happenned while SMON was doing logging scn->time mapping.
SMON encountered 2 out of maximum 100 non-fatal internal errors.
Wed Jun 29 01:20:55 2022
Errors in file e:\oracle\product\10.2.0\admin\orcl\bdump\orcl_smon_6956.trc:
ORA-00600: internal error code, arguments: [2032], [8389160], [8389160], [8192], [2], [255], [0], [767]

Non-fatal internal error happenned while SMON was doing flushing of monitored table stats.
SMON encountered 3 out of maximum 100 non-fatal internal errors.
Wed Jun 29 01:20:57 2022
Errors in file e:\oracle\product\10.2.0\admin\orcl\bdump\orcl_smon_6956.trc:
ORA-00600: internal error code, arguments: [2032], [8389160], [8389160], [8192], [2], [255], [0], [767]
………………
Wed Jun 29 01:21:41 2022
Errors in file e:\oracle\product\10.2.0\admin\orcl\bdump\orcl_q001_2124.trc:
ORA-00474: SMON process terminated with error

Wed Jun 29 01:21:42 2022
Errors in file e:\oracle\product\10.2.0\admin\orcl\bdump\orcl_dbw0_3376.trc:
ORA-00474: SMON process terminated with error

Wed Jun 29 01:21:42 2022
Errors in file e:\oracle\product\10.2.0\admin\orcl\bdump\orcl_reco_2412.trc:
ORA-00474: SMON process terminated with error

Instance terminated by PMON, pid = 7160

对ora-600 2032进行分析

*** 2022-06-29 01:13:26.907
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [2032], [8389160], [8389160], [8192], [2], [255], [0], [767]
Current SQL statement for this session:
insert into smon_scn_time (thread, time_mp, time_dp, scn, scn_wrp, scn_bas,  num_mappings, 
tim_scn_map) values (0, :1, :2, :3, :4, :5, :6, :7)
check trace file e:\oracle\product\10.2.0\db_2\rdbms\trace\orcl_ora_0.trc for preloading .sym file messages
----- Call Stack Trace -----
calling              call     entry                argument values in hex      
location             type     point                (? means dubious value)     
-------------------- -------- -------------------- ----------------------------
ksedmp+663           CALL???  ksedst+55            003C878B8 000000000 00FF57178
                                                   000000000
ksfdmp+19            CALL???  ksedmp+663           000000003 00ED07680 010453DA8
                                                   003CACC80
kgeriv+184           CALL???  ksfdmp+19            7FF2378C000 7FF5E2F81D8
                                                   7FF5EBB9300 7FF5EBB92F0
kgesiv+102           CALL???  kgeriv+184           000000002 00ED07040 000000002
                                                   00FF59110
ksesic7+125          CALL???  kgesiv+102           7FFFFFFF00800228 100000002
                                                   200000001 00FF59328
kcopcv+1014          CALL???  ksesic7+125          0000007F0 000000000 000800228
                                                   000000000
kcbchg1_main+3115    CALL???  kcopcv+1014          00001E778 7FF00000001
                                                   00001E980 00001E650
kcbchg1+238          CALL???  kcbchg1_main+3115    00ED07040 00ED07040 000040013
                                                   000000000
ktuchg+1331          CALL???  kcbchg1+238          7FF00000000 000000003
                                                   00FF597C8 00FF59780
ktbchg2+341          CALL???  ktuchg+1331          000000000 000000401 000000046
                                                   000BB5D21
kdtchg+916           CALL???  ktbchg2+341          000000001 000000000 000000240
                                                   000000000
kdtwrp+2582          CALL???  kdtchg+916           01077A268 01044404C 010444054
                                                   010441710
kdtInsRow+705        CALL???  kdtwrp+2582          01077A268 7FF5E2F8510
                                                   003CB3358 0009C20C4
insrowFastPath+125   CALL???  kdtInsRow+705        00ED07040 7FF55A86E30
                                                   7FF58E87918 00001F318
insdrvFastPath+478   CALL???  insrowFastPath+125   000000067 000000000 000000000
                                                   000000000
inscovexe+434        CALL???  insdrvFastPath+478   01077A268 000000000 010440070
                                                   00521E0C9
insExecStmtExecIniE  CALL???  inscovexe+434        7FF55A88FD8 7FF55A86E30
ngine+99                                           00FF5BAD8 0009FA3EC
insexe+453           CALL???  insExecStmtExecIniE  8B2D554D9623 000000006
                              ngine+99             000000006 0000000C0
opiexe+4991          CALL???  insexe+453           7FF55A885A0 00FF5BAD8
                                                   000000102 000000000
opiall0+1931         CALL???  opiexe+4991          7FF00000049 000000003
                                                   00FF5C160 000000020
opikpr+660           CALL???  opiall0+1931         000000065 000000022 00FF5C638
                                                   000000000
opiodr+1136          CALL???  opikpr+660           000000065 000000017 01044B798
                                                   07EEF1FCF
rpidrus+230          CALL???  opiodr+1136          000000065 000000017 01044B798
                                                   80005900000000
rpidru+112           CALL???  rpidrus+230          00FF5D3F0 000000003 01078D4A0
                                                   7FF5B6F1A80
rpiswu2+517          CALL???  rpidru+112           7FF5DA52BB8 000000000
                                                   000000000 000000008
kprball+1446         CALL???  rpiswu2+517          7FF5E408820 000000000
                                                   00FF5DAD0 000000002
ktf_scn_time+4951    CALL???  kprball+1446         01044B798 8B2D00000140
                                                   000000000 000000005
ktmmon+4107          CALL???  ktf_scn_time+4951    000000000 000000001 07FFFFFFF
                                                   00521A3C2
ktmSmonMain+26       CALL???  ktmmon+4107          0049CBAC0 000000004
                                                   8B2D554D9623 00001E768
ksbrdp+903           CALL???  ktmSmonMain+26       003CC2A18 0049CBADC 000000008
                                                   000000004
opirip+700           CALL???  ksbrdp+903           726F77740000001E 003C8B000
                                                   00FF5FA30 000000000
opidrv+860           CALL???  opirip+700           000000032 000000004 00FF5FD50
                                                   000000000
sou2o+52             CALL???  opidrv+860           000000032 000000004 00FF5FD50
                                                   000000003
opimai_real+272      CALL???  sou2o+52             000000000 000000000 000000000
                                                   000000000
opimai+96            CALL???  opimai_real+272      000000000 000000000 000000000
                                                   000000000
BackgroundThreadSta  CALL???  opimai+96            00FF5FEA8 000000001 000000000
rt+633                                             000000000
000000007738F56D     CALL???  BackgroundThreadSta  0069E4590 000000000 000000000
                              rt+633               000000000
0000000077703021     CALL???  000000007738F56D     000000000 000000000 000000000
                                                   000000000
 
--------------------- Binary Stack Dump ---------------------

通过该trace和alert日志信息可以确认是由于smon_scn_time的操作需要使用到undo,但是对应的undo block异常,从而使得该操作失败,进而引起数据库smon进程异常从而引起ORA-00474,数据库自动crash.处理问题比较简单:
1. 对异常undo进行处理,创建新undo,删除老undo
2. 对于smon_scn_time异常数据进行处理

ORA-15063: ASM discovered an insufficient number of disks for diskgroup 恢复

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:ORA-15063: ASM discovered an insufficient number of disks for diskgroup 恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

客户反馈三个磁盘组无法正常mount,报错类似ORA-15032 ORA-15017 ORA-15063

SQL> ALTER DISKGROUP ASM_DATA MOUNT  /* asm agent *//* {0:0:2} */ 
NOTE: cache registered group ASM_DATA number=1 incarn=0xffa85ccd
NOTE: cache began mount (first) of group ASM_DATA number=1 incarn=0xffa85ccd
ERROR: no read quorum in group: required 2, found 0 disks
NOTE: cache dismounting (clean) group 1/0xFFA85CCD (ASM_DATA) 
NOTE: messaging CKPT to quiesce pins Unix process pid: 5709, image: oracle@XFF (TNS V1-V3)
NOTE: dbwr not being msg'd to dismount
NOTE: lgwr not being msg'd to dismount
NOTE: cache dismounted group 1/0xFFA85CCD (ASM_DATA) 
NOTE: cache ending mount (fail) of group ASM_DATA number=1 incarn=0xffa85ccd
NOTE: cache deleting context for group ASM_DATA 1/0xffa85ccd
Tue Jun 21 12:24:38 2022
NOTE: No asm libraries found in the system
ASM Health Checker found 1 new failures
GMON dismounting group 1 at 16 for pid 19, osid 5709
ERROR: diskgroup ASM_DATA was not mounted
ORA-15032: not all alterations performed
ORA-15017: diskgroup "ASM_DATA" cannot be mounted
ORA-15063: ASM discovered an insufficient number of disks for diskgroup "ASM_DATA"
ERROR: ALTER DISKGROUP ASM_DATA MOUNT  /* asm agent *//* {0:0:2} */

初步判断是asm disk异常导致(比如asm disk不能被扫描到,或者丢失,或者磁盘头损坏等),分析客户的asm disk的udev文件配置

KERNEL=="sdd1", NAME="asm_grid", OWNER="grid", GROUP="asmadmin", MODE="0660"          
KERNEL=="sde1", NAME="asm_system", OWNER="grid", GROUP="asmadmin", MODE="0660"    
KERNEL=="sdf1", NAME="asm_data", OWNER="grid", GROUP="asmadmin", MODE="0660"     

从udev的配置中可以看出来,客户以前是对3个磁盘进行分析,然后使用udev映射别名给asm使用的.通过对其中一个磁盘进行分析
20220621220634
20220621220728


通过上述winhex查看,可以确认该分区的磁盘头信息异常[该信息属于磁盘刚分区的时候信息,而不是asm disk的信息],和kfed看到的结果一致[磁盘头位置肯定损坏,其他位置目前未知]

H:\TEMP\dd>kfed read sdf_sdf1.dd
kfbh.endian:                          0 ; 0x000: 0x00
kfbh.hard:                            0 ; 0x001: 0x00
kfbh.type:                            0 ; 0x002: KFBTYP_INVALID
kfbh.datfmt:                          0 ; 0x003: 0x00
kfbh.block.blk:                       0 ; 0x004: blk=0
kfbh.block.obj:                       0 ; 0x008: file=0
kfbh.check:                           0 ; 0x00c: 0x00000000
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
0064D8400 00000000 00000000 00000000 00000000  [................]
        Repeat 26 times
0064D85B0 00000000 00000000 00000000 02000000  [................]
0064D85C0 FE8E0001 003FFFFF DFFC0000 0000257F  [......?......%..]
0064D85D0 00000000 00000000 00000000 00000000  [................]
        Repeat 1 times
0064D85F0 00000000 00000000 00000000 AA550000  [..............U.]
0064D8600 00000000 00000000 00000000 00000000  [................]
  Repeat 223 times
KFED-00322: Invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][0]

分析其他位置的block情况,初步看基本上ok[运气还不错]

H:\TEMP\dd>kfed read sdf_sdf1.dd blkn=2|grep kfbh.type
kfbh.type:                            3 ; 0x002: KFBTYP_ALLOCTBL

H:\TEMP\dd>kfed read sdf_sdf1.dd blkn=3|grep kfbh.type
kfbh.type:                            3 ; 0x002: KFBTYP_ALLOCTBL

H:\TEMP\dd>kfed read sdf_sdf1.dd blkn=1 aun=2|grep kfbh.type
kfbh.type:                            3 ; 0x002: KFBTYP_ALLOCTBL

通过检索备份出来的部分磁盘文件,找出来ORCLDISK信息部分(asm disk header)
20220621221843


然后利用这个部分对损坏的磁盘头进行修复,并且dd回生产环境中,并尝试mount磁盘组,数据库open成功
20220621181430
20220621222356


至此这个数据库运气不错,没有过多损坏,算完美恢复,可以进行了逻辑导出和rman备份,全部正常.为了后续安全,建议对其进行迁移