ora-600 2662和ora-600 kclchkblk_4恢复

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:ora-600 2662和ora-600 kclchkblk_4恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

这两天连续处理两个case,一个是12.1.0.2版本数据库屏蔽一致性,强制open之后,报ORA-600 2662故障
20210429220218


这个错误本身是一个非常常见的错误,直接推scn即可解决,但是问题是12.1.0.2版本,oracle不允许以前常规的操作方法,就连oradebug都报错oradebug poke ORA-32521/ORA-32519故障解决,而且可以是rac环境,bbed修改文件头也相当麻烦,最后我们使用patch方法轻松解决

另外一例是11.2.0.4版本,强制open库报ora-600 kclchkblk_4

Wed Apr 28 21:25:38 2021
SMON: enabling cache recovery
Instance recovery: looking for dead threads
Instance recovery: lock domain invalid but no dead threads
Errors in file /u01/app/oracle/diag/rdbms/dc/dc1/trace/dc1_ora_27832.trc  (incident=564430):
ORA-00600: internal error code, arguments: [kclchkblk_4], [2959], [904341694], [2959], [904131717], []
Incident details in: /u01/app/oracle/diag/rdbms/dc/dc1/incident/incdir_564430/dc1_ora_27832_i564430.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file /u01/app/oracle/diag/rdbms/dc/dc1/trace/dc1_ora_27832.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00600: internal error code, arguments: [kclchkblk_4], [2959], [904341694], [2959], [904131717], []
Errors in file /u01/app/oracle/diag/rdbms/dc/dc1/trace/dc1_ora_27832.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00600: internal error code, arguments: [kclchkblk_4], [2959], [904341694], [2959], [904131717], []
Error 704 happened during db open, shutting down database
USER (ospid: 27832): terminating the instance due to error 704
Instance terminated by USER, pid = 27832
ORA-1092 signalled during: alter database open resetlogs...

这个比较简单,参考redo异常 ORA-600 kclchkblk_4 故障恢复.

ORA-600 kcratr_scan_lastbwr 恢复

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:ORA-600 kcratr_scan_lastbwr 恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

有朋友找到我们,系统断电之后,数据库无法正常启动,报ora-600 kcratr_scan_lastbwr错误

Thu Mar 25 20:33:45 2021
Database mounted in Exclusive Mode
Lost write protection disabled
Completed: alter database mount exclusive
alter database open
Ping without log force is disabled
.
Thu Mar 25 20:33:47 2021
Beginning crash recovery of 1 threads
 parallel recovery started with 32 processes
Thu Mar 25 20:33:47 2021
Started redo scan
Hex dump of (file 10, block 176517) in trace file C:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\trace\orcl_ora_4176.trc

Reading datafile 'C:\APP\ADMINISTRATOR\ORADATA\ORCL\XFF.DBF' for corruption at rdba: 0x0282b185 (file 10, block 176517)
Reread (file 10, block 176517) found same corrupt data (logically corrupt)
Write verification failed for File 10 Block 176517 (rdba 0x282b185)
Errors in file C:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\trace\orcl_ora_4176.trc  (incident=165355):
ORA-00600: ??????, ??: [kcratr_scan_lastbwr], [], [], [], [], [], [], [], [], [], [], []
Incident details in: C:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\incident\incdir_165355\orcl_ora_4176_i165355.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Thu Mar 25 20:33:50 2021
Slave encountered ORA-10388 exception during crash recovery
Thu Mar 25 20:33:50 2021
Slave encountered ORA-10388 exception during crash recovery
Thu Mar 25 20:33:50 2021
Aborting crash recovery due to error 600
Thu Mar 25 20:33:59 2021
Errors in file C:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\trace\orcl_ora_4176.trc:
ORA-00600: ??????, ??: [kcratr_scan_lastbwr], [], [], [], [], [], [], [], [], [], [], []
Thu Mar 25 20:33:59 2021
Errors in file C:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\trace\orcl_ora_4176.trc:
ORA-00600: ??????, ??: [kcratr_scan_lastbwr], [], [], [], [], [], [], [], [], [], [], []
ORA-600 signalled during: alter database open...

故障原因,写丢失导致

Crash or instance recovery may fail because of a lost write even
though one of the mirrors has a good copy.  Reading a file header 
can corrupt a good mirror copy with a bad one.
 
Rediscovery Notes:
 ORA-600 [kcratr_scan_lostwrt] or ORA-600 [kcratr_scan_lastbwr] are signaled 
 even though one of the mirrors has a good copy.

解决方案比较简单直接recover顺利open库
20210326113024


对恢复案例:因对工作调整不满,链家一员工删除公司 9 TB数据:被判7年事件有感

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:对恢复案例:因对工作调整不满,链家一员工删除公司 9 TB数据:被判7年事件有感

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

从业生涯中恢复过各种删库的场景(有恶意,有无意),有rm数据库,rm数据文件,drop 数据库,dd asm磁盘头,drop table,truncate table,delete table等等.
两年前的一次给链家恢复EBS数据库案例,数据库服务器被人恶意rm 删除了/u01,/u02,rman备份目录等文件(还有ebs服务器相关程序也被删除,由另外一家公司对其进行处理),导致oracle集群异常,无可用备份恢复,通过对asm磁盘组直接恢复,实现数据0丢失.最近有了法院的终审判决:
北京市海淀区人民法院认为,被告人韩冰违反国家规定,对计算机信息系统中存储的数据和应用程序进行删除,造成计算机信息系统不能正常运行,后果特别严重,其行为已构成破坏计算机信息系统罪,依法应予惩处.海淀法院判决:被告人韩冰犯破坏计算机信息系统罪,判处有期徒刑七年。
由衷的感慨一个40岁的老dba,为了一时的赌气冲动的做出了错误的事情,需要使用自己的7年青春去弥补,实在不值得.
1000


具体的关于判决的描述,参考:因对工作调整不满,链家一员工删除公司 9 TB数据:被判7年
我们做dba的,把握了企业的数据命脉,是很多企业最核心的资产,切莫因为一些工作/生活中的赌气去破坏数据库,引起系统无法正常使用,甚至导致数据丢失,给企业带来损失的同时,也让自己失去自由.
对于我们维护的数据库:切莫冲动!!!切莫冲动!!!切莫冲动!!!

ORA-27303: failure occurred at: skgpwinit6

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:ORA-27303: failure occurred at: skgpwinit6

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

在客户联系我们,数据库在运行过程中突然报ORA-27140 ORA-27300 ORA-27301 ORA-27302 ORA-27303: failure occurred at: skgpwinit6错误
20201231173708


通过Startup Instance Failed with ORA-27140 ORA-27300 ORA-27301 ORA-27302 and ORA-27303on skgpwinit6 (Doc ID 1274030.1)分析,确认可能是由于$ORACLE_HOME/bin/oracle文件权限异常导致,通过分析确认是有系统被chmod修改了权限导致数据库异常.把oracle全下修改为chmod 6751 oracle后,数据库启动正常

dblink会话引起library cache lock

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:dblink会话引起library cache lock

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

有客户反馈,系统最近几天晚上都有卡顿
alert日志里面报如下错误
20201213180505


查看对应的trace文件
20201213180555

确定在收集统计信息的时候报ORA-04021错误.
查看这段时间ash报告
20201213180855

大量的library cache lock等待,而且引起等待的是类似语句,主要都是集中在一张表上,和收集统计信息报错的trace表匹配.
正好当天有对该表进行增加分区维护hang住
20201213181100

查询等待事件和阻塞情况

SQL> select distinct sid,a.BLOCKING_SESSION_STATUS,a.BLOCKING_INSTANCE,a.BLOCKING_SESSION
  2  ,event from gv$session a where sid=7399;

       SID BLOCKING_SE BLOCKING_INSTANCE BLOCKING_SESSION
---------- ----------- ----------------- ----------------
EVENT
----------------------------------------------------------------
      7399 VALID                       3             3593
library cache lock

SQL>  select a.INST_ID,a.sid,a.paddr,a.sql_id,a.event,a.MACHINE,a.PROGRAM 
   2 ,status from gv$session a where a.sid=3593;

   INST_ID        SID PADDR            SQL_ID
---------- ---------- ---------------- -------------
EVENT
----------------------------------------------------------------
MACHINE
----------------------------------------------------------------
PROGRAM                                          STATUS
------------------------------------------------ --------
         3       3593 0000001F91E80670 grxhz2vpmsrc6
SQL*Net message from dblink
WORKGROUP\XG
PlatformSyn.exe                                  ACTIVE

由于对应的sql_id 无法找到sql语句,不过根据等待事件基本上确认是调用一个dblink导致该问题,通过查询该回话,发现该回话一致处于active状态,但是一致无任何变化,实在可能处于僵死状态,对其进行kill之后,增加分区正常,收集统计信息正常.

再次遇到ORA-600 kokasgi1故障恢复

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:再次遇到ORA-600 kokasgi1故障恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

接到一个朋友数据库无法启动,报错ORA-00600: 内部错误代码, 参数: [kokasgi1]错误:

Sun Nov 29 14:23:15 2020
Thread 1 advanced to log sequence 28572 (thread open)
Thread 1 opened at log sequence 28572
  Current log# 3 seq# 28572 mem# 0: /u01/app/oracle/oradata/xifenfei/redo03.log
Successful open of redo thread 1
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Sun Nov 29 14:23:15 2020
SMON: enabling cache recovery
[3505] Successfully onlined Undo Tablespace 2.
Undo initialization finished serial:0 start:135724 end:135754 diff:30 (0 seconds)
Verifying file header compatibility for 11g tablespace encryption..
Verifying 11g file header compatibility for tablespace encryption completed
SMON: enabling tx recovery
Database Characterset is ZHS16GBK
ORA-00600: 内部错误代码, 参数: [kokasgi1], [], [], [], [], [], [], [], [], [], [], []
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file /u01/app/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_ora_3505.trc:
ORA-00600: 内部错误代码, 参数: [kokasgi1], [], [], [], [], [], [], [], [], [], [], []
Errors in file /u01/app/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_ora_3505.trc:
ORA-00600: 内部错误代码, 参数: [kokasgi1], [], [], [], [], [], [], [], [], [], [], []
Error 600 happened during db open, shutting down database
USER (ospid: 3505): terminating the instance due to error 600
Instance terminated by USER, pid = 3505
ORA-1092 signalled during: alter database open...
opiodr aborting process unknown ospid (3505) as a result of ORA-1092

对于这个错误,以前处理过一些类似故障,参考:
ORA-600 kokasgi1故障恢复
重命名sys用户引起数据库启动报ORA-01092 ORA-00600 kokasgi1错误
通过技术手段实现数据库open,查询user$信息
20201129175609


确认sys和system被分别被修改为sysa和systema,由于这个调整导致数据库无法正常启动.把sysa修改为sys,systema修改为system,数据库正常启动,业务测试正常.
20201129175803

sysaux表空间不足—WRH$_ACTIVE_SESSION_HISTORY

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:sysaux表空间不足—WRH$_ACTIVE_SESSION_HISTORY

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

一个老生常谈的问题遇到oracle sysaux表空间不足,通过awrinfo 分析,初步判断很可能是ash相关表没有正常清理分区导致
20201128205515


进一步分析确认异常对象
20201128205618

20201128205704

通过分析基本上可以确认是由于WRH$_ACTIVE_SESSION_HISTORY没有创建新分析,数据未被及时清理.根据官方WRH$_ACTIVE_SESSION_HISTORY Does Not Get Purged (Doc ID 387914.1)处理建议
alter session set “_swrf_test_action” = 72;
增加新分区,过一段时间之后,自动清理WRH$_ACTIVE_SESSION_HISTORY表历史数据.
20201128210106

对于sysaux表空间不足的情况,还遇到过类似情况:
sysaux中HEATMAP 对象较大
WRI$_ADV_OBJECTS表过大,导致SYSAUX表空间不足

模拟19c数据库root pdb undo异常恢复

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:模拟19c数据库root pdb undo异常恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

对于19c在pdb情况下三种常见故障进行了模拟测试:
模拟19c数据库redo异常恢复
模拟19c数据库pdb undo异常恢复
模拟19c数据库root pdb undo异常恢复
模拟oracle 19c cdb模式下root pdb中undo丢失故障恢复
会话1,pdb中插入大量数据,未提交

SQL> alter session set container=pdb; 

Session altered.

SQL> alter database open;

Database altered.

SQL> create user xff identified by oracle default tablespace users;
grant dba to xff;
conn xff/oracle@127.0.0.1/pdb
create table t_xifenfei as select * from dba_objects;
insert into t_xifenfei select * from t_xifenfei;
insert into t_xifenfei select * from t_xifenfei;
insert into t_xifenfei select * from t_xifenfei;
insert into t_xifenfei select * from t_xifenfei;
insert into t_xifenfei select * from t_xifenfei;

User created.

SQL> 
Grant succeeded.

SQL> Connected.
SQL> 

Table created.

SQL> 
72351 rows created.

SQL> 
144702 rows created.

SQL> 
289404 rows created.

SQL> 
578808 rows created.

SQL> 

1157616 rows created.

SQL> SQL> SQL> 

会话2中root pdb模拟事务

[oracle@localhost ~]$ ss

SQL*Plus: Release 19.0.0.0.0 - Production on Mon Nov 16 16:56:01 2020
Version 19.5.0.0.0

Copyright (c) 1982, 2019, Oracle.  All rights reserved.


Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.5.0.0.0

SQL> 
SQL> 
SQL> conn system/oracle
Connected.
SQL> create table t_xifenfei tablespace users as select * from dba_objects;
insert into t_xifenfei select * from t_xifenfei;
insert into t_xifenfei select * from t_xifenfei;
insert into t_xifenfei select * from t_xifenfei;
insert into t_xifenfei select * from t_xifenfei;
insert into t_xifenfei select * from t_xifenfei;

Table created.

SQL> 
72380 rows created.

SQL> 
144760 rows created.

SQL> 
289520 rows created.

SQL> 
579040 rows created.

SQL> 

1158080 rows created.

SQL> SQL> 

会话3 abort库并删除root pdb中undo文件

[oracle@localhost ~]$ ss

SQL*Plus: Release 19.0.0.0.0 - Production on Mon Nov 16 16:56:55 2020
Version 19.5.0.0.0

Copyright (c) 1982, 2019, Oracle.  All rights reserved.


Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.5.0.0.0

SQL> shutdown abort;
ORACLE instance shut down.
SQL> 

[oracle@localhost oradata]$ cd ORA19C
[oracle@localhost ORA19C]$ ls
control01.ctl  control02.ctl  pdb  pdbseed  redo01.log  redo02.log  redo03.log  sysaux01.dbf
system01.dbf  temp01.dbf  undotbs01.dbf  users01.dbf
[oracle@localhost ORA19C]$ rm -rf undotbs01.dbf 

启动数据库报ORA-01157 ORA-01110错误

SQL> alter database datafile 4 offline drop;

Database altered.

SQL> alter database open;

Database altered.

SQL> show pdbs;

    CON_ID CON_NAME                       OPEN MODE  RESTRICTED
---------- ------------------------------ ---------- ----------
         2 PDB$SEED                       READ ONLY  NO
         3 PDB                            MOUNTED
SQL> alter session set container=pdb;

Session altered.

SQL> alter database open;

Database altered.

SQL> conn / as sysdba
Connected.
SQL> show pdbs;

    CON_ID CON_NAME                       OPEN MODE  RESTRICTED
---------- ------------------------------ ---------- ----------
         2 PDB$SEED                       READ ONLY  NO
         3 PDB                            READ WRITE NO
 
SQL> SQL> select tablespace_name,segment_name,status from dba_rollback_segs;

TABLESPACE_NAME                SEGMENT_NAME                   STATUS
------------------------------ ------------------------------ --------------------------------
SYSTEM                         SYSTEM                         ONLINE
UNDOTBS1                       _SYSSMU1_1261223759$           NEEDS RECOVERY
UNDOTBS1                       _SYSSMU2_27624015$             NEEDS RECOVERY
UNDOTBS1                       _SYSSMU3_2421748942$           NEEDS RECOVERY
UNDOTBS1                       _SYSSMU4_625702278$            NEEDS RECOVERY
UNDOTBS1                       _SYSSMU5_2101348960$           NEEDS RECOVERY
UNDOTBS1                       _SYSSMU6_813816332$            NEEDS RECOVERY
UNDOTBS1                       _SYSSMU7_2329891355$           NEEDS RECOVERY
UNDOTBS1                       _SYSSMU8_399776867$            NEEDS RECOVERY
UNDOTBS1                       _SYSSMU9_1692468413$           NEEDS RECOVERY
UNDOTBS1                       _SYSSMU10_930580995$           NEEDS RECOVERY

本次测试比较幸运,虽然undo段状态为NEEDS RECOVERY,但是数据库直接open成功.实际生产情况,可能比这个要复杂很多

模拟19c数据库pdb undo异常恢复

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:模拟19c数据库pdb undo异常恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

对于19c在pdb情况下三种常见故障进行了模拟测试:
模拟19c数据库redo异常恢复
模拟19c数据库pdb undo异常恢复
模拟19c数据库root pdb undo异常恢复
测试在有事务的情况下,删除pdb中的undo数据库异常情况测试
会话1在root pdb中删除表记录,不提交

SQL> show pdbs;

    CON_ID CON_NAME                       OPEN MODE  RESTRICTED
---------- ------------------------------ ---------- ----------
         2 PDB$SEED                       READ ONLY  NO
         3 PDB                            READ WRITE NO
SQL> delete from system.t_xifenfei;

2316160 rows deleted.

会话2在pdb中删除表记录,不提交

SQL> conn xff/oracle@127.0.0.1/pdb
Connected.
SQL> delete from xff.t_xifenfei;

72351 rows deleted.

会话3 直接abort库

SQL> shutdown abort;
ORACLE instance shut down.

删除pdb中undo文件

[oracle@localhost pdb]$ rm -rf undotbs01.dbf 
[oracle@localhost pdb]$ 

启动数据库

SQL> startup
ORACLE instance started.

Total System Global Area 4999609088 bytes
Fixed Size                  9145088 bytes
Variable Size             905969664 bytes
Database Buffers         4076863488 bytes
Redo Buffers                7630848 bytes
Database mounted.
ORA-01157: cannot identify/lock data file 11 - see DBWR trace file
ORA-01110: data file 11: '/u01/app/oracle/oradata/ORA19C/pdb/undotbs01.dbf'

offline异常文件,再open库

SQL> alter database datafile 11 offline drop;
alter database datafile 11 offline drop
*
ERROR at line 1:
ORA-01516: nonexistent log file, data file, or temporary file "11" in the current container

SQL> show pdbs;

    CON_ID CON_NAME                       OPEN MODE  RESTRICTED
---------- ------------------------------ ---------- ----------
         2 PDB$SEED                       MOUNTED
         3 PDB                            MOUNTED
SQL> alter session set container=pdb;

Session altered.

SQL> alter database datafile 11 offline drop;

Database altered.

SQL> conn / as sysdba
Connected.

SQL> startup 
ORACLE instance started.

Total System Global Area 4999609088 bytes
Fixed Size                  9145088 bytes
Variable Size             905969664 bytes
Database Buffers         4076863488 bytes
Redo Buffers                7630848 bytes
Database mounted.
Database opened.

open pdb

SQL> alter session set container=pdb;

Session altered.

SQL> 
SQL> alter database open;

Database altered.

SQL> conn / as sysdba
Connected.
SQL> show pdbs;

    CON_ID CON_NAME                       OPEN MODE  RESTRICTED
---------- ------------------------------ ---------- ----------
         2 PDB$SEED                       READ ONLY  NO
         3 PDB                            READ WRITE NO

测试中库open比较简单,后续只要对异常undo进行处理即可

SQL> create undo tablespace undotbs2 datafile '/u01/app/oracle/oradata/ORA19C/pdb/undotbs02.dbf' size 128M autoextend on;

Tablespace created.

SQL> alter system set undo_tablespace=undotbs2;

System altered.

SQL> select tablespace_name,segment_name,status from dba_rollback_segs;

TABLESPACE_NAME                SEGMENT_NAME                   STATUS
------------------------------ ------------------------------ --------------------------------
SYSTEM                         SYSTEM                         ONLINE
UNDOTBS1                       _SYSSMU1_3588498444$           NEEDS RECOVERY
UNDOTBS1                       _SYSSMU2_2971032042$           NEEDS RECOVERY
UNDOTBS1                       _SYSSMU3_3657342154$           NEEDS RECOVERY
UNDOTBS1                       _SYSSMU4_811969446$            NEEDS RECOVERY
UNDOTBS1                       _SYSSMU5_3018429039$           NEEDS RECOVERY
UNDOTBS1                       _SYSSMU6_442110264$            NEEDS RECOVERY
UNDOTBS1                       _SYSSMU7_2728255665$           NEEDS RECOVERY
UNDOTBS1                       _SYSSMU8_801938064$            NEEDS RECOVERY
UNDOTBS1                       _SYSSMU9_647420285$            NEEDS RECOVERY
UNDOTBS1                       _SYSSMU10_2262159254$          NEEDS RECOVERY

SQL>  drop tablespace undotbs1 including contents and datafiles;

Tablespace dropped.

在测试中,undo有事务的情况下,数据库可以正常open,而且运行了一段时间未crash,在这个方面确实比11g及其以前版本有很大改进.当然由于测试环境本身比较单一,可能实际生产中会此类故障处理比较复杂

模拟19c数据库redo异常恢复

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:模拟19c数据库redo异常恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

对于19c在pdb情况下三种常见故障进行了模拟测试:
模拟19c数据库redo异常恢复
模拟19c数据库pdb undo异常恢复
模拟19c数据库root pdb undo异常恢复
模拟oracle 19c数据库redo丢失的恢复操作,模拟数据库有事务,在没有提交的情况下redo丢失故障

[oracle@localhost oradata]$ ss

SQL*Plus: Release 19.0.0.0.0 - Production on Mon Nov 16 16:11:16 2020
Version 19.5.0.0.0

Copyright (c) 1982, 2019, Oracle.  All rights reserved.


Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.5.0.0.0

SQL> conn xff/oracle@127.0.0.1/pdb
Connected.
SQL> create table t_xifenfei as select * from dba_objects;

Table created.

SQL> insert into t_xifenfei select * from t_xifenfei;
insert into t_xifenfei select * from t_xifenfei;
insert into t_xifenfei select * from t_xifenfei;
insert into t_xifenfei select * from t_xifenfei;
insert into t_xifenfei select * from t_xifenfei;

72351 rows created.

SQL> 
144702 rows created.

SQL> 
289404 rows created.

SQL> 
578808 rows created.

SQL> 

1157616 rows created.

另外一个会话kill数据库并且删除redo

[root@localhost ~]# ps -ef|grep pmon
oracle    38500      1  0 16:08 ?        00:00:00 ora_pmon_ora19c
root      39030  39009  0 16:11 pts/2    00:00:00 grep --color=auto pmon
[root@localhost ~]# kill -9 38500
[root@localhost ~]# ps -ef|grep pmon
root      39042  39009  0 16:11 pts/2    00:00:00 grep --color=auto pmon
[root@localhost ~]# ls -l /u01/app/oracle/oradata/ORA19C/redo*.log
ls: cannot access /u01/app/oracle/oradata/ORA19C/redo*.log: No such file or directory

启动数据库报错ORA-00313 ORA-00312 ORA-27037

SQL> startup 
ORACLE instance started.

Total System Global Area 4999609088 bytes
Fixed Size                  9145088 bytes
Variable Size             905969664 bytes
Database Buffers         4076863488 bytes
Redo Buffers                7630848 bytes
Database mounted.
ORA-00313: open failed for members of log group 3 of thread 1
ORA-00312: online log 3 thread 1: '/u01/app/oracle/oradata/ORA19C/redo03.log'
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory
Additional information: 7

因为redo全部丢失只能尝试强制拉库

SQL> startup mount pfile='/tmp/pfile'
ORACLE instance started.

Total System Global Area 4999609088 bytes
Fixed Size                  9145088 bytes
Variable Size             905969664 bytes
Database Buffers         4076863488 bytes
Redo Buffers                7630848 bytes
Database mounted.
SQL>  recover database until cancel;
ORA-00279: change 2335666 generated at 11/16/2020 16:08:42 needed for thread 1
ORA-00289: suggestion :
/u01/app/oracle/product/19.2/db_1/dbs/arch1_12_1056620100.dbf
ORA-00280: change 2335666 for thread 1 is in sequence #12


Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
cancel
ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1: '/u01/app/oracle/oradata/ORA19C/system01.dbf'


ORA-01112: media recovery not started


SQL> alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-00603: ORACLE server session terminated by fatal error
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [kcbzib_kcrsds_1], [], [], [], [],
[], [], [], [], [], [], []
Process ID: 39588
Session ID: 9 Serial number: 32012

数据库报ORA-600 kcbzib_kcrsds_1错误是由于在强制拉库过程中文件异常导致,通过对异常文件进行处理数据库open成功

SQL> recover database;
Media recovery complete.
SQL> alter database open;

Database altered.

SQL> 

SQL> alter session set container=pdb;

Session altered.

SQL> alter database open;

SQL> show pdbs;

    CON_ID CON_NAME                       OPEN MODE  RESTRICTED
---------- ------------------------------ ---------- ----------
         2 PDB$SEED                       READ ONLY  NO
         3 PDB                            READ WRITE NO

这个是模拟redo丢失或者损坏故障,在实际的生产故障中可能要比这个复杂很多.