一次 CRS-1013: ASM 磁盘组中的 OCR 位置不可访问 故障分析

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:一次 CRS-1013: ASM 磁盘组中的 OCR 位置不可访问 故障分析

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

有朋友告知我集群突然异常,让我给看看什么原因,集群alert日志

2021-03-14 21:02:15.517 [OHASD(31771)]CRS-8500: Oracle Clusterware OHASD 进程以操作系统进程 ID 31771 开头
2021-03-14 21:02:15.561 [OHASD(31771)]CRS-0714: Oracle Clusterware 发行版 12.1.0.2.0。
2021-03-14 21:02:15.619 [OHASD(31771)]CRS-2112: 已在节点 rac1 上启动 OLR 服务。
2021-03-14 21:02:15.791 [OHASD(31771)]CRS-1301: 已在节点 rac1 上启动 Oracle 高可用性服务。
2021-03-14 21:02:15.910 [OHASD(31771)]CRS-8017: 位置:/etc/oracle/lastgasp具有2个重新启动指导日志文件,0个已发布,0个出现错误
2021-03-14 21:02:16.789 [CSSDAGENT(32015)]CRS-8500: Oracle Clusterware CSSDAGENT 进程以操作系统进程 ID 32015 开头
2021-03-14 21:02:16.868 [CSSDMONITOR(32017)]CRS-8500: Oracle Clusterware CSSDMONITOR 进程以操作系统进程 ID 32017 开头
2021-03-14 21:02:17.751 [ORAROOTAGENT(32008)]CRS-8500: Oracle Clusterware ORAROOTAGENT 进程以操作系统进程 ID 32008 开头
2021-03-14 21:02:17.916 [ORAAGENT(32012)]CRS-8500: Oracle Clusterware ORAAGENT 进程以操作系统进程 ID 32012 开头
2021-03-14 21:02:18.604 [ORAAGENT(32012)]CRS-5011: 检查资源 "ora.asm" 失败: 详细资料见 "(:CLSN00006:)" 
(位于 "/u01/app/gridbase/diag/crs/rac1/crs/trace/ohasd_oraagent_grid.trc")
2021-03-14 21:02:18.969 [ORAAGENT(32117)]CRS-8500: Oracle Clusterware ORAAGENT 进程以操作系统进程 ID 32117 开头
2021-03-14 21:02:19.050 [MDNSD(32130)]CRS-8500: Oracle Clusterware MDNSD 进程以操作系统进程 ID 32130 开头
2021-03-14 21:02:19.117 [EVMD(32132)]CRS-8500: Oracle Clusterware EVMD 进程以操作系统进程 ID 32132 开头
2021-03-14 21:02:20.078 [GPNPD(32151)]CRS-8500: Oracle Clusterware GPNPD 进程以操作系统进程 ID 32151 开头
2021-03-14 21:02:21.145 [GIPCD(32172)]CRS-8500: Oracle Clusterware GIPCD 进程以操作系统进程 ID 32172 开头
2021-03-14 21:02:21.163 [GPNPD(32151)]CRS-2328: 已在节点 rac1 上启动 GPNPD。
2021-03-14 21:02:22.172 [ORAROOTAGENT(32181)]CRS-8500: Oracle Clusterware ORAROOTAGENT 进程以操作系统进程 ID 32181 开头
2021-03-14 21:02:22.339 [CLSECHO(32204)]CRS-10001: 14-Mar-21 21:02 ACFS-9391: 正在检查现有 ADVM/ACFS 安装。
2021-03-14 21:02:22.580 [CLSECHO(32209)]CRS-10001: 14-Mar-21 21:02 ACFS-9392: 正在验证操作系统的 ADVM/ACFS 安装文件。
2021-03-14 21:02:22.598 [CLSECHO(32211)]CRS-10001: 14-Mar-21 21:02 ACFS-9393: 正在验证 ASM 管理员设置。
2021-03-14 21:02:22.646 [CLSECHO(32216)]CRS-10001: 14-Mar-21 21:02 ACFS-9308: 正在加载已安装的 ADVM/ACFS 驱动程序。
2021-03-14 21:02:22.678 [CLSECHO(32219)]CRS-10001: 14-Mar-21 21:02 ACFS-9154: 正在加载 'oracleoks.ko' 驱动程序。
2021-03-14 21:02:22.809 [CLSECHO(32234)]CRS-10001: 14-Mar-21 21:02 ACFS-9154: 正在加载 'oracleadvm.ko' 驱动程序。
2021-03-14 21:02:22.892 [CLSECHO(32290)]CRS-10001: 14-Mar-21 21:02 ACFS-9154: 正在加载 'oracleacfs.ko' 驱动程序。
2021-03-14 21:02:23.054 [CLSECHO(32334)]CRS-10001: 14-Mar-21 21:02 ACFS-9327: 正在验证 ADVM/ACFS 设备。
2021-03-14 21:02:23.079 [CLSECHO(32336)]CRS-10001: 14-Mar-21 21:02 ACFS-9156: 正在检测控制设备 '/dev/asm/.asm_ctl_spec'。
2021-03-14 21:02:23.108 [CLSECHO(32340)]CRS-10001: 14-Mar-21 21:02 ACFS-9156: 正在检测控制设备 '/dev/ofsctl'。
2021-03-14 21:02:23.263 [CLSECHO(32346)]CRS-10001: 14-Mar-21 21:02 ACFS-9322: 已完成
2021-03-14 21:02:28.571 [CSSDMONITOR(32409)]CRS-8500: Oracle Clusterware CSSDMONITOR 进程以操作系统进程 ID 32409 开头
2021-03-14 21:02:28.756 [CSSDAGENT(32425)]CRS-8500: Oracle Clusterware CSSDAGENT 进程以操作系统进程 ID 32425 开头
2021-03-14 21:02:28.975 [OCSSD(32436)]CRS-8500: Oracle Clusterware OCSSD 进程以操作系统进程 ID 32436 开头
2021-03-14 21:02:30.072 [OCSSD(32436)]CRS-1713: CSSD 守护程序已在 hub 模式下启动
2021-03-14 21:02:46.185 [OCSSD(32436)]CRS-1707: 节点 rac1 (编号为 1) 的租约获取已完成
2021-03-14 21:02:47.337 [OCSSD(32436)]CRS-1605: CSSD 表决文件联机: ORCL:OCR3; 详细资料见
 /u01/app/gridbase/diag/crs/rac1/crs/trace/ocssd.trc。
2021-03-14 21:02:47.357 [OCSSD(32436)]CRS-1605: CSSD 表决文件联机: ORCL:OCR2; 详细资料见 
/u01/app/gridbase/diag/crs/rac1/crs/trace/ocssd.trc。
2021-03-14 21:02:47.365 [OCSSD(32436)]CRS-1605: CSSD 表决文件联机: ORCL:OCR1; 详细资料见 
/u01/app/gridbase/diag/crs/rac1/crs/trace/ocssd.trc。
2021-03-14 21:02:48.781 [OCSSD(32436)]CRS-1601: CSSD 重新配置完毕。活动节点为 rac1 rac2 。
2021-03-14 21:02:50.971 [OCTSSD(32591)]CRS-8500: Oracle Clusterware OCTSSD 进程以操作系统进程 ID 32591 开头
2021-03-14 21:02:51.938 [OCTSSD(32591)]CRS-2403: 主机 rac1 上的集群时间同步服务处于观察程序模式。
2021-03-14 21:02:52.140 [OCTSSD(32591)]CRS-2407: 新的集群时间同步服务引用节点为主机 rac2。
2021-03-14 21:02:52.140 [OCTSSD(32591)]CRS-2401: 已在主机 rac1 上启动了集群时间同步服务。
2021-03-14 21:02:52.167 [OCTSSD(32591)]CRS-2409: 主机 rac1 上的时钟与集群标准时间不同步。
由于集群时间同步服务正在以观察程序模式运行, 所以未采取任何操作。
2021-03-14 21:02:59.284 [ORAAGENT(32117)]CRS-5011: 检查资源 "ora.asm" 失败: 详细资料见 "(:CLSN00006:)" (
位于 "/u01/app/gridbase/diag/crs/rac1/crs/trace/ohasd_oraagent_grid.trc")
2021-03-14 21:03:01.486 [ORAAGENT(32117)]CRS-5011: 检查资源 "ora.asm" 失败: 详细资料见 "(:CLSN00006:)" (
位于 "/u01/app/gridbase/diag/crs/rac1/crs/trace/ohasd_oraagent_grid.trc")
2021-03-14 21:03:01.514 [ORAAGENT(32117)]CRS-5011: 检查资源 "ora.asm" 失败: 详细资料见 "(:CLSN00006:)" (
位于 "/u01/app/gridbase/diag/crs/rac1/crs/trace/ohasd_oraagent_grid.trc")
2021-03-14 21:03:18.163 [OCTSSD(32591)]CRS-2407: 新的集群时间同步服务引用节点为主机 rac1。
2021-03-14 21:03:19.406 [OCSSD(32436)]CRS-1625: 节点 rac2 (编号为 2) 已关闭
2021-03-14 21:03:19.419 [OCSSD(32436)]CRS-1601: CSSD 重新配置完毕。活动节点为 rac1 。
2021-03-14 21:03:24.916 [OSYSMOND(318)]CRS-8500: Oracle Clusterware OSYSMOND 进程以操作系统进程 ID 318 开头
2021-03-14 21:03:26.558 [CRSD(325)]CRS-8500: Oracle Clusterware CRSD 进程以操作系统进程 ID 325 开头
2021-03-14 21:03:27.750 [CRSD(325)]CRS-1012: 已在节点 rac1 上启动 OCR 服务。
2021-03-14 21:03:27.807 [CRSD(325)]CRS-1201: 已在节点 rac1 上启动 CRSD。
2021-03-14 21:03:28.470 [ORAAGENT(1027)]CRS-8500: Oracle Clusterware ORAAGENT 进程以操作系统进程 ID 1027 开头
2021-03-14 21:03:28.499 [ORAROOTAGENT(1031)]CRS-8500: Oracle Clusterware ORAROOTAGENT 进程以操作系统进程 ID 1031 开头
2021-03-14 21:03:28.515 [ORAAGENT(1036)]CRS-8500: Oracle Clusterware ORAAGENT 进程以操作系统进程 ID 1036 开头
2021-03-14 21:03:28.666 [ORAAGENT(1036)]CRS-5011: 检查资源 "oracledb" 失败: 详细资料见 "(:CLSN00007:)"
 (位于 "/u01/app/gridbase/diag/crs/rac1/crs/trace/crsd_oraagent_oracle.trc")
2021-03-14 21:03:30.649 [ORAAGENT(32117)]CRS-5011: 检查资源 "ora.asm" 失败: 详细资料见 "(:CLSN00006:)" 
(位于 "/u01/app/gridbase/diag/crs/rac1/crs/trace/ohasd_oraagent_grid.trc")
2021-03-14 21:03:30.718 [ORAAGENT(32117)]CRS-5011: 检查资源 "ora.asm" 失败: 详细资料见 "(:CLSN00006:)" 
(位于 "/u01/app/gridbase/diag/crs/rac1/crs/trace/ohasd_oraagent_grid.trc")
2021-03-14 21:03:30.722 [CRSD(325)]CRS-1024: 由于此节点上的 ASM 实例未处于活动状态, 此节点上的集群就绪服务终止。详细信息见 
(:PROCR00009:) (位于 /u01/app/gridbase/diag/crs/rac1/crs/trace/crsd.trc)。
2021-03-14 21:03:30.736 [ORAROOTAGENT(1031)]CRS-5822: 代理 '/u01/app/grid/12.1.0/bin/orarootagent_root' 已从服务器断开连接。
详细资料见 (:CRSAGF00117:) {0:3:3} (位于 /u01/app/gridbase/diag/crs/rac1/crs/trace/crsd_orarootagent_root.trc)。
2021-03-14 21:03:30.736 [ORAAGENT(1027)]CRS-5822: 代理 '/u01/app/grid/12.1.0/bin/oraagent_grid' 已从服务器断开连接。
详细资料见 (:CRSAGF00117:) {0:1:3} (位于 /u01/app/gridbase/diag/crs/rac1/crs/trace/crsd_oraagent_grid.trc)。
2021-03-14 21:03:30.793 [CRSD(1157)]CRS-8500: Oracle Clusterware CRSD 进程以操作系统进程 ID 1157 开头
2021-03-14 21:03:31.457 [OLOGGERD(1162)]CRS-8500: Oracle Clusterware OLOGGERD 进程以操作系统进程 ID 1162 开头
2021-03-14 21:03:31.798 [ORAAGENT(32117)]CRS-5011: 检查资源 "ora.asm" 失败: 详细资料见 "(:CLSN00006:)" 
(位于 "/u01/app/gridbase/diag/crs/rac1/crs/trace/ohasd_oraagent_grid.trc")
2021-03-14 21:03:31.823 [ORAAGENT(32117)]CRS-5011: 检查资源 "ora.asm" 失败: 详细资料见 "(:CLSN00006:)" 
(位于 "/u01/app/gridbase/diag/crs/rac1/crs/trace/ohasd_oraagent_grid.trc")
2021-03-14 21:03:40.234 [CRSD(1157)]CRS-1013: ASM 磁盘组中的 OCR 位置不可访问。
详细资料见 /u01/app/gridbase/diag/crs/rac1/crs/trace/crsd.trc。
2021-03-14 21:03:40.238 [CRSD(1157)]CRS-0804: 由于 Oracle 集群注册表错误 [PROC-26: 访问物理存储时出错
ORA-15077: 找不到提供所需磁盘组的 ASM 实例
], 集群就绪服务中止。详细资料见 (:CRSD00111:) (位于 /u01/app/gridbase/diag/crs/rac1/crs/trace/crsd.trc)。

从整个集群的启动过程看cssd,crs都起来了,然后等一会由于crs无法访问ocr磁盘组,导致异常.开始crs起来了,证明ocr磁盘组应该是mount成功过.后面看错误提示又无法访问了.根据经验以及ora.asm失败的提示,怀疑很可能是asm实例出现问题了.对于这样的情况,分析asm的alert日志是最好的方法.通过分析日志发现

Sun Mar 14 21:03:24 2021
NOTE: Instance updated compatible.asm to 12.1.0.0.0 for grp 1
Sun Mar 14 21:03:24 2021
SUCCESS: diskgroup ARCHLOG was mounted
Sun Mar 14 21:03:24 2021
NOTE: Instance updated compatible.asm to 12.1.0.0.0 for grp 2
Sun Mar 14 21:03:24 2021
SUCCESS: diskgroup DATA was mounted
Sun Mar 14 21:03:24 2021
NOTE: Instance updated compatible.asm to 12.1.0.0.0 for grp 3
Sun Mar 14 21:03:24 2021
SUCCESS: diskgroup OCR was mounted
Sun Mar 14 21:03:24 2021
NOTE: Instance updated compatible.asm to 12.1.0.0.0 for grp 5
Sun Mar 14 21:03:24 2021
SUCCESS: ALTER DISKGROUP ALL MOUNT /* asm agent call crs *//* {0:9:3} */
Sun Mar 14 21:03:24 2021
WARNING: failed to online diskgroup resource ora.ARCHLOG.dg (unable to communicate with CRSD/OHASD)
WARNING: failed to online diskgroup resource ora.DATA.dg (unable to communicate with CRSD/OHASD)
WARNING: failed to online diskgroup resource ora.OCR.dg (unable to communicate with CRSD/OHASD)
Errors in file /u01/app/gridbase/diag/asm/+asm/+ASM1/trace/+ASM1_rbal_32721.trc  (incident=123423):
ORA-00600: internal error code, arguments: [kfdAuDealloc2], [ARCHLOG], [213], [410], [0], [], [], [], [], [], [], []
Incident details in: /u01/app/gridbase/diag/asm/+asm/+ASM1/incident/incdir_123423/+ASM1_rbal_32721_i123423.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Sun Mar 14 21:03:25 2021
ERROR: An unrecoverable error has been identified in ASM metadata.
Sun Mar 14 21:03:27 2021
NOTE: [crsd.bin@rac1.schic.org (TNS V1-V3) 325] opening OCR file +OCR.255.4294967295
Starting background process ASMB
Sun Mar 14 21:03:27 2021
ASMB started with pid=28, OS id=932 
Sun Mar 14 21:03:27 2021
NOTE: ASMB registering with ASM instance as Standard client 0xffffffffffffffff (reg:3401595347) (new connection)
Sun Mar 14 21:03:27 2021
NOTE: Standard client +ASM1:+ASM:racscan registered, osid 934, mbr 0x0, asmb 932 (reg:3401595347)
Sun Mar 14 21:03:27 2021
NOTE: ASMB connected to ASM instance +ASM1 osid: 934 (Flex mode; client id 0xffffffffffffffff)
Sun Mar 14 21:03:28 2021
NOTE: AMDU dump of disk group ARCHLOG initiated at /u01/app/gridbase/diag/asm/+asm/+ASM1/incident/incdir_123423
ERROR: ORA-600 in COD recovery for diskgroup 1/0x730955f2 (ARCHLOG)
ERROR: ORA-600 thrown in RBAL for group number 1
Sun Mar 14 21:03:30 2021
Errors in file /u01/app/gridbase/diag/asm/+asm/+ASM1/trace/+ASM1_rbal_32721.trc:
ORA-00600: internal error code, arguments: [kfdAuDealloc2], [ARCHLOG], [213], [410], [0], [], [], [], [], [], [], []
Sun Mar 14 21:03:30 2021
Errors in file /u01/app/gridbase/diag/asm/+asm/+ASM1/trace/+ASM1_rbal_32721.trc:
ORA-00600: internal error code, arguments: [kfdAuDealloc2], [ARCHLOG], [213], [410], [0], [], [], [], [], [], [], []
USER (ospid: 32721): terminating the instance due to error 488
Sun Mar 14 21:03:30 2021
System state dump requested by (instance=1, osid=32721 (RBAL)), summary=[abnormal instance termination].
System State dumped to trace file /u01/app/gridbase/diag/asm/+asm/+ASM1/trace/+ASM1_diag_32691_20210314210330.trc
Sun Mar 14 21:03:30 2021
Dumping diagnostic data in directory=[cdmp_20210314210330], requested by (instance=1, osid=32721 (RBAL)), s
ummary=[abnormal instance termination].
Sun Mar 14 21:03:30 2021
Instance terminated by USER, pid = 32721

通过上述日志,果然发现ocr磁盘组先mount成功,然后asm实例由于ARCHLOG磁盘组的ORA-00600 kfdAuDealloc2错误而导致整个实例crash,从而使得ocr磁盘组无法被crs访问,从而出现了”CRS-0804: 由于 Oracle 集群注册表错误 [PROC-26: 访问物理存储时出错 ORA-15077: 找不到提供所需磁盘组的 ASM 实例], 集群就绪服务中止”这样的错误提示.进一步分析为什么archlog进程会报这个错误.

SQL> /* ASMCMD */alter diskgroup /*ASMCMD*/ "DATA" drop file '+DATA/xff/XIFENFEI.270.1040985885' 
Sun Mar 14 20:46:46 2021
SUCCESS: /* ASMCMD */alter diskgroup /*ASMCMD*/ "DATA" drop file '+DATA/xff/XIFENFEI.270.1040985885'
Sun Mar 14 20:49:24 2021
NOTE: Dropping directory '+archlog/oracledb/archivelog/2021_03_11' recursively
Sun Mar 14 20:49:24 2021
Errors in file /u01/app/gridbase/diag/asm/+asm/+ASM1/trace/+ASM1_ora_15281.trc  (incident=114015):
ORA-00600: internal error code, arguments: [kfdAuDealloc2], [ARCHLOG], [213], [410], [0], [], [], [], [], [], [], []
Incident details in: /u01/app/gridbase/diag/asm/+asm/+ASM1/incident/incdir_114015/+ASM1_ora_15281_i114015.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Sun Mar 14 20:49:24 2021
ERROR: An unrecoverable error has been identified in ASM metadata.
NOTE:AMDU dump of disk group ARCHLOG initiated at/u01/app/gridbase/diag/asm/+asm/+ASM1/incident/incdir_114015
Sun Mar 14 20:49:28 2021
Errors in file /u01/app/gridbase/diag/asm/+asm/+ASM1/trace/+ASM1_ora_15281.trc  (incident=114016):
ORA-00600: internal error code, arguments: [kfdAuDealloc2], [ARCHLOG], [213], [410], [0], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [kfdAuDealloc2], [ARCHLOG], [213], [410], [0], [], [], [], [], [], [], []

因为这个库有一个历史背景:几天前由于存储cache导致,数据库使用备份还原(还原到一个新磁盘组中,老磁盘组没有使用),今天估计是运维人员在清理老磁盘组中不要的文件,然后archlog中的归档日志的时候,清空了+archlog/oracledb/archivelog/2021_03_11中的文件,然后触发asm删除该目录的异常(异常原因估计和上次清理存储cache引起了该磁盘组的元数据异常有关).该故障的基本思路原因已经清楚:由于archlog磁盘组本身元数据库有问题,清理该磁盘组文件之后,引起该磁盘组删除空目录出发问题,从而使得整个asm 实例crash.进而引起crs异常.解决方法比较简单,因为archlog磁盘组本身已经不需要,直接dd掉磁盘头,让其启动的时候不再mount,故障解决

[grid@rac1 ~]$ crsctl status res -t
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details       
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.ARCHLOG.dg
               ONLINE  OFFLINE      rac1                     STABLE
               ONLINE  OFFLINE      rac2                     STABLE
ora.DATA.dg
               ONLINE  ONLINE       rac1                     STABLE
               ONLINE  ONLINE       rac2                     STABLE
ora.LISTENER.lsnr
               ONLINE  ONLINE       rac1                     STABLE
               ONLINE  ONLINE       rac2                     STABLE
ora.LISTENER1.lsnr
               ONLINE  ONLINE       rac1                     STABLE
               ONLINE  ONLINE       rac2                     STABLE
ora.NEWDATA.dg
               ONLINE  ONLINE       rac1                     STABLE
               ONLINE  ONLINE       rac2                     STABLE
ora.OCR.dg
               ONLINE  ONLINE       rac1                     STABLE
               ONLINE  ONLINE       rac2                     STABLE
ora.asm
               ONLINE  ONLINE       rac1                     Started,STABLE
               ONLINE  ONLINE       rac2                     Started,STABLE
ora.net1.network
               ONLINE  ONLINE       rac1                     STABLE
               ONLINE  ONLINE       rac2                     STABLE
ora.ons
               ONLINE  ONLINE       rac1                     STABLE
               ONLINE  ONLINE       rac2                     STABLE
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       rac2                     STABLE
ora.LISTENER_SCAN2.lsnr
      1        ONLINE  ONLINE       rac1                     STABLE
ora.LISTENER_SCAN3.lsnr
      1        ONLINE  ONLINE       rac1                     STABLE
ora.MGMTLSNR
      1        ONLINE  OFFLINE      rac2                     169.254.86.142 7.7.7
                                                             .1,STARTING
ora.cvu
      1        ONLINE  ONLINE       rac1                     STABLE
ora.mgmtdb
      1        ONLINE  OFFLINE                               Instance Shutdown,ST
                                                             ABLE
ora.oc4j
      1        ONLINE  OFFLINE      rac1                     STARTING
ora.xff.db
      1        ONLINE  OFFLINE      rac1                     STARTING
      2        ONLINE  OFFLINE      rac2                     STARTING
ora.rac1.vip
      1        ONLINE  ONLINE       rac1                     STABLE
ora.rac2.vip
      1        ONLINE  ONLINE       rac2                     STABLE
ora.scan1.vip
      1        ONLINE  ONLINE       rac2                     STABLE
ora.scan2.vip
      1        ONLINE  ONLINE       rac1                     STABLE
ora.scan3.vip
      1        ONLINE  ONLINE       rac1                     STABLE
--------------------------------------------------------------------------------
[grid@rac1 ~]$ 

select default$ from col$ where rowid=:1 大量解析

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:select default$ from col$ where rowid=:1 大量解析

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

在一个12.1.0.2的库的awr中发现大量47r1y8yn34jmj语句的解析
47r1y8yn34jmj


对应的完整sql为:select default$ from col$ where rowid=:1,按道理说正常的库不应该出现大量该类sql的解析,查询mos发现相关Bug 20907061 : HIGH # OF EXECUTIONS FOR RECURSIVE CALL ON COL$
20907061


DELETE FROM wri$_adv_sqlt_rtn_planWHERE task_id = :tid AND exec_name = :execution_name

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:DELETE FROM wri$_adv_sqlt_rtn_planWHERE task_id = :tid AND exec_name = :execution_name

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

数据库版本

SQL> select * from v$version;
BANNER                                                                               CON_ID
-------------------------------------------------------------------------------- ----------
Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production              0
PL/SQL Release 12.2.0.1.0 - Production                                                    0
CORE    12.2.0.1.0      Production                                                                0
TNS for Linux: Version 12.2.0.1.0 - Production                                            0
NLSRTL Version 12.2.0.1.0 - Production                                                    0

alert 日志报错

2019-08-14T11:30:15.112151+08:00
WARNING: too many parse errors, count=546 SQL hash=0x750004bb
PARSE ERROR: ospid=11550, error=933 for statement:
2019-08-14T11:30:15.112224+08:00
DELETE FROM wri$_adv_sqlt_rtn_planWHERE task_id = :tid AND exec_name = :execution_name
Additional information: hd=0x16a3e1db8 phd=0x1699bf628 flg=0x28 cisid=0 sid=0 ciuid=0 uid=0
2019-08-14T11:30:15.114628+08:00
----- PL/SQL Call Stack -----
  object      line  object
  handle    number  name
0xd0ba9890       259  type body SYS.WRI$_ADV_SQLTUNE.SUB_DELETE_EXECUTION
0x870ac548      2134  package body SYS.PRVT_ADVISOR.COMMON_DELETE_TASK
0x870ac548      7342  package body SYS.PRVT_ADVISOR.DELETE_EXPIRED_TASKS
0xc91e5518         1  anonymous block
WARNING: too many parse errors, count=646 SQL hash=0x750004bb
PARSE ERROR: ospid=11550, error=933 for statement:
2019-08-14T11:30:15.298603+08:00
DELETE FROM wri$_adv_sqlt_rtn_planWHERE task_id = :tid AND exec_name = :execution_name
Additional information: hd=0x16a3e1db8 phd=0x1699bf628 flg=0x28 cisid=0 sid=0 ciuid=0 uid=0
2019-08-14T11:30:15.298698+08:00
----- PL/SQL Call Stack -----
  object      line  object
  handle    number  name
0xd0ba9890       259  type body SYS.WRI$_ADV_SQLTUNE.SUB_DELETE_EXECUTION
0x870ac548      2134  package body SYS.PRVT_ADVISOR.COMMON_DELETE_TASK
0x870ac548      7342  package body SYS.PRVT_ADVISOR.DELETE_EXPIRED_TASKS
0xc91e5518         1  anonymous block

这里比较明显由于DELETE FROM wri$_adv_sqlt_rtn_planWHERE这条sql语法不对,导致无法解析因此报了ORA-00933错误,通过人工执行

SQL>  exec SYS.PRVT_ADVISOR.DELETE_EXPIRED_TASKS();
PL/SQL procedure successfully completed.

后台alert日志重现该错误,证明该程序本身有问题,属于oracle bug范畴,查询mos发现相关Bug 26764561 : ORA-00933 IN SYS.WRI$_ADV_SQLTUNE.SUB_DELETE_EXECUTION
26764561


可以打上对应Patch 26764561: ORA-00933 IN SYS.WRI$_ADV_SQLTUNE.SUB_DELETE_EXECUTION

回收站中有大量wri$_rcs表

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:回收站中有大量wri$_rcs表

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

在对一套Oracle 12.1.0.2的数据库巡检之时发现大量WRI$_RCS_数字_1的表在回收站中,从命名中看该表应该是Oracle某个自动任务处理后,表未被正常处理干净,遗留在回收站中.
wri$_rcs


查询mos确认是Bug 20114306 – Objects left in recyclebin after upgrade to 12.1.0.2 or with fix for bug 16851194 present – superseded (文档 ID 20114306.8)
20114306


可以尝试打上补丁21498770或者23100700然后设置_fix_control

alter system set "_fix_control"='16851194:off' ;

确认该_fix_control是否可以设置,可以查询 v$system_fix_control视图

ORA-07445 qcdlgcd

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:ORA-07445 qcdlgcd

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

SQL执行报ORA-03113错误

SQL> SELECT *
  FROM USER.TABLE_NAME t
 WHERE t.YEAR = '2019'
   AND t.UPPCODE = '51010000'
   AND  t.MONTH = '01';
  2    3    4    5
SELECT *
*
ERROR at line 1:
ORA-03113: end-of-file on communication channel
Process ID: 68389
Session ID: 2419 Serial number: 34370

数据库版本等信息

Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options
ORACLE_HOME = /oracle/product/db12cr1
System name:	Linux
Node name:	Tkfcsdb
Release:	2.6.32-431.el6.x86_64
Version:	#1 SMP Sun Nov 10 22:19:54 EST 2013
Machine:	x86_64

alert日志报错

Thu Sep 27 17:41:15 2018
Exception [type: SIGSEGV, Address not mapped to object] [ADDR:0x4] [PC:0xCDFB0C6, qcdlgcd()+70] [flags: 0x0, count: 1]
Errors in file /oracle/diag/rdbms/tst12uf/tst12uf/trace/tst12uf_ora_234433.trc  (incident=126129) (PDBNAME=PTST12UF):
ORA-07445:exception encountered:core dump[qcdlgcd()+70][SIGSEGV][ADDR:0x4][PC:0xCDFB0C6][Address not mapped to object][]
Incident details in: /oracle/diag/rdbms/tst12uf/tst12uf/incident/incdir_126129/tst12uf_ora_234433_i126129.trc
Use ADRCI or Support Workbench to package the incident.

trace文件

----- Beginning of Customized Incident Dump(s) -----
Dumping swap information
Memory (Avail / Total) = 948.73M / 63736.63M
Swap (Avail / Total) = 30025.52M /  32255.99M
Exception [type: SIGSEGV, Address not mapped to object] [ADDR:0x4] [PC:0xCDFB0C6, qcdlgcd()+70] [flags: 0x0, count: 1]
Registers:
%rax: 0x0000000000000000 %rbx: 0x000000020c3da3f0 %rcx: 0x00007fc492268a40
%rdx: 0x00007fc49239eab0 %rdi: 0x000000022cd5da20 %rsi: 0x00007fc492f2bc80
%rsp: 0x00007fffbd9c9430 %rbp: 0x00007fffbd9c94a0  %r8: 0x00000002372efcb8
 %r9: 0x000000022cd5da20 %r10: 0x00007fc49239eab0 %r11: 0x00007fc492268a40
%r12: 0x000000022cd5da20 %r13: 0x00007fc492f2bc80 %r14: 0x00007fc49239eab0
%r15: 0x00007fc492268a40 %rip: 0x000000000cdfb0c6 %efl: 0x0000000000010206
  qcdlgcd()+53 (0xcdfb0b5) mov %rdi,%r12
  qcdlgcd()+56 (0xcdfb0b8) mov 0x10(%rax),%rbx
  qcdlgcd()+60 (0xcdfb0bc) jz 0xcdfb29e
  qcdlgcd()+66 (0xcdfb0c2) mov 0x60(%r15),%rax
> qcdlgcd()+70 (0xcdfb0c6) movzwl 0x4(%rax),%r8d
  qcdlgcd()+75 (0xcdfb0cb) cmp $30,%r8d
  qcdlgcd()+79 (0xcdfb0cf) jnle 0xcdfb277
  qcdlgcd()+85 (0xcdfb0d5) pxor %xmm0,%xmm0
  qcdlgcd()+89 (0xcdfb0d9) movaps %xmm0,-0x40(%rbp)
*** 2018-09-27 17:41:15.110
dbkedDefDump(): Starting a non-incident diagnostic dump (flags=0x3, level=3, mask=0x0)
[TOC00004]
----- Current SQL Statement for this session (sql_id=52chysuyh36t4) -----
SELECT /* DS_SVC */ /*+ dynamic_sampling(0) no_sql_tune
no_monitoring optimizer_features_enable(default)
no_parallel result_cache(snapshot=3600) */
SUM(C1) FROM (SELECT /*+ qb_name("innerQuery")
NO_INDEX_FFS( "T")  */ 1 AS C1 FROM "USER"."TABLE_NAME" "T"
WHERE ("T"."MONTH"='01') AND ("T"."UPPCODE"='51010000') AND ("T"."YEAR"='2019')) innerQuery
[TOC00004-END]
[TOC00005]
----- Call Stack Trace -----
calling              call     entry                argument values in hex
location             type     point                (? means dubious value)
-------------------- -------- -------------------- ----------------------------
skdstdst()+45        call     kgdsdst()            7FC492BAB678 000000003
                                                   7FC492B8D0B0 ? 7FC492B8D1C8 ?
                                                   7FC492BAAEA8 ? 000000083 ?
ksedst()+119         call     skdstdst()           7FC492BAB678 000000001
                                                   000000001 7FC492B8D1C8 ?
                                                   7FC492BAAEA8 ? 000000083 ?
dbkedDefDump()+1119  call     ksedst()             000000001 000000001 ?
                                                   000000001 ? 7FC492B8D1C8 ?
                                                   7FC492BAAEA8 ? 000000083 ?
ksedmp()+261         call     dbkedDefDump()       000000003 000000003
                                                   000000001 ? 7FC492B8D1C8 ?
                                                   7FC492BAAEA8 ? 000000083 ?
ssexhd()+2650        call     ksedmp()             00000044F 000000003 ?
                                                   000000001 ? 000000003
                                                   7FC492BAAEA8 ? 000000083 ?
sslsshandler()+456   call     ssexhd()             202F206C69617641
                                                   3D20296C61746F54
                                                   4D33372E38343920
                                                   3633373336202F20
                                                   6177530A4D33362E 000000083 ?
__sighandler()       call     sslsshandler()       000002000 000000000 000000000
                                                   3633373336202F20 ?
                                                   6177530A4D33362E ?
                                                   000000083 ?
qcdlgcd()+70         signal   __sighandler()       22CD5DA20 7FC492F2BC80
                                                   7FC49239EAB0 7FC492268A40 ?
                                                   2372EFCB8 ? 22CD5DA20 ?
kkdlgcd()+118        call     qcdlgcd()            22CD5DA20 ? 7FC492F2BC80 ?
                                                   7FC49239EAB0 ? 7FC492268A40 ?
                                                   2372EFCB8 ? 22CD5DA20 ?
kkmfbtic()+17        call     kkdlgcd()            7FC49239EAB0 ? 7FC492268A40 ?
                                                   7FC49239EAB0 ? 7FC492268A40 ?
                                                   2372EFCB8 ? 22CD5DA20 ?
qcsgcic()+163        call     kkmfbtic()           7FC49239EAB0 ? 7FC492268A40 ?
                                                   7FC49239EAB0 ? 7FC492268A40 ?
                                                   2372EFCB8 ? 22CD5DA20 ?
kkmgkc()+147         call     qcsgcic()            22CD5DA20 ? 7FC492F2BC80 ?
                                                   7FC49239EAB0 ? 7FC492268A40 ?
                                                   000000007 22CD5DA20 ?
kokscit()+65         call     kkmgkc()             22CD5DA20 ? 7FC492F2BC80 ?
                                                   7FC49239EAB0 ? 7FC492268A40 ?
                                                   000000007 ? 22CD5DA20 ?
qkebCreateColById()  call     kokscit()            22CD5DA20 ? 7FC492F2BC80 ?
+298                                               7FC49239EAB0 ? 7FC492268A40 ?
                                                   000000007 ? 22CD5DA20 ?
qksvcGetGuardCol()+  call     qkebCreateColById()  000000002 7FC4923A4078
161                                                7FC49239EAB0 ? 000000007 ?
                                                   7FC492268A40 ? 22CD5DA20 ?
qksvcProcessVCColum  call     qksvcGetGuardCol()   7FC49239D9D8 ? 7FC4923A4078 ?
ns()+1625                                          7FC49239EAB0 ? 000000007 ?
                                                   7FC492268A40 ? 22CD5DA20 ?
qkacol()+405         call     qksvcProcessVCColum  000000000 ? 000000000 ?
                              ns()                 7FC49239EAB0 ? 000000007 ?
                                                   7FC492268A40 ? 22CD5DA20 ?
qkadrv()+933         call     qkacol()             7FC49239EAB0 ? 000000000 ?
                                                   7FC49239EAB0 ? 000000007 ?
                                                   7FC492268A40 ? 22CD5DA20 ?
opitca()+2417        call     qkadrv()             7FC4923A4078 ? 000000001 ?
                                                   7FC49239EAB0 ? 000000007 ?
                                                   7FC492268A40 ? 22CD5DA20 ?
kksFullTypeCheck()+  call     opitca()             7FC4922F4BD0 22CD5DC40
79                                                 7FFFBD9CF3B0 000000007 ?
                                                   7FC492268A40 ? 22CD5DA20 ?
rpiswu2()+2235       call     kksFullTypeCheck()   7FFFBD9CDF08 ? 22CD5DC40 ?
                                                   7FFFBD9CF3B0 ? 000000007 ?
                                                   7FC492268A40 ? 22CD5DA20 ?
kksLoadChild()+7590  call     rpiswu2()            7FFFBD9CDF08 ? 22CD5DC40 ?
                                                   7FFFBD9CF3B0 ? 000000008 ?
                                                   7FC492F5D260 ? 22CD5DA20 ?
kxsGetRuntimeLock()  call     kksLoadChild()       7FFFBD9CDF08 ? 22CD5DC40 ?
+2155                                              000000000 000000008 ?
                                                   7FC492F5D260 ? 22CD5DA20 ?
kksfbc()+14306       call     kxsGetRuntimeLock()  7FC492F2BC80 7FC4922F4BD0
                                                   7FFFBD9CF330 22D1E7C28 ?
                                                   7FC492F5D260 ? 22D1E7C28
kkspsc0()+3146       call     kksfbc()             7FC4922F4BD0 7FC4922F4BD0 ?
                                                   7FFFBD9CF330 ? 22D1E7C28 ?
                                                   7FC492F5D260 ? 22D1E7C28 ?
kksParseCursor()+11  call     kkspsc0()            7FC4928321A0 7FC49235D198
8                                                  00000015F 000000003 000000006
                                                   000000020
opiosq0()+2210       call     kksParseCursor()     7FFFBD9D0178 ? 7FC49235D198 ?
                                                   00000015F ? 000000003 ?
                                                   000000006 ? 000000020 ?
opiall0()+4530       call     opiosq0()            000000003 7FC49235D198 ?
                                                   7FC492F5D260 ? 000000020
                                                   000000000 000000020 ?
opikpr()+567         call     opiall0()            000000003 ? 000000022
                                                   7FFFBD9D0A50 000000000
                                                   000000000 ? 000000020 ?
opiodr()+1165        call     opikpr()             000000065 ? 000000022 ?
                                                   7FFFBD9D22F0 000000000 ?
                                                   000000000 ? 000000020 ?
rpidrus()+206        call     opiodr()             000000065 00000001F
                                                   7FFFBD9D22F0 ? 000000000 ?
                                                   000000000 ? 100000001
skgmstack()+144      call     rpidrus()            7FFFBD9D1B70 00000001F ?
                                                   7FC492F2BE78 000000000 ?
                                                   000000000 ? 100000001 ?
rpiswu2()+723        call     skgmstack()          7FFFBD9D1B48 ? 7FC492F2B7A0 ?
                                                   00000F618 ? 00CBFE570 ?
                                                   7FFFBD9D1B70 ? 100000001 ?
kprball()+1163       call     rpiswu2()            7FFFBD9D1B48 ? 7FC492F2B7A0 ?
                                                   00000F618 ? 000000002 ?
                                                   7FC492F5D260 ? 100000001 ?
qksdsExeStmt()+2411  call     kprball()            7FFFBD9D22F0 004000180
                                                   00000F618 ? 000000002 ?
                                                   7FC492F5D260 ? 100000001 ?
qksdsExecute()+959   call     qksdsExeStmt()       7FFFBD9D2B98 7FC49235D150
                                                   000000001 000000008
                                                   7FFFBD9D28C8 100000001 ?
kkoatVerifyEst()+30  call     qksdsExecute()       7FFFBD9D2B98 7FC49235D150 ?
85                                                 000000001 ? 000000008 ?
                                                   7FFFBD9D28C8 ? 100000001 ?
kkeAdjSingTabCard()  call     kkoatVerifyEst()     300000001 7FC4923E9DA8
+714                                               1EE909D75846 7FC49235D108
                                                   000000000 100000001 ?
kkecdn()+3803        call     kkeAdjSingTabCard()  7FC4923E9DA8 7FC4923EA328
                                                   7FFFBD9D2E30 ? 7FC49235D108 ?
                                                   000000000 ? 100000001 ?
kkotap()+13019       call     kkecdn()             7FC4923ECF08 000000003 ?
                                                   7FFFBD9D2E30 ? 7FC4923EA328 ?
                                                   000000000 ? 100000001 ?
kkoiqb()+8331        call     kkotap()             7FC492F5D260 ? 000000000 ?
                                                   000000011 000000000 000000000
                                                   100000001 ?
kkooqb()+532         call     kkoiqb()             000000000 ? 000000000
                                                   000000000 000000000 ?
                                                   000000000 ? 100000001 ?
kkoqbc()+2385        call     kkooqb()             7FC4927F8BC0 ? 000000006
                                                   000000000 ? 000000000 ?
                                                   000000000 ? 100000001 ?
apakkoqb()+182       call     kkoqbc()             7FC4927F8BC0 ? 7FC4927F8BC0 ?
                                                   000000001 000000000 ?
                                                   000000000 ? 100000001 ?
apaqbdDescendents()  call     apakkoqb()           000000000 ? 7FC4927F8BC0 ?
+488                                               21EE306B8 ? 000000000 ?
                                                   000000000 ? 100000001 ?
apadrv()+5383        call     apaqbdDescendents()  7FFFBD9DC2B0 ? 7FC4927F8BC0 ?
                                                   21EE306B8 ? 000000000 ?
                                                   000000000 ? 100000001 ?
opitca()+2106        call     apadrv()             21EE306B8 ? 7FC4927F8BC0 ?
                                                   21EE306B8 ? 000000000 ?
                                                   000000000 ? 100000001 ?
kksLoadChild()+7318  call     opitca()             7FC492845648 21EE306B8
                                                   7FFFBD9DE6F0 000000000 ?
                                                   000000000 ? 100000001 ?
kxsGetRuntimeLock()  call     kksLoadChild()       7FC492845648 ? 21EE306B8 ?
+2155                                              000000000 000000000 ?
                                                   000000000 ? 100000001 ?
kksfbc()+14306       call     kxsGetRuntimeLock()  7FC492F2BC80 7FC492845648
                                                   7FFFBD9DE670 232AE3E98 ?
                                                   000000000 ? 232AE3E98
kkspsc0()+3146       call     kksfbc()             7FC492845648 7FC492845648 ?
                                                   7FFFBD9DE670 ? 232AE3E98 ?
                                                   000000000 ? 232AE3E98 ?
kksParseCursor()+11  call     kkspsc0()            7FC492832108 7FFFBD9E0A28
8                                                  000000081 000000003 000000006
                                                   0000000A4
opiosq0()+2210       call     kksParseCursor()     7FFFBD9DF4B8 ? 7FFFBD9E0A28 ?
                                                   000000081 ? 000000003 ?
                                                   000000006 ? 0000000A4 ?
kpoal8()+1223        call     opiosq0()            000000003 7FFFBD9E0A28 ?
                                                   7FC492F5D260 ? 0000000A4
                                                   000000000 0000000A4 ?
opiodr()+1165        call     kpoal8()             00000005E 00000001F
                                                   7FFFBD9E3278 0000000A4 ?
                                                   000000000 ? 0000000A4 ?
ttcpip()+2699        call     opiodr()             00000005E 00000001F
                                                   7FFFBD9E3278 ? 0000000A4 ?
                                                   000000000 ? 100000000
opitsk()+1734        call     ttcpip()             7FC492F41070 ? 00000005E ?
                                                   7FFFBD9E3278 000000000 ?
                                                   7FFFBD9E2CD8 7FFFBD9E3484
opiino()+945         call     opitsk()             000000400 000000000
                                                   7FFFBD9E3278 ? 000000000 ?
                                                   7FFFBD9E2CD8 ? 7FFFBD9E3484 ?
opiodr()+1165        call     opiino()             00000003C 000000004
                                                   7FFFBD9E4918 000000000 ?
                                                   7FFFBD9E2CD8 ? 7FFFBD9E3484 ?
opidrv()+587         call     opiodr()             00000003C 000000004
                                                   7FFFBD9E4918 ? 000000000 ?
                                                   7FFFBD9E2CD8 ? 000000000
sou2o()+145          call     opidrv()             00000003C 000000004
                                                   7FFFBD9E4918 000000000 ?
                                                   7FFFBD9E2CD8 ? 000000000 ?
opimai_real()+154    call     sou2o()              7FFFBD9E48F0 00000003C
                                                   000000004 7FFFBD9E4918
                                                   7FFFBD9E2CD8 ? 000000000 ?
ssthrdmain()+412     call     opimai_real()        000000000 7FFFBD9E4C00
                                                   000000004 ? 7FFFBD9E4918 ?
                                                   7FFFBD9E2CD8 ? 000000000 ?
main()+236           call     ssthrdmain()         000000000 000000002
                                                   7FFFBD9E4C00 000000001
                                                   000000000 000000000 ?
__libc_start_main()  call     main()               7FFFBD9E559F 7FFFBD9E55AD
+253                                               7FFFBD9E4C00 ? 000000001 ?
                                                   000000000 ? 000000000 ?
_start()+41          call     __libc_start_main()  000BBD640 000000002
                                                   7FFFBD9E4E48 000000000 ?
                                                   000000000 ? 000000000 ?
[TOC00005-END]
[TOC00006]
--------------------- Binary Stack Dump ---------------------

解决方法
这里比较明显,我们可以发现这个sql执行报错,其实本质是由于递归执行动态采样报错而引起异常的.通过设置_fix_control来规避这个问题

SQL> alter session set "_fix_control"='14191778:0';
Session altered.
SQL> SELECT *
  FROM USER.TABLE_NAME t
 WHERE t.YEAR = '2019'
   AND t.UPPCODE = '51010000'
   AND  t.MONTH = '01';
  2    3    4    5
	ID EMPCODE		COMCODE 	     YEAR	QUARTER
---------- -------------------- -------------------- ---------- ----------
PUSHRATIO  D A CREATECODE	    UPDATECODE		 CREATEDATE
---------- - - -------------------- -------------------- -------------------
UPDATEDATE	    UPPCODE		 CLAIM		      MONTH
------------------- -------------------- -------------------- ----------
     77813 5101000166		51010702	     2019	1
50	   0 0 1777					 2018-07-18 17:44:19
		    51010000		 50		      01
     77909 8000545245		51010702	     2019	1
0	   0 0 1777		    1777		 2018-07-18 17:35:12
2018-07-18 17:41:20 51010000		 100		      01
	ID EMPCODE		COMCODE 	     YEAR	QUARTER
---------- -------------------- -------------------- ---------- ----------
PUSHRATIO  D A CREATECODE	    UPDATECODE		 CREATEDATE
---------- - - -------------------- -------------------- -------------------
UPDATEDATE	    UPPCODE		 CLAIM		      MONTH
------------------- -------------------- -------------------- ----------
     77912 5101000047		51010702	     2019	1
0	   0 0 1777		    1777		 2018-07-18 17:41:03
2018-07-18 17:44:28 51010000		 100		      01
SQL> SQL>

12.2 standby 报ORA-01110

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:12.2 standby 报ORA-01110

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

12.2备库报错

2018-06-13T19:29:00.302767+08:00
Errors in file /u01/app/oracle/diag/rdbms/xifenfeidg/xifenfei/trace/xifenfei_m000_2457.trc:
ORA-01110: data file 1: '/u01/app/oracle/oradata/xifenfei/system01.dbf'
2018-06-13T19:29:00.829861+08:00
Errors in file /u01/app/oracle/diag/rdbms/xifenfeidg/xifenfei/trace/xifenfei_m000_2457.trc:
ORA-01110: data file 2: '/u01/app/oracle/oradata/xifenfei/rich101.dbf'
2018-06-13T19:29:00.930632+08:00
Errors in file /u01/app/oracle/diag/rdbms/xifenfeidg/xifenfei/trace/xifenfei_m000_2457.trc:
ORA-01110: data file 3: '/u01/app/oracle/oradata/xifenfei/sysaux01.dbf'
2018-06-13T19:29:01.010230+08:00
Errors in file /u01/app/oracle/diag/rdbms/xifenfeidg/xifenfei/trace/xifenfei_m000_2457.trc:
ORA-01110: data file 4: '/u01/app/oracle/oradata/xifenfei/undotbs01.dbf'
2018-06-13T11:29:01.055975+00:00
Archived Log entry 5072 added for T-1.S-5020 ID 0x6a8e9d72 LAD:1
RFS[18]: Selected log 10 for T-1.S-5024 dbid 1787743346 branch 957530932
2018-06-13T19:29:01.091059+08:00
Errors in file /u01/app/oracle/diag/rdbms/xifenfeidg/xifenfei/trace/xifenfei_m000_2457.trc:
ORA-01110: data file 5: '/u01/app/oracle/oradata/xifenfei/richman01.dbf'
2018-06-13T19:29:01.172613+08:00
Errors in file /u01/app/oracle/diag/rdbms/xifenfeidg/xifenfei/trace/xifenfei_m000_2457.trc:
ORA-01110: data file 7: '/u01/app/oracle/oradata/xifenfei/users01.dbf'
2018-06-13T19:29:01.251906+08:00
Errors in file /u01/app/oracle/diag/rdbms/xifenfeidg/xifenfei/trace/xifenfei_m000_2457.trc:
ORA-01110: data file 8: '/u01/app/oracle/oradata/xifenfei/r_index01.dbf'

trace文件

*** 2018-06-13T19:29:00.282836+08:00
*** SESSION ID:(2281.15120) 2018-06-13T19:29:00.282868+08:00
*** CLIENT ID:() 2018-06-13T19:29:00.282873+08:00
*** SERVICE NAME:(SYS$BACKGROUND) 2018-06-13T19:29:00.282878+08:00
*** MODULE NAME:(MMON_SLAVE) 2018-06-13T19:29:00.282883+08:00
*** ACTION NAME:(DDE async action) 2018-06-13T19:29:00.282888+08:00
*** CLIENT DRIVER:() 2018-06-13T19:29:00.282892+08:00
========= Dump for error ORA 312 (no incident) ========
----- DDE Action: 'DB_STRUCTURE_INTEGRITY_CHECK' (Async) -----
dbkh_reactive_run_check: BEGIN
dbkh_reactive_run_check:; incident_id=0
dbkh_run_check_internal: BEGIN; check_namep=DB Structure Integrity Check, run_namep=<null>
dbkh_run_check_internal: BEGIN; timeout=0
dbkh_run_check_internal: AFTER RUN CREATE; run_id=1841
*** 2018-06-13T19:29:00.302510+08:00
DDE previous invocation failed before phase II
DDE was called in a 'No Invocation Mode'
----- Start Diag Diagnostic Dump -----
Diagnostic dump is performed due to an error in the diagfw code during error handling.
Dump error and call stack for the diagnostic dump:
*** 2018-06-13T19:29:00.302576+08:00
dbkedDefDump(): Starting a non-incident diagnostic dump (flags=0x0, level=1, mask=0x0)
----- Error Stack Dump -----
ORA-01110: data file 1: '/u01/app/oracle/oradata/xifenfei/system01.dbf'
----- SQL Statement (None) -----
Current SQL information unavailable - no cursor.
----- Call Stack Trace -----
calling              call     entry                argument values in hex
location             type     point                (? means dubious value)
-------------------- -------- -------------------- ----------------------------
ksedst()+119         call     kgdsdst()            7FFF1A0D6C68 000000002
                                                   7FFF1A0B86D0 ? 7FFF1A0B87E8 ?
                                                   000000000 000000082 ?
dbkedDefDump()+1200  call     ksedst()             000000000 000000002 ?
                                                   7FFF1A0B86D0 ? 7FFF1A0B87E8 ?
                                                   000000000 ? 000000082 ?
ksedmp()+259         call     dbkedDefDump()       000000001 000000000
                                                   7FFF1A0B86D0 ? 7FFF1A0B87E8 ?
                                                   000000000 ? 000000082 ?
dbgexExecuteIntDiag  call     ksedmp()             000000001 000000000 ?
Dmp()+1457                                         7FFF1A0B86D0 ? 7FFF1A0B87E8 ?
                                                   000000000 ? 000000082 ?
dbgeBeginInvoke()+3  call     dbgexExecuteIntDiag  7F5A00000003 7F5A99B856C0
59                            Dmp()                7FFF1A0B86D0 ? 7FFF1A0B87E8 ?
                                                   000000000 ? 000000082 ?
dbgePostErrorKGE()+  call     dbgeBeginInvoke()    7F5A99B856C0 7FFF1A0D7D20
1676                                               7FFF1A0B86D0 ? 7FFF1A0B87E8 ?
                                                   000000000 ? 000000082 ?
dbkePostKGE_kgsf()+  call     dbgePostErrorKGE()   7F5A99BC59A0 7F5A99AA0048
90                                                 000000456 7FFF1A0B87E8 ?
                                                   000000000 ? 000000082 ?
kgeade()+432         call     dbkePostKGE_kgsf()   7F5A99BC59A0 7F5A99AA0048
                                                   000000456 7FFF1A0B87E8 ?
                                                   000000000 ? 000000082 ?
kgerelv()+144        call     kgeade()             7F5A99BC59A0 ? 7F5A99BC5BE8 ?
                                                   7F5A99AA0048 ? 000000456 ?
                                                   000000000 000000000
kgerev()+36          call     kgerelv()            7F5A99BC59A0 ? 7F5A99AA0048 ?
                                                   7F5A99AA0048 ? 000000456 ?
                                                   012E79CF4 ? 000000002 ?
kserec2()+185        call     kgerev()             7F5A99BC59A0 ? 7F5A99AA0048 ?
                                                   7F5A99AA0048 ? 000000456 ?
                                                   7FFF1A0D8000 000000002 ?
kcf_record_fn()+634  call     kserec2()            7F5A99BC59A0 ? 000000000
                                                   000000001 000000001 00000002C
                                                   141E0C518
kcvvra_dfh()+5278    call     kcf_record_fn()      000000001 151622BB8 000000000
                                                   7FFF1A0DA5D8 00000002C ?
                                                   141E0C518 ?
kcidr_file_header_c  call     kcvvra_dfh()         7FFF1A0DA460 ? 7FFF1A0D9FE8 ?
heck_common()+4669                                 000000000 ? 7FFF1A0D9398
                                                   7F5A94379000 ? 000000001 ?
kcidr_file_header_a  call     kcidr_file_header_c  7F5A99A9F7A0 7F5A94379000
ll_check_common()+2           heck_common()        000000001 000000000
259                                                7F5A94379000 ? 000000000
kcidr_cross_check()  call     kcidr_file_header_a  7F5A99A9F7A0 7FFF1A0DABE4
+566                          ll_check_common()    000000001 ? 000000000 ?
                                                   7F5A94379000 ? 000000000 ?
dbkird_cross_check(  call     kcidr_cross_check()  7F5A99A9F7A0 7FFF1A0DABE4 ?
)+557                                              7F5A99BC5BE8 000000000 ?
                                                   7F5A94379000 ? 000000000 ?
dbkh_run_check_inte  call     dbkird_cross_check(  7F5A99A9F7A0 7FFF1A0DABE4 ?
rnal()+2228                   )                    7F5A99BC5BE8 ? 000000000 ?
                                                   7F5A94379000 ? 000000000 ?
dbkh_reactive_run_c  call     dbkh_run_check_inte  7FFF1A0DB970 000000000
heck()+3011                   rnal()               000000002 000000000 000000000
                                                   000000000
dbgdaAsyncReceive()  call     dbkh_reactive_run_c  7F5A99B856C0 7FFF1A0DBC90
+279                          heck()               000000002 ? 000000000 ?
                                                   000000000 ? 000000000 ?
dbgea_exec_()+1739   call     dbgdaAsyncReceive()  7F5A99B856C0 0020C0029
                                                   7FFF1A0E7CA0 7FFF1A0E7D20
                                                   000000002 000000000 ?
dbgea_exec()+621     call     dbgea_exec_()        7F5A99B856C0 7F5A94984D18
                                                   0000000E8 000000000
                                                   000000002 ? 000000000 ?
dbkea_exec()+1718    call     dbgea_exec()         7F5A99B856C0 7F5A94984D18
                                                   0000000E8 000000000
                                                   000000002 ? 000000000 ?
dbkea_slave_exec()+  call     dbkea_exec()         7F5A99B856C0 ? 7F5A94984D18 ?
518                                                0000000E8 ? 000000000 ?
                                                   000000002 ? 000000000 ?
kebm_slave_cb()+64   call     dbkea_slave_exec()   1453D7248 7F5A94984D18 ?
                                                   0000000E8 ? 000000000 ?
                                                   000000002 ? 000000000 ?
kebm_slave_main()+7  call     kebm_slave_cb()      1453D7248 ? 7F5A94984D18 ?
72                                                 0000000E8 ? 000000000 ?
                                                   000000002 ? 000000000 ?
ksvrdp_int()+2010    call     kebm_slave_main()    1453D7248 ? 1453D7248
                                                   0000000E8 ? 000000000 ?
                                                   000000002 ? 000000000 ?
opirip()+602         call     ksvrdp_int()         000000000 ? 000000000 ?
                                                   0000000E8 ? 000000000 ?
                                                   000000002 ? 000000000 ?
opidrv()+602         call     opirip()             000000032 000000004
                                                   7FFF1A0EAD98 000000000 ?
                                                   000000002 ? 000000000 ?
sou2o()+145          call     opidrv()             000000032 000000004
                                                   7FFF1A0EAD98 000000000 ?
                                                   000000002 ? 000000000 ?
opimai_real()+202    call     sou2o()              7FFF1A0EAD70 000000032
                                                   000000004 7FFF1A0EAD98
                                                   000000002 ? 000000000 ?
ssthrdmain()+417     call     opimai_real()        000000000 7FFF1A0EB080
                                                   000000004 ? 7FFF1A0EAD98 ?
                                                   000000002 ? 000000000 ?
main()+262           call     ssthrdmain()         000000000 000000003
                                                   7FFF1A0EB080 000000001
                                                   000000000 000000000 ?
__libc_start_main()  call     main()               000000000 7FFF1A0EB2B8
+245                                               7FFF1A0EB080 ? 000000001 ?
                                                   000000000 ? 000000000 ?
_start()+41          call     __libc_start_main()  000D05240 000000001
                                                   7FFF1A0EB2B8 7F5A95015C05 ?
                                                   000000000 ? 000000000 ?
--------------------- Binary Stack Dump ---------------------

BUG:24844841 – PHSB:CDB M000 REPORTED ORA-1110 ON ADG WHEN A DATAFILE IS ADDED ON PRIMARY
@ The M000 messages is a false alarm as well. It is a false alarm by DRA check
@ that doesn’t consider standby media recovery properly. Adding a file happens
@ to trigger the timing for the false alarm.
@ One way to fix this is to skip file header check if standby recovery is
@ running inside kcidr_file_header_all_check_common.
M000进程检查数据库文件头信息,由于bug原因报ORA-01110错误.

处理建议
1.打上补丁24844841
2.19.1版本修复该问题
3.重启备库,启动mgr
4.暂时忽略该问题(目前没有发现影响数据库同步)
参考:ORA-01110 For All Files In Standby Database (Doc ID 2322290.1)

UNIFORM_LOG_TIMESTAMP_FORMAT—控制alert日志时间格式

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:UNIFORM_LOG_TIMESTAMP_FORMAT—控制alert日志时间格式

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

细心的朋友可能注意到了在12.2中oracle alert日志的时间格式显示发生了改变,根据对oracle的理解,一般新特性有event或者参数进行控制的,通过分析确定是由于UNIFORM_LOG_TIMESTAMP_FORMAT进行控制的

2018-06-13T19:52:30.014659+08:00
RFS[21]: Selected log 12 for T-1.S-5589 dbid 1787743346 branch 957530932
2018-06-13T11:52:30.473314+00:00
Archived Log entry 5178 added for T-1.S-5582 ID 0x6a8e9d72 LAD:1

UNIFORM_LOG_TIMESTAMP_FORMAT修改为false

[oracle@xff ~]$ ss
SQL*Plus: Release 12.2.0.1.0 Production on Wed Jun 13 19:52:59 2018
Copyright (c) 1982, 2016, Oracle.  All rights reserved.
Connected to:
Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production
SQL> show parameter UNIFORM_LOG_TIMESTAMP_FORMAT ;
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
uniform_log_timestamp_format         boolean     TRUE
SQL> alter system set uniform_log_timestamp_format=false;
System altered.

验证UNIFORM_LOG_TIMESTAMP_FORMAT=false效果

018-06-13T19:53:16.374197+08:00
RFS[19]: Selected log 14 for T-1.S-5120 dbid 1787743346 branch 957530932
2018-06-13T11:53:16.782388+00:00
Archived Log entry 5181 added for T-1.S-5117 ID 0x6a8e9d72 LAD:1
2018-06-13T19:53:19.797345+08:00
RFS[18]: Selected log 11 for T-1.S-5121 dbid 1787743346 branch 957530932
2018-06-13T11:53:20.268621+00:00
Archived Log entry 5182 added for T-1.S-5118 ID 0x6a8e9d72 LAD:1
Wed Jun 13 19:53:35 2018
ALTER SYSTEM SET uniform_log_timestamp_format=FALSE SCOPE=BOTH;
Wed Jun 13 11:53:37 2018
Wed Jun 13 19:53:37 2018
RFS[21]: Selected log 13 for T-1.S-5596 dbid 1787743346 branch 957530932
Wed Jun 13 11:53:38 2018
Archived Log entry 5183 added for T-1.S-5589 ID 0x6a8e9d72 LAD:1

UNIFORM_LOG_TIMESTAMP_FORMAT修改为true

SQL> alter system set uniform_log_timestamp_format=true;
System altered.
SQL>

验证UNIFORM_LOG_TIMESTAMP_FORMAT=true效果

Wed Jun 13 11:54:07 2018
Media Recovery Waiting for thread 1 sequence 5122 (in transit)
Wed Jun 13 11:54:07 2018
Recovery of Online Redo Log: Thread 1 Group 10 Seq 5122 Reading mem 0
  Mem# 0: /u01/app/oracle/oradata/xifenfei/std_redo10.log
2018-06-13T19:54:23.396030+08:00
ALTER SYSTEM SET uniform_log_timestamp_format=TRUE SCOPE=BOTH;
2018-06-13T11:54:24.597999+00:00
2018-06-13T19:54:24.615045+08:00
RFS[21]: Selected log 11 for T-1.S-5601 dbid 1787743346 branch 957530932
2018-06-13T11:54:25.054848+00:00
Archived Log entry 5187 added for T-1.S-5596 ID 0x6a8e9d72 LAD:1

ORA-00700: soft internal error, arguments: [kskvmstatact: excessive swapping observed]

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:ORA-00700: soft internal error, arguments: [kskvmstatact: excessive swapping observed]

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

有一朋友数据库经常crash,让我帮忙分析和解决该问题
数据库版本

Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
PL/SQL Release 12.1.0.2.0 - Production
CORE 12.1.0.2.0 Production
TNS for Linux: Version 12.1.0.2.0 - Production
NLSRTL Version 12.1.0.2.0 - Production

alert日志报错信息

Mon Apr 23 02:14:18 2018
Process 0x0x10f33262f8 appears to be hung while dumping
Current time = 464149508, process death time = 464089392 interval = 60000
Called from location UNKNOWN:UNKNOWN
Attempting to kill process 0x0x10f33262f8 with OS pid = 30813
OSD kill succeeded for process 0x10f33262f8
Instance Critical Process (pid: 9, ospid: 30813, DBRM) died unexpectedly
Mon Apr 23 02:14:21 2018
System state dump requested by (instance=1, osid=30789 (PMON)), summary=[abnormal instance termination].
System State dumped to trace file /u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_diag_30809_20180423021421.trc
Mon Apr 23 02:14:22 2018
PMON (ospid: 30789): terminating the instance due to error 56710
Mon Apr 23 02:14:22 2018
opiodr aborting process unknown ospid (27086) as a result of ORA-1092
Mon Apr 23 02:14:28 2018
Instance terminated by PMON, pid = 30789

而在类似报错之前,一般有swap不足的报错

Mon Apr 23 02:03:54 2018
WARNING: Heavy swapping observed on system in last 5 mins.
pct of memory swapped in [2.01%] pct of memory swapped out [0.51%].
Please make sure there is no memory pressure and the SGA and PGA
are configured correctly. Look at DBRM trace file for more details.
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_dbrm_30813.trc  (incident=854536):
ORA-00700: soft internal error, arguments: [kskvmstatact: excessive swapping observed], [], [], [], [], [], []
Incident details in: /u01/app/oracle/diag/rdbms/orcl/orcl/incident/incdir_854536/orcl_dbrm_30813_i854536.trc
Mon Apr 23 02:04:02 2018
Dumping diagnostic data in directory=[cdmp_20180423020402], requested by (instance=1, osid=30813 (DBRM))

从这里报错看,由于系统内存不足,导致大量使用swap,从而引起oracle进程被kill

分析系统内存使用情况

[www.xifenfei.com@Oracle ~]$ more /proc/meminfo
MemTotal:       66109924 kB
MemFree:          359848 kB
Buffers:            9308 kB
Cached:          1848504 kB
SwapCached:       172800 kB
Active:          1060368 kB
Inactive:        1156100 kB
Active(anon):     999208 kB
Inactive(anon):  1104860 kB
Active(file):      61160 kB
Inactive(file):    51240 kB
Unevictable:           0 kB
Mlocked:               0 kB
SwapTotal:      33554428 kB
SwapFree:       30516280 kB
Dirty:                68 kB
Writeback:             0 kB
AnonPages:        190936 kB
Mapped:          1152196 kB
Shmem:           1745380 kB
Slab:              70900 kB
SReclaimable:      25640 kB
SUnreclaim:        45260 kB
KernelStack:        5728 kB
PageTables:        92488 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:    35152108 kB
Committed_AS:   70923356 kB
VmallocTotal:   34359738367 kB
VmallocUsed:      267468 kB
VmallocChunk:   34359442996 kB
HardwareCorrupted:     0 kB
AnonHugePages:     18432 kB
HugePages_Total:   30720
HugePages_Free:    30720
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
DirectMap4k:        8192 kB
DirectMap2M:     2088960 kB
DirectMap1G:    65011712 kB
[www.xifenfei.com@Oracle ~]$ free -m
             total       used       free     shared    buffers     cached
Mem:         64560      64200        359       1682          9       1801
-/+ buffers/cache:      62389       2170
Swap:        32767       2977      29790

比较明显系统总共内存64G,配置了60G大页,但是数据库没有使用该大页

数据库使用内存情况

SQL> show sga;
Total System Global Area 7.1672E+10 bytes
Fixed Size                  3719544 bytes
Variable Size            2684358280 bytes
Database Buffers         6.8719E+10 bytes
Redo Buffers              264712192 bytes

比较明显按照上述配置,一共就只有4G的空闲内存,但是oracle sga占用7G,出现大量换页是必然.错误也明显想让数据库使用大页,但是由于配置不当导致数据库无法使用大页而使用系统除大页之外的内存,从而引起系统异常.
这里也说明12c的提示有明显的改善,通过alert的错误提示基本上就可以确定是swap不足导致.

18c新特性:alter system cancel sql

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:18c新特性:alter system cancel sql

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

根据18c官方描述cancel sql功能是在18c中引起,但是实测发现在oracle 12.2中已经有了cancel sql功能,可以实现终止掉某个sql的当前sql正在执行的sql语句,而不是传统的直接kill某个会话.ALTER SYSTEM CANCEL SQL语句有四个参数分别为:
cancel_sql

--会话1
SQL> set lines 150
SQL> select * from v$version;
BANNER                                                                               CON_ID
-------------------------------------------------------------------------------- ----------
Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production              0
PL/SQL Release 12.2.0.1.0 - Production                                                    0
CORE    12.2.0.1.0      Production                                                        0
TNS for Linux: Version 12.2.0.1.0 - Production                                            0
NLSRTL Version 12.2.0.1.0 - Production                                                    0
SQL> select sid, serial# from v$session where sid in
  2  (select  sid from v$mystat where rownum=1);
       SID    SERIAL#
---------- ----------
       278       4019
SQL> create table t_xifenfei tablespace users as select * from dba_source;
Table created.
SQL> insert into t_xifenfei select * from t_xifenfei;
274132 rows created.                    <<===没有提交
SQL> select count(*)from t_xifenfei;
  COUNT(*)
----------
    548264
SQL> insert into t_xifenfei select * from t_xifenfei;
548264 rows created.     <<===没有提交
SQL> select count(*)from t_xifenfei;
  COUNT(*)
----------
   1096528
SQL> insert into t_xifenfei select * from t_xifenfei;
--会话2
SQL> select count(*)from t_xifenfei;
  COUNT(*)
----------
    274132
SQL> alter system cancel sql '278,4019';
System altered.
SQL> select count(*)from t_xifenfei;
  COUNT(*)
----------
    274132
--会话1
SQL> insert into t_xifenfei select * from t_xifenfei;
insert into t_xifenfei select * from t_xifenfei
*
ERROR at line 1:
ORA-01013: user requested cancel of current operation
SQL> select count(*)from t_xifenfei;
  COUNT(*)
----------
   1096528

这里可以看到会话1的最后一个insert被cancel,但是前面两个没有提交的insert没有被回滚/提交,看到了cancel sql的功能的实现.

重建oraInventory解决ORA-20001

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:重建oraInventory解决ORA-20001

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

数据库启动报ORA-20001: Latest xml inventory is not loaded into table错误

Completed: ALTER DATABASE OPEN
2018-01-23T23:46:27.924841+08:00
CJQ0 started with pid=54, OS id=6653
2018-01-23T23:46:31.705550+08:00
Unable to obtain current patch information due to error: 20001,
  ORA-20001: Latest xml inventory is not loaded into table
ORA-06512: at "SYS.DBMS_QOPATCH", line 777
ORA-06512: at "SYS.DBMS_QOPATCH", line 864
ORA-06512: at "SYS.DBMS_QOPATCH", line 2222
ORA-06512: at "SYS.DBMS_QOPATCH", line 740
ORA-06512: at "SYS.DBMS_QOPATCH", line 2247
===========================================================
Dumping current patch information
===========================================================
Unable to obtain current patch information due to error: 20001
===========================================================

查询相关sql报错

SYS@xffdb>select * from OPATCH_XML_INV ;
ERROR:
ORA-29913: error in executing ODCIEXTTABLEFETCH callout
ORA-29400: data cartridge error
KUP-04004: error while reading file /u03/app/oracle/product/12.2.0.1/dbhome/QOpatch/qopiprep.bat
no rows selected
Elapsed: 00:00:00.58
SYS@xffdb>select xmltransform(dbms_qopatch.get_opatch_lsinventory(), dbms_qopatch.GET_OPATCH_XSLT()) from dual ;
ERROR:
ORA-20001: Latest xml inventory is not loaded into table
ORA-06512: at "SYS.DBMS_QOPATCH", line 777
ORA-06512: at "SYS.DBMS_QOPATCH", line 864
ORA-06512: at "SYS.DBMS_QOPATCH", line 2222
ORA-06512: at "SYS.DBMS_QOPATCH", line 740
ORA-06512: at "SYS.DBMS_QOPATCH", line 2247
no rows selected
Elapsed: 00:00:00.63

datapatch -prereq报错

[oracle@xifenfei ~]$ $ORACLE_HOME/OPatch/datapatch -prereq
SQL Patching tool version 12.2.0.1.0 Production on Tue Jan 23 18:11:32 2018
Copyright (c) 2012, 2017, Oracle.  All rights reserved.
Connecting to database...OK
Note:  Datapatch will only apply or rollback SQL fixes for PDBs
       that are in an open state, no patches will be applied to closed PDBs.
       Please refer to Note: Datapatch: Database 12c Post Patch SQL Automation
       (Doc ID 1585822.1)
Queryable inventory could not determine the current opatch status.
Execute 'select dbms_sqlpatch.verify_queryable_inventory from dual'
and/or check the invocation log
/u03/app/oracle/cfgtoollogs/sqlpatch/sqlpatch_4909_2018_01_23_18_11_32/sqlpatch_invocation.log
for the complete error.
Prereq check failed, exiting without installing any patches.
Please refer to MOS Note 1609718.1 and/or the invocation log
/u03/app/oracle/cfgtoollogs/sqlpatch/sqlpatch_4909_2018_01_23_18_11_32/sqlpatch_invocation.log
for information on how to resolve the above errors.
SQL Patching tool complete on Tue Jan 23 18:11:45 2018

分析qopiprep.bat文件

cd $ORACLE_HOME
PATH=/bin:/usr/bin
export PATH
# sed tried to convert from one encoding to other in presence of LC_ALL
# or LANG settings. Since opatch returning UTF-8 based encoding we do not
# need such a conversion. So safely skip it
LANG=en_US.UTF-8
export LANG
LC_ALL=''
export LC_ALL
# Option: "-retry 0" avoids retries in case of locked inventory.
# Option: "-invPtrLoc" is required for non-central-inventory
# locations. $OPATCH_PREP_LSINV_OPTS which may set by users
# in the environment to configure special OPatch options
# ("-jdk" is another good candidate that may require configuration!).
# Option: "-all" gives information on all Oracle Homes
# installed in the central inventory.  With that information, the
# patches of non-RDBMS homes could be fetched.
DBSID=$ORACLE_SID
ORABASE=`$ORACLE_HOME/bin/orabasehome`
rm -rf $ORABASE/rdbms/log/xml_file_$DBSID.xml
$ORACLE_HOME/OPatch/opatch lsinventory -xml  $ORABASE/rdbms/log/xml_file_$DBSID.xml -retry 0 -invPtrLoc $ORACLE_HOME/oraInst.loc >> $ORABASE/rdbms/log/stout_$DBSID.txt
cat $ORABASE/rdbms/log/xml_file_$DBSID.xml | sed 's/^ *//' | tr '\n' ' '
echo "UIJSVTBOEIZBEFFQBL"
rm $ORABASE/rdbms/log/xml_file_$DBSID.xml
rm $ORABASE/rdbms/log/stout_$DBSID.txt

这里主要是$ORACLE_HOME/OPatch/opatch lsinventory可能异常,测试该功能
qopatch_log日志

[oracle@xifenfei ~]$ tail -f /u03/app/oracle/product/12.2.0.1/dbhome/QOpatch/qopatch_log.log
 LOG file opened at 01/23/18 18:48:55
KUP-05007:   Warning: Intra source concurrency disabled because the preprocessor option is being used.
Field Definitions for table OPATCH_XML_INV
  Record format DELIMITED BY NEWLINE
  Data in file has same endianness as the platform
  Reject rows with all null fields
  Fields in Data Source:
    XML_INVENTORY                   CHAR (100000000)
      Terminated by "UIJSVTBOEIZBEFFQBL"
      Trim whitespace same as SQL Loader
KUP-04004: error while reading file /u03/app/oracle/product/12.2.0.1/dbhome/QOpatch/qopiprep.bat
KUP-04017: OS message: Error 0
KUP-04017: OS message: LsInventorySession failed: RawInventory gets null OracleHomeInfo
cat: /u03/app/oracle/product/12.2.0.1/dbhome/rdbms/log/xml_file_xffdb.xml: No such file or direc
KUP-04118: operation "pipe read", location "skudmir:2"

opatch lsinventory验证

[oracle@xifenfei ~]$ /u03/app/oracle/product/12.2.0.1/dbhome/OPatch/opatch lsinventory
Oracle Interim Patch Installer version 12.2.0.1.6
Copyright (c) 2018, Oracle Corporation.  All rights reserved.
Oracle Home       : /u03/app/oracle/product/12.2.0.1/dbhome
Central Inventory : /u01/app/oraInventory
   from           : /u03/app/oracle/product/12.2.0.1/dbhome/oraInst.loc
OPatch version    : 12.2.0.1.6
OUI version       : 12.2.0.1.4
Log file location : /u03/app/oracle/product/12.2.0.1/dbhome/cfgtoollogs/opatch/opatch2018-01-23_23-50-29PM_1.log
List of Homes on this system:
  Home name= OraDB12Home1, Location= "/u01/app/oracle/product/12.2.0/dbhome_1"
LsInventorySession failed: RawInventory gets null OracleHomeInfo
OPatch failed with error code 73

现在到这一步,可以确定判断opatch lsinventory运行异常,导致DBMS_QOPATCH无法正常工作,而引起opatch异常的原因是由于RawInventory gets null OracleHomeInfo
分析inventory.xml 文件

[oracle@xifenfei ContentsXML]$ cat inventory.xml
<?xml version="1.0" standalone="yes" ?>
<!-- Copyright (c) 1999, 2016, Oracle and/or its affiliates.
All rights reserved. -->
<!-- Do not modify the contents of this file by hand. -->
<INVENTORY>
<VERSION_INFO>
   <SAVED_WITH>12.2.0.1.0</SAVED_WITH>
   <MINIMUM_VER>2.1.0.6.0</MINIMUM_VER>
</VERSION_INFO>
<HOME_LIST>
<HOME NAME="OraDB12Home1" LOC="/u01/app/oracle/product/12.2.0/dbhome_1" TYPE="O" IDX="1"/>
</HOME_LIST>
<COMPOSITEHOME_LIST>
</COMPOSITEHOME_LIST>
</INVENTORY>

因为该机器上安装过三个版本的oracle,12.2 beta,11.2.0.4,12.2.0.1,现在oracle home只有第一个beta的,因此这个部分肯定异常,导致后面的12.2正式版无法获取到oraclehome
重建oraInventory

[oracle@xifenfei app]$ cd $ORACLE_HOME/oui/bin
[oracle@xifenfei bin]$ ./runInstaller -silent -ignoreSysPrereqs -attachHome ORACLE_HOME="/u01/app/oracle/product/12.2.0/dbhome_1" ORACLE_HOME_NAME="OraDB12betaHome1"
Starting Oracle Universal Installer...
Checking swap space: must be greater than 500 MB.   Actual 3935 MB    Passed
The inventory pointer is located at /etc/oraInst.loc
'AttachHome' was successful.
[oracle@xifenfei bin]$ ./runInstaller -silent -ignoreSysPrereqs -attachHome ORACLE_HOME="/u02/app/oracle/product/11.2.0.4/dbhome" ORACLE_HOME_NAME="OraDb11g_home1"
Starting Oracle Universal Installer...
Checking swap space: must be greater than 500 MB.   Actual 3935 MB    Passed
The inventory pointer is located at /etc/oraInst.loc
'AttachHome' was successful.
[oracle@xifenfei bin]$ ./runInstaller -silent -ignoreSysPrereqs -attachHome ORACLE_HOME="/u03/app/oracle/product/12.2.0.1/dbhome" ORACLE_HOME_NAME="OraDb122g_home1"
Starting Oracle Universal Installer...
Checking swap space: must be greater than 500 MB.   Actual 3935 MB    Passed
The inventory pointer is located at /etc/oraInst.loc
'AttachHome' was successful.
--验证inventory.xml 文件
[oracle@xifenfei ContentsXML]$ cat inventory.xml
<?xml version="1.0" standalone="yes" ?>
<!-- Copyright (c) 1999, 2018, Oracle and/or its affiliates.
All rights reserved. -->
<!-- Do not modify the contents of this file by hand. -->
<INVENTORY>
<VERSION_INFO>
   <SAVED_WITH>12.2.0.1.4</SAVED_WITH>
   <MINIMUM_VER>2.1.0.6.0</MINIMUM_VER>
</VERSION_INFO>
<HOME_LIST>
<HOME NAME="OraDB12betaHome1" LOC="/u01/app/oracle/product/12.2.0/dbhome_1" TYPE="O" IDX="1"/>
<HOME NAME="OraDb11g_home1" LOC="/u02/app/oracle/product/11.2.0.4/dbhome" TYPE="O" IDX="2"/>
<HOME NAME="OraDb122g_home1" LOC="/u03/app/oracle/product/12.2.0.1/dbhome" TYPE="O" IDX="3"/>
</HOME_LIST>
<COMPOSITEHOME_LIST>
</COMPOSITEHOME_LIST>
</INVENTORY>

验证opatch lsinventory

[oracle@xifenfei bin]$ opatch lsinv
Oracle Interim Patch Installer version 12.2.0.1.6
Copyright (c) 2018, Oracle Corporation.  All rights reserved.
Oracle Home       : /u03/app/oracle/product/12.2.0.1/dbhome
Central Inventory : /u01/app/oraInventory
   from           : /u03/app/oracle/product/12.2.0.1/dbhome/oraInst.loc
OPatch version    : 12.2.0.1.6
OUI version       : 12.2.0.1.4
Log file location : /u03/app/oracle/product/12.2.0.1/dbhome/cfgtoollogs/opatch/opatch2018-01-24_00-19-55AM_1.log
Lsinventory Output file location : /u03/app/oracle/product/12.2.0.1/dbhome/cfgtoollogs/opatch/lsinv/lsinventory2018-01-24_00-19-55AM.txt
--------------------------------------------------------------------------------
Local Machine Information::
Hostname: xifenfei
ARU platform id: 226
ARU platform description:: Linux x86-64
Installed Top-level Products (1):
Oracle Database 12c                                                  12.2.0.1.0
There are 1 products installed in this Oracle Home.
There are no Interim patches installed in this Oracle Home.
--------------------------------------------------------------------------------
OPatch succeeded.

验证dbms_qopatch工作正常

[oracle@xifenfei ContentsXML]$ $ORACLE_HOME/OPatch/datapatch -prereq
SQL Patching tool version 12.2.0.1.0 Production on Wed Jan 24 00:21:48 2018
Copyright (c) 2012, 2017, Oracle.  All rights reserved.
Connecting to database...OK
Note:  Datapatch will only apply or rollback SQL fixes for PDBs
       that are in an open state, no patches will be applied to closed PDBs.
       Please refer to Note: Datapatch: Database 12c Post Patch SQL Automation
       (Doc ID 1585822.1)
Determining current state...done
Adding patches to installation queue and performing prereq checks...done
Installation queue:
  For the following PDBs: CDB$ROOT PDB$SEED
    Nothing to roll back
    Nothing to apply
SQL Patching tool complete on Wed Jan 24 00:21:55 2018
SYS@xffdb>select dbms_sqlpatch.verify_queryable_inventory from dual;
VERIFY_QUERYABLE_INVENTORY
--------------------------------------------------------------------------------------------------
OK
Elapsed: 00:00:01.03
SYS@xffdb>select xmltransform(dbms_qopatch.get_opatch_lsinventory(), dbms_qopatch.GET_OPATCH_XSLT()) from dual ;
XMLTRANSFORM(DBMS_QOPATCH.GET_OPATCH_LSINVENTORY(),DBMS_QOPATCH.GET_OPATCH_XSLT())
----------------------------------------------------------------------------------------------------------
Oracle Querayable Patch Interface 1.0
-----------------------------------------
Elapsed: 00:00:01.09
SYS@xffdb>

通过修复错误的oraInventory解决ORA-20001问题