Oracle dul支持18c

在以前的文章中已经写过oracle 原厂dul工具可以很好的支持oracle 11g,12c(Oracle dul支持Oracle 12.2(12c),dul 10支持oracle 11g r2),现在确认通过一些处理,dul也可以完美支持oracle 18c
数据库版本18c

[oracle@localhost dul]$ sqlplus / as sysdba
SQL*Plus: Release 18.0.0.0.0 Production on Fri Jun 15 14:23:00 2018
Version 18.1.0.0.0
Copyright (c) 1982, 2017, Oracle.  All rights reserved.
Connected to:
Oracle Database 18c Enterprise Edition Release 18.0.0.0.0 - Production
Version 18.1.0.0.0
SQL> select BANNER from v$version;
BANNER
--------------------------------------------------------------------------------
Oracle Database 18c Enterprise Edition Release 18.0.0.0.0 - Production
SQL> select name from v$datafile;
NAME
-------------------------------------------------------------
/u02/app/oracle/oradata/XFFDB/system01.dbf
/u02/app/oracle/oradata/XFFDB/sysaux01.dbf
/u02/app/oracle/oradata/XFFDB/undotbs01.dbf
/u02/app/oracle/oradata/XFFDB/users01.dbf
SQL> create table t_xifenfei as select * from dba_objects;
Table created.
SQL> alter system checkpoint;
System altered.
SQL> select count(*) from sys.t_xifenfei;
COUNT(*)
-------------------------------------------------------------
72870

dul加载18c字典失败
DUL: FATAL Error: File OBJ.dat和DUL: Error: string2ub8错误导致obj和TAB字典加载失败

[oracle@localhost dul]$ ./dul
Data UnLoader: 11.2.0.6.1 - Internal Only - on Fri Jun 15 12:04:17 2018
with 64-bit io functions and the decompression option
Copyright (c) 1994 2017 Bernard van Duijnen All rights reserved.
 Strictly Oracle Internal Use Only
DUL: Warning: ulimit process stack size is only 33554432
Found db_id = 946715269
Found db_name = XFFDB
DUL> show datafiles;
ts# rf# start   blocks offs open  err file name
  0   1     0   110081    0    1    0 /u02/app/oracle/oradata/XFFDB/system01.dbf
  1   3     0    79361    0    1    0 /u02/app/oracle/oradata/XFFDB/sysaux01.dbf
  2   4     0     7681    0    1    0 /u02/app/oracle/oradata/XFFDB/undotbs01.dbf
  4   7     0      641    0    1    0 /u02/app/oracle/oradata/XFFDB/users01.dbf
DUL> bootstrap;
Probing file = 1, block = 520
. unloading table                BOOTSTRAP$
DUL: Warning: block number is non zero but marked deferred trying to process it anyhow
      60 rows unloaded
Reading BOOTSTRAP.dat 60 entries loaded
Parsing Bootstrap$ contents
Generating dict.ddl for version 11
 OBJ$: segobjno 18, file 1 block 240
 TAB$: segobjno 2, tabno 1, file 1  block 144
 COL$: segobjno 2, tabno 5, file 1  block 144
 USER$: segobjno 10, tabno 1, file 1  block 208
Running generated file "@dict.ddl" to unload the dictionary tables
. unloading table                      OBJ$   72872 rows unloaded
. unloading table                      TAB$    2173 rows unloaded
. unloading table                      COL$  120489 rows unloaded
. unloading table                     USER$     123 rows unloaded
Reading USER.dat 123 entries loaded
Reading OBJ.dat
DUL: FATAL Error: File OBJ.dat, line 20174: identifier too long
[oracle@localhost dul]$ ./dul
Data UnLoader: 11.2.0.6.1 - Internal Only - on Fri Jun 15 13:46:51 2018
with 64-bit io functions and the decompression option
Copyright (c) 1994 2017 Bernard van Duijnen All rights reserved.
 Strictly Oracle Internal Use Only
DUL: Warning: Recreating file "dul.log"
DUL: Warning: ulimit process stack size is only 33554432
Reading USER.dat 123 entries loaded
Reading OBJ.dat 35926 entries loaded and sorted 35926 entries
Reading TAB.dat
DUL: Error: string2ub8(618970019642690137449563136), Conversion to number (ub8) overflowed
DUL: Error: Number conversion error in file TAB.dat, line 22
DUL: Warning: Ignoring file TAB.dat cache
Reading COL.dat
DUL: Notice: Increased the size of DC_COLUMNS from 100000 to 132768 entries
 120489 entries loaded and sorted 120489 entries
Reading BOOTSTRAP.dat 60 entries loaded
Found db_id = 946715269
Found db_name = XFFDB

通过dul修复字典之后

[oracle@localhost dul]$ ./dul
Data UnLoader: 11.2.0.6.1 - Internal Only - on Fri Jun 15 14:12:52 2018
with 64-bit io functions and the decompression option
Copyright (c) 1994 2017 Bernard van Duijnen All rights reserved.
 Strictly Oracle Internal Use Only
DUL: Warning: Recreating file "dul.log"
DUL: Warning: ulimit process stack size is only 33554432
Reading USER.dat 123 entries loaded
Reading OBJ.dat 35926 entries loaded and sorted 35926 entries
Reading TAB.dat 2155 entries loaded
Reading COL.dat
DUL: Notice: Increased the size of DC_COLUMNS from 100000 to 132768 entries
 120489 entries loaded and sorted 120489 entries
Reading TABPART.dat 299 entries loaded and sorted 299 entries
Reading TABCOMPART.dat 1 entries loaded and sorted 1 entries
Reading TABSUBPART.dat 32 entries loaded and sorted 32 entries
Reading INDPART.dat 216 entries loaded and sorted 216 entries
Reading INDCOMPART.dat 0 entries loaded and sorted 0 entries
Reading INDSUBPART.dat 0 entries loaded and sorted 0 entries
Reading IND.dat 2845 entries loaded
Reading LOB.dat 665 entries loaded
Reading ICOL.dat 4911 entries loaded
Reading COLTYPE.dat 2971 entries loaded
Reading TYPE.dat 4031 entries loaded
Reading ATTRIBUTE.dat 15856 entries loaded
Reading COLLECTION.dat
DUL: Notice: Increased the size of DC_COLLECTIONS from 1024 to 8192 entries
 1454 entries loaded
Reading BOOTSTRAP.dat 60 entries loaded
Reading LOBFRAG.dat 18 entries loaded and sorted 18 entries
Reading LOBCOMPPART.dat 0 entries loaded and sorted 0 entries
Reading UNDO.dat 21 entries loaded
Reading TS.dat 6 entries loaded
Reading PROPS.dat 42 entries loaded
Database character set is AL32UTF8
Database national character set is AL16UTF16
Found db_id = 946715269
Found db_name = XFFDB
DUL> unload table sys.t_xifenfei;
. unloading table                T_XIFENFEI   72870 rows unloaded
DUL>

由此可以看出来dul,可以比较好的支持oracle 18c数据库

12.2 standby 报ORA-01110

12.2备库报错

2018-06-13T19:29:00.302767+08:00
Errors in file /u01/app/oracle/diag/rdbms/xifenfeidg/xifenfei/trace/xifenfei_m000_2457.trc:
ORA-01110: data file 1: '/u01/app/oracle/oradata/xifenfei/system01.dbf'
2018-06-13T19:29:00.829861+08:00
Errors in file /u01/app/oracle/diag/rdbms/xifenfeidg/xifenfei/trace/xifenfei_m000_2457.trc:
ORA-01110: data file 2: '/u01/app/oracle/oradata/xifenfei/rich101.dbf'
2018-06-13T19:29:00.930632+08:00
Errors in file /u01/app/oracle/diag/rdbms/xifenfeidg/xifenfei/trace/xifenfei_m000_2457.trc:
ORA-01110: data file 3: '/u01/app/oracle/oradata/xifenfei/sysaux01.dbf'
2018-06-13T19:29:01.010230+08:00
Errors in file /u01/app/oracle/diag/rdbms/xifenfeidg/xifenfei/trace/xifenfei_m000_2457.trc:
ORA-01110: data file 4: '/u01/app/oracle/oradata/xifenfei/undotbs01.dbf'
2018-06-13T11:29:01.055975+00:00
Archived Log entry 5072 added for T-1.S-5020 ID 0x6a8e9d72 LAD:1
RFS[18]: Selected log 10 for T-1.S-5024 dbid 1787743346 branch 957530932
2018-06-13T19:29:01.091059+08:00
Errors in file /u01/app/oracle/diag/rdbms/xifenfeidg/xifenfei/trace/xifenfei_m000_2457.trc:
ORA-01110: data file 5: '/u01/app/oracle/oradata/xifenfei/richman01.dbf'
2018-06-13T19:29:01.172613+08:00
Errors in file /u01/app/oracle/diag/rdbms/xifenfeidg/xifenfei/trace/xifenfei_m000_2457.trc:
ORA-01110: data file 7: '/u01/app/oracle/oradata/xifenfei/users01.dbf'
2018-06-13T19:29:01.251906+08:00
Errors in file /u01/app/oracle/diag/rdbms/xifenfeidg/xifenfei/trace/xifenfei_m000_2457.trc:
ORA-01110: data file 8: '/u01/app/oracle/oradata/xifenfei/r_index01.dbf'

trace文件

*** 2018-06-13T19:29:00.282836+08:00
*** SESSION ID:(2281.15120) 2018-06-13T19:29:00.282868+08:00
*** CLIENT ID:() 2018-06-13T19:29:00.282873+08:00
*** SERVICE NAME:(SYS$BACKGROUND) 2018-06-13T19:29:00.282878+08:00
*** MODULE NAME:(MMON_SLAVE) 2018-06-13T19:29:00.282883+08:00
*** ACTION NAME:(DDE async action) 2018-06-13T19:29:00.282888+08:00
*** CLIENT DRIVER:() 2018-06-13T19:29:00.282892+08:00
========= Dump for error ORA 312 (no incident) ========
----- DDE Action: 'DB_STRUCTURE_INTEGRITY_CHECK' (Async) -----
dbkh_reactive_run_check: BEGIN
dbkh_reactive_run_check:; incident_id=0
dbkh_run_check_internal: BEGIN; check_namep=DB Structure Integrity Check, run_namep=<null>
dbkh_run_check_internal: BEGIN; timeout=0
dbkh_run_check_internal: AFTER RUN CREATE; run_id=1841
*** 2018-06-13T19:29:00.302510+08:00
DDE previous invocation failed before phase II
DDE was called in a 'No Invocation Mode'
----- Start Diag Diagnostic Dump -----
Diagnostic dump is performed due to an error in the diagfw code during error handling.
Dump error and call stack for the diagnostic dump:
*** 2018-06-13T19:29:00.302576+08:00
dbkedDefDump(): Starting a non-incident diagnostic dump (flags=0x0, level=1, mask=0x0)
----- Error Stack Dump -----
ORA-01110: data file 1: '/u01/app/oracle/oradata/xifenfei/system01.dbf'
----- SQL Statement (None) -----
Current SQL information unavailable - no cursor.
----- Call Stack Trace -----
calling              call     entry                argument values in hex
location             type     point                (? means dubious value)
-------------------- -------- -------------------- ----------------------------
ksedst()+119         call     kgdsdst()            7FFF1A0D6C68 000000002
                                                   7FFF1A0B86D0 ? 7FFF1A0B87E8 ?
                                                   000000000 000000082 ?
dbkedDefDump()+1200  call     ksedst()             000000000 000000002 ?
                                                   7FFF1A0B86D0 ? 7FFF1A0B87E8 ?
                                                   000000000 ? 000000082 ?
ksedmp()+259         call     dbkedDefDump()       000000001 000000000
                                                   7FFF1A0B86D0 ? 7FFF1A0B87E8 ?
                                                   000000000 ? 000000082 ?
dbgexExecuteIntDiag  call     ksedmp()             000000001 000000000 ?
Dmp()+1457                                         7FFF1A0B86D0 ? 7FFF1A0B87E8 ?
                                                   000000000 ? 000000082 ?
dbgeBeginInvoke()+3  call     dbgexExecuteIntDiag  7F5A00000003 7F5A99B856C0
59                            Dmp()                7FFF1A0B86D0 ? 7FFF1A0B87E8 ?
                                                   000000000 ? 000000082 ?
dbgePostErrorKGE()+  call     dbgeBeginInvoke()    7F5A99B856C0 7FFF1A0D7D20
1676                                               7FFF1A0B86D0 ? 7FFF1A0B87E8 ?
                                                   000000000 ? 000000082 ?
dbkePostKGE_kgsf()+  call     dbgePostErrorKGE()   7F5A99BC59A0 7F5A99AA0048
90                                                 000000456 7FFF1A0B87E8 ?
                                                   000000000 ? 000000082 ?
kgeade()+432         call     dbkePostKGE_kgsf()   7F5A99BC59A0 7F5A99AA0048
                                                   000000456 7FFF1A0B87E8 ?
                                                   000000000 ? 000000082 ?
kgerelv()+144        call     kgeade()             7F5A99BC59A0 ? 7F5A99BC5BE8 ?
                                                   7F5A99AA0048 ? 000000456 ?
                                                   000000000 000000000
kgerev()+36          call     kgerelv()            7F5A99BC59A0 ? 7F5A99AA0048 ?
                                                   7F5A99AA0048 ? 000000456 ?
                                                   012E79CF4 ? 000000002 ?
kserec2()+185        call     kgerev()             7F5A99BC59A0 ? 7F5A99AA0048 ?
                                                   7F5A99AA0048 ? 000000456 ?
                                                   7FFF1A0D8000 000000002 ?
kcf_record_fn()+634  call     kserec2()            7F5A99BC59A0 ? 000000000
                                                   000000001 000000001 00000002C
                                                   141E0C518
kcvvra_dfh()+5278    call     kcf_record_fn()      000000001 151622BB8 000000000
                                                   7FFF1A0DA5D8 00000002C ?
                                                   141E0C518 ?
kcidr_file_header_c  call     kcvvra_dfh()         7FFF1A0DA460 ? 7FFF1A0D9FE8 ?
heck_common()+4669                                 000000000 ? 7FFF1A0D9398
                                                   7F5A94379000 ? 000000001 ?
kcidr_file_header_a  call     kcidr_file_header_c  7F5A99A9F7A0 7F5A94379000
ll_check_common()+2           heck_common()        000000001 000000000
259                                                7F5A94379000 ? 000000000
kcidr_cross_check()  call     kcidr_file_header_a  7F5A99A9F7A0 7FFF1A0DABE4
+566                          ll_check_common()    000000001 ? 000000000 ?
                                                   7F5A94379000 ? 000000000 ?
dbkird_cross_check(  call     kcidr_cross_check()  7F5A99A9F7A0 7FFF1A0DABE4 ?
)+557                                              7F5A99BC5BE8 000000000 ?
                                                   7F5A94379000 ? 000000000 ?
dbkh_run_check_inte  call     dbkird_cross_check(  7F5A99A9F7A0 7FFF1A0DABE4 ?
rnal()+2228                   )                    7F5A99BC5BE8 ? 000000000 ?
                                                   7F5A94379000 ? 000000000 ?
dbkh_reactive_run_c  call     dbkh_run_check_inte  7FFF1A0DB970 000000000
heck()+3011                   rnal()               000000002 000000000 000000000
                                                   000000000
dbgdaAsyncReceive()  call     dbkh_reactive_run_c  7F5A99B856C0 7FFF1A0DBC90
+279                          heck()               000000002 ? 000000000 ?
                                                   000000000 ? 000000000 ?
dbgea_exec_()+1739   call     dbgdaAsyncReceive()  7F5A99B856C0 0020C0029
                                                   7FFF1A0E7CA0 7FFF1A0E7D20
                                                   000000002 000000000 ?
dbgea_exec()+621     call     dbgea_exec_()        7F5A99B856C0 7F5A94984D18
                                                   0000000E8 000000000
                                                   000000002 ? 000000000 ?
dbkea_exec()+1718    call     dbgea_exec()         7F5A99B856C0 7F5A94984D18
                                                   0000000E8 000000000
                                                   000000002 ? 000000000 ?
dbkea_slave_exec()+  call     dbkea_exec()         7F5A99B856C0 ? 7F5A94984D18 ?
518                                                0000000E8 ? 000000000 ?
                                                   000000002 ? 000000000 ?
kebm_slave_cb()+64   call     dbkea_slave_exec()   1453D7248 7F5A94984D18 ?
                                                   0000000E8 ? 000000000 ?
                                                   000000002 ? 000000000 ?
kebm_slave_main()+7  call     kebm_slave_cb()      1453D7248 ? 7F5A94984D18 ?
72                                                 0000000E8 ? 000000000 ?
                                                   000000002 ? 000000000 ?
ksvrdp_int()+2010    call     kebm_slave_main()    1453D7248 ? 1453D7248
                                                   0000000E8 ? 000000000 ?
                                                   000000002 ? 000000000 ?
opirip()+602         call     ksvrdp_int()         000000000 ? 000000000 ?
                                                   0000000E8 ? 000000000 ?
                                                   000000002 ? 000000000 ?
opidrv()+602         call     opirip()             000000032 000000004
                                                   7FFF1A0EAD98 000000000 ?
                                                   000000002 ? 000000000 ?
sou2o()+145          call     opidrv()             000000032 000000004
                                                   7FFF1A0EAD98 000000000 ?
                                                   000000002 ? 000000000 ?
opimai_real()+202    call     sou2o()              7FFF1A0EAD70 000000032
                                                   000000004 7FFF1A0EAD98
                                                   000000002 ? 000000000 ?
ssthrdmain()+417     call     opimai_real()        000000000 7FFF1A0EB080
                                                   000000004 ? 7FFF1A0EAD98 ?
                                                   000000002 ? 000000000 ?
main()+262           call     ssthrdmain()         000000000 000000003
                                                   7FFF1A0EB080 000000001
                                                   000000000 000000000 ?
__libc_start_main()  call     main()               000000000 7FFF1A0EB2B8
+245                                               7FFF1A0EB080 ? 000000001 ?
                                                   000000000 ? 000000000 ?
_start()+41          call     __libc_start_main()  000D05240 000000001
                                                   7FFF1A0EB2B8 7F5A95015C05 ?
                                                   000000000 ? 000000000 ?
--------------------- Binary Stack Dump ---------------------

BUG:24844841 – PHSB:CDB M000 REPORTED ORA-1110 ON ADG WHEN A DATAFILE IS ADDED ON PRIMARY
@ The M000 messages is a false alarm as well. It is a false alarm by DRA check
@ that doesn’t consider standby media recovery properly. Adding a file happens
@ to trigger the timing for the false alarm.
@ One way to fix this is to skip file header check if standby recovery is
@ running inside kcidr_file_header_all_check_common.
M000进程检查数据库文件头信息,由于bug原因报ORA-01110错误.

处理建议
1.打上补丁24844841
2.19.1版本修复该问题
3.重启备库,启动mgr
4.暂时忽略该问题(目前没有发现影响数据库同步)
参考:ORA-01110 For All Files In Standby Database (Doc ID 2322290.1)

UNIFORM_LOG_TIMESTAMP_FORMAT—控制alert日志时间格式

细心的朋友可能注意到了在12.2中oracle alert日志的时间格式显示发生了改变,根据对oracle的理解,一般新特性有event或者参数进行控制的,通过分析确定是由于UNIFORM_LOG_TIMESTAMP_FORMAT进行控制的

2018-06-13T19:52:30.014659+08:00
RFS[21]: Selected log 12 for T-1.S-5589 dbid 1787743346 branch 957530932
2018-06-13T11:52:30.473314+00:00
Archived Log entry 5178 added for T-1.S-5582 ID 0x6a8e9d72 LAD:1

UNIFORM_LOG_TIMESTAMP_FORMAT修改为false

[oracle@xff ~]$ ss
SQL*Plus: Release 12.2.0.1.0 Production on Wed Jun 13 19:52:59 2018
Copyright (c) 1982, 2016, Oracle.  All rights reserved.
Connected to:
Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production
SQL> show parameter UNIFORM_LOG_TIMESTAMP_FORMAT ;
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
uniform_log_timestamp_format         boolean     TRUE
SQL> alter system set uniform_log_timestamp_format=false;
System altered.

验证UNIFORM_LOG_TIMESTAMP_FORMAT=false效果

018-06-13T19:53:16.374197+08:00
RFS[19]: Selected log 14 for T-1.S-5120 dbid 1787743346 branch 957530932
2018-06-13T11:53:16.782388+00:00
Archived Log entry 5181 added for T-1.S-5117 ID 0x6a8e9d72 LAD:1
2018-06-13T19:53:19.797345+08:00
RFS[18]: Selected log 11 for T-1.S-5121 dbid 1787743346 branch 957530932
2018-06-13T11:53:20.268621+00:00
Archived Log entry 5182 added for T-1.S-5118 ID 0x6a8e9d72 LAD:1
Wed Jun 13 19:53:35 2018
ALTER SYSTEM SET uniform_log_timestamp_format=FALSE SCOPE=BOTH;
Wed Jun 13 11:53:37 2018
Wed Jun 13 19:53:37 2018
RFS[21]: Selected log 13 for T-1.S-5596 dbid 1787743346 branch 957530932
Wed Jun 13 11:53:38 2018
Archived Log entry 5183 added for T-1.S-5589 ID 0x6a8e9d72 LAD:1

UNIFORM_LOG_TIMESTAMP_FORMAT修改为true

SQL> alter system set uniform_log_timestamp_format=true;
System altered.
SQL>

验证UNIFORM_LOG_TIMESTAMP_FORMAT=true效果

Wed Jun 13 11:54:07 2018
Media Recovery Waiting for thread 1 sequence 5122 (in transit)
Wed Jun 13 11:54:07 2018
Recovery of Online Redo Log: Thread 1 Group 10 Seq 5122 Reading mem 0
  Mem# 0: /u01/app/oracle/oradata/xifenfei/std_redo10.log
2018-06-13T19:54:23.396030+08:00
ALTER SYSTEM SET uniform_log_timestamp_format=TRUE SCOPE=BOTH;
2018-06-13T11:54:24.597999+00:00
2018-06-13T19:54:24.615045+08:00
RFS[21]: Selected log 11 for T-1.S-5601 dbid 1787743346 branch 957530932
2018-06-13T11:54:25.054848+00:00
Archived Log entry 5187 added for T-1.S-5596 ID 0x6a8e9d72 LAD:1

ORA-604 ORA-607 ORA-600

有客户数据库经过一系列恢复之后,出现ORA-604 ORA-607 ORA-600的错误,尝试各种方法无法打开,希望我们介入处理

Sat May 12 21:18:56 2018
SMON: enabling cache recovery
Sat May 12 21:18:57 2018
Errors in file d:\oracle\admin\xifenfei\udump\xifenfei_ora_3448.trc:
ORA-00600: 内部错误代码,参数: [4194], [34], [28], [], [], [], [], []
Sat May 12 21:19:00 2018
Recovery of Online Redo Log: Thread 1 Group 1 Seq 644 Reading mem 0
  Mem# 0 errs 0: D:\ORACLE\ORADATA\xifenfei\REDO01.LOG
Recovery of Online Redo Log: Thread 1 Group 1 Seq 644 Reading mem 0
  Mem# 0 errs 0: D:\ORACLE\ORADATA\xifenfei\REDO01.LOG
Sat May 12 21:19:01 2018
Errors in file d:\oracle\admin\xifenfei\udump\xifenfei_ora_3448.trc:
ORA-00604: 递归 SQL 层 1 出现错误
ORA-00607: 当更改数据块时出现内部错误
ORA-00600: 内部错误代码,参数: [4194], [34], [28], [], [], [], [], []
Error 604 happened during db open, shutting down database
USER: terminating instance due to error 604
Sat May 12 21:19:02 2018
Errors in file d:\oracle\admin\xifenfei\bdump\xifenfei_pmon_1840.trc:
ORA-00604: error occurred at recursive SQL level
Instance terminated by USER, pid = 3448
ORA-1092 signalled during: alter database open...

ORA-600 4194 trace文件
分析trace文件,确定ORA-600 4194对应的sql语句为update undo$ set name=:2……

*** 2018-05-12 21:18:57.000
ksedmp: internal or fatal error
ORA-00600: 内部错误代码,参数: [4194], [34], [28], [], [], [], [], []
Current SQL statement for this session:
update undo$ set name=:2,file#=:3,block#=:4,status$=:5,user#=:6,undosqn=:7,
xactsqn=:8,scnbas=:9,scnwrp=:10,inst#=:11,ts#=:12,spare1=:13 where us#=:1
----- Call Stack Trace -----
calling              call     entry                argument values in hex
location             type     point                (? means dubious value)
-------------------- -------- -------------------- ----------------------------
_ksedmp+147          CALLrel  _ksedst+0
_ksfdmp.108+e        CALLrel  _ksedmp+0            3
_kgeriv+89           CALLreg  00000000             212778 3
_kseipre.107+3f      CALLrel  _kgeriv+0
_ksesic2+24          CALLrel  _kseipre.107+0
__VInfreq__kturdb+8  CALLrel  _ksesic2+0           1062 0 22 0 1C
b
_kcoapl+1df          CALLreg  00000000             30A0F94 30A100A 11 6BB7E014
_kcbapl+71           CALLrel  _kcoapl+0            30A0F90 6BB7E000 1 0 2000
_kcrfwr+734          CALLrel  _kcbapl+0            30A0F90 6BBFC844 3014F9C
_kcbchg1+7ec         CALLrel  _kcrfwr+0
_ktuchg+630          CALLrel  _kcbchg1+0           0 4 3015224 301523C 0 0
_ktbchg2+75          CALLrel  _ktuchg+0            2 672D50F4 1 310DCD8 310DCE0
                                                   30A0F90 310D2F0 30A0ED0 0 0
_kddchg+18f          CALLrel  _ktbchg2+0           0 672D50F4 310DCD8 310DCE0
                                                   30A0F90 310D2E8 30A0ED0 0 0
_kduovw.53+6e3       CALLrel  _kddchg+0            310D2AC 310DCD8 310DCE0
                                                   30A0F90 30A0ED0 0 0
_kduurp.53+61a       CALLrel  _kduovw.53+0         310D2AC
_kdusru+aa5          CALLrel  _kduurp.53+0         310D2AC 672D514C
_kauupd+12e          CALLrel  _kdusru+0            310D6E0 672D514C 310D2AC 0
_updrow+729          CALLrel  _kauupd+0            310D6DC 672D514C 310D2AC 0
                                                   672D539C E F 672E4E48 12
                                                   301BBA0 301BBA4
_qerupFetch+107      CALLrel  _updrow+0
_updaul+202          CALL???  00000000             672DE9C4 0 672EC040 7FFF
_updThreePhaseExe+b  CALLrel  _updaul+0            672EBDD4 301BD30 0
6
_updexe+105          CALLrel  _updThreePhaseExe+0  672EBDD4 0 310D2AC 301BE0C
                                                   672EBDD4 1 301BE0C 0
_opiexe+f97          CALLrel  _updexe+0            672EBDD4 301BF48
_opiodr+4cd          CALLreg  00000000             4 3 301C894
_rpidrus.43+99       CALLrel  _opiodr+0            4 3 301C894 B
_skgmstack+71        CALLreg  00000000             301C484
_rpidru+6d           CALLrel  _skgmstack+0         301C49C 212600 F618 778198
                                                   301C484
_rpiswu2+17e         CALLreg  00000000             301C7BC
_rpidrv+109          CALLrel  _rpiswu2+0
_rpiexe+33           CALLrel  _rpidrv+0            B 4 301C894 8
_ktuscu+2a8          CALLrel  _rpiexe+0            B
_kqrcmt+2c2          CALL???  00000000             672EA898 3
..1.18_2.filter.95+  CALLrel  _kqrcmt+0            67B9C5F4 1 0 212778 212778 FF
159                                                0 0 0
..1.23_5.filter.99+  CALLrel  _ktcrcm+0            67B9C5F4 0 0 0 0 1 0 0
14d
_ktuini+64           CALLrel  _ktuiup.99+0         301D990
_adbdrv+2665         CALLrel  _ktuini+0            301D990
..1.5_1.filter.29+2  CALLrel  _adbdrv+0
9d
_opiosq0+9a4         CALLrel  _opiexe+0            4 0 301DDD8
_kpooprx+c6          CALLrel  _opiosq0+0           3 E 301DE70 24
_kpoal8+225          CALLrel  _kpooprx+0           301E738 301E680 13 1 0 24
_opiodr+4cd          CALLreg  00000000             5E 14 301E734
_ttcpip+a86          CALLreg  00000000             5E 14 301E734 0
_opitsk+2f4          CALLrel  _ttcpip+0
_opiino+5fc          CALLrel  _opitsk+0            0 0 2188C8 30CF020 E6 0
_opiodr+4cd          CALLreg  00000000             3C 4 301FBD4
_opidrv+233          CALLrel  _opiodr+0            3C 4 301FBD4 0
_sou2o+19            CALLrel  _opidrv+0
_opimai+10a          CALLrel  _sou2o+0
_OracleThreadStart@  CALLrel  _opimai+0
4+35c
7C80B726             CALLreg  00000000
--------------------- Binary Stack Dump ---------------------

进一步分析确定为system rollback segment header 异常

Block image after block recovery:
buffer tsn: 0 rdba: 0x00400009 (1/9)
scn: 0x0000.d794070f seq: 0x01 flg: 0x04 tail: 0x070f0e01
frmt: 0x02 chkval: 0x2320 type: 0x0e=KTU UNDO HEADER W/UNLIMITED EXTENTS
  Extent Control Header
  -----------------------------------------------------------------
  Extent Header:: spare1: 0      spare2: 0      #extents: 6      #blocks: 47
                  last map  0x00000000  #maps: 0      offset: 4128
      Highwater::  0x00400183  ext#: 2      blk#: 2      ext size: 8
  #blocks in seg. hdr's freelists: 0
  #blocks below: 0
  mapblk  0x00000000  offset: 2
                   Unlocked
     Map Header:: next  0x00000000  #extents: 6    obj#: 0      flag: 0x40000000
  Extent Map
  -----------------------------------------------------------------
   0x0040000a  length: 7
   0x00400011  length: 8
   0x00400181  length: 8
   0x00400189  length: 8
   0x00400191  length: 8
   0x00400199  length: 8
  TRN CTL:: seq: 0x0056 chd: 0x0054 ctl: 0x0052 inc: 0x00000000 nfb: 0x0001
            mgc: 0x8002 xts: 0x0068 flg: 0x0001 opt: 2147483646 (0x7ffffffe)
            uba: 0x00400183.0056.1b scn: 0x0000.d77996ab
Version: 0x01
  FREE BLOCK POOL::
    uba: 0x00400183.0056.1b ext: 0x2  spc: 0x794
    uba: 0x00000000.002f.21 ext: 0x5  spc: 0x1334
    uba: 0x00000000.002e.37 ext: 0x4  spc: 0x788
    uba: 0x00000000.0000.00 ext: 0x0  spc: 0x0
    uba: 0x00000000.0000.00 ext: 0x0  spc: 0x0
…………
ORA-00604: 递归 SQL 层 1 出现错误
ORA-00607: 当更改数据块时出现内部错误
ORA-00600: 内部错误代码,参数: [4194], [34], [28], [], [], [], [], []

这种问题需要通过通过bbed/ue修改ktuxc的相关内容,实现数据库open成功,可以参考另外几篇文章:
使用bbed解决ORA-00607/ORA-00600[4194]故障
通过bbed模拟ORA-00607/ORA-00600 4194 故障
ORA-607/ORA-600[4194]不一定是重大灾难
数据库报ORA-00607/ORA-00600[4194]错误

Automatic datafile offline due to write error on

由于存储突然掉线导致数据文件无法访问,导致部分数据文件被自动offline

Thu May 17 14:49:03 2018
KCF: read, write or open error, block=0xe93b8 online=1
Thu May 17 14:49:03 2018
KCF: read, write or open error, block=0x24eb65 online=1
        file=25 'F:\ORACLE\ORADATA\ORCL\QYSCZH12.ORA'
        file=28 'F:\ORACLE\ORADATA\ORCL\QYSCZH15.ORA'
        error=27070 txt: 'OSD-04016: 异步 I/O 请求排队时出错。
O/S-Error: (OS 2) 系统找不到指定的文件。'
Automatic datafile offline due to write error on
file 25: F:\ORACLE\ORADATA\ORCL\QYSCZH12.ORA
Thu May 17 14:49:03 2018
KCF: read, write or open error, block=0x22b0a1 online=1
        file=28 'F:\ORACLE\ORADATA\ORCL\QYSCZH15.ORA'
        error=27070 txt: 'OSD-04016: 异步 I/O 请求排队时出错。
O/S-Error: (OS 2) 系统找不到指定的文件。'
Automatic datafile offline due to write error on
file 28: F:\ORACLE\ORADATA\ORCL\QYSCZH15.ORA
Thu May 17 14:49:03 2018
KCF: read, write or open error, block=0x138def online=1
        file=11 'F:\ORACLE\ORADATA\ORCL\QYSCZH4'
        error=27070 txt: 'OSD-04016: 异步 I/O 请求排队时出错。
O/S-Error: (OS 2) 系统找不到指定的文件。'
        file=30 'F:\ORACLE\ORADATA\ORCL\QYSCZH17.ORA'
        file=11 'F:\ORACLE\ORADATA\ORCL\QYSCZH4'
        error=27070 txt: 'OSD-04016: 异步 I/O 请求排队时出错。
        error=27070 txt: 'OSD-04016: 异步 I/O 请求排队时出错。
O/S-Error: (OS 2) 系统找不到指定的文件。'
O/S-Error: (OS 2) 系统找不到指定的文件。'
……
        file=15 'F:\ORACLE\ORADATA\ORCL\QYSCZH6.ORA'
        error=27072 txt: 'OSD-04008: WriteFile() 失败, 无法写入文件
O/S-Error: (OS 21) 设备未就绪。'
Automatic datafile offline due to write error on
file 15: F:\ORACLE\ORADATA\ORCL\QYSCZH6.ORA
KCF: read, write or open error, block=0xade96 online=1
        file=9 'F:\ORACLE\ORADATA\ORCL\QYSCZH2'
        error=27072 txt: 'OSD-04008: WriteFile() 失败, 无法写入文件
O/S-Error: (OS 21) 设备未就绪。'
Automatic datafile offline due to write error on
file 9: F:\ORACLE\ORADATA\ORCL\QYSCZH2
Thu May 17 14:49:28 2018
KCF: read, write or open error, block=0x378c66 online=1
        file=15 'F:\ORACLE\ORADATA\ORCL\QYSCZH6.ORA'
        error=27072 txt: 'OSD-04008: WriteFile() 失败, 无法写入文件
O/S-Error: (OS 21) 设备未就绪。'
Automatic datafile offline due to write error on
file 15: F:\ORACLE\ORADATA\ORCL\QYSCZH6.ORA
KCF: read, write or open error, block=0x35f6de online=1
……

存储掉线是悲剧的起点,按理说数据库是归档模式,存储恢复之后,继续recover datafile,然后online应该问题不大,但是由于客户没有及时处理这个问题(也许业务实时性要求不高,可能挂几个小时也没人知道),导致第二个悲剧发生,删除归档的定时任务把数据库的归档日志给删除了.导致后面存储挂载上来之后,数据文件也无法正常online成功

Tue May 22 16:28:13 2018
ALTER DATABASE RECOVER  datafile 'F:\ORACLE\ORADATA\ORCL\QYSCZH'
Media Recovery Start
Serial Media Recovery started
ORA-279 signalled during: ALTER DATABASE RECOVER  datafile 'F:\ORACLE\ORADATA\ORCL\QYSCZH'  ...
Tue May 22 16:28:42 2018
ALTER DATABASE RECOVER    CONTINUE DEFAULT
Media Recovery Log D:\ORALCE\ADMINISTRATOR\FLASH_RECOVERY_AREA\ORCL\ARCHIVELOG\2018_05_22\O1_MF_1_267346_%U_.ARC
Errors with log D:\ORALCE\ADMINISTRATOR\FLASH_RECOVERY_AREA\ORCL\ARCHIVELOG\2018_05_22\O1_MF_1_267346_%U_.ARC
ORA-308 signalled during: ALTER DATABASE RECOVER    CONTINUE DEFAULT  ...
ALTER DATABASE RECOVER    CONTINUE DEFAULT
Media Recovery Log D:\ORALCE\ADMINISTRATOR\FLASH_RECOVERY_AREA\ORCL\ARCHIVELOG\2018_05_22\O1_MF_1_267346_%U_.ARC
Errors with log D:\ORALCE\ADMINISTRATOR\FLASH_RECOVERY_AREA\ORCL\ARCHIVELOG\2018_05_22\O1_MF_1_267346_%U_.ARC
ORA-308 signalled during: ALTER DATABASE RECOVER    CONTINUE DEFAULT  ...

通过Oracle数据库异常恢复检查脚本(Oracle Database Recovery Check)脚本检测发现结果如下:
recover


遭遇这种情况,常规方法无法恢复,考虑使用bbed或者其他方法强制online文件,由于存储突然掉线,这样恢复的库可能后续还有大量工作需要处理,最常见的可能有表和index不一致,表的segment header信息和extent实际信息不匹配等

ORA-00700: soft internal error, arguments: [kskvmstatact: excessive swapping observed]

有一朋友数据库经常crash,让我帮忙分析和解决该问题
数据库版本

Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
PL/SQL Release 12.1.0.2.0 - Production
CORE 12.1.0.2.0 Production
TNS for Linux: Version 12.1.0.2.0 - Production
NLSRTL Version 12.1.0.2.0 - Production

alert日志报错信息

Mon Apr 23 02:14:18 2018
Process 0x0x10f33262f8 appears to be hung while dumping
Current time = 464149508, process death time = 464089392 interval = 60000
Called from location UNKNOWN:UNKNOWN
Attempting to kill process 0x0x10f33262f8 with OS pid = 30813
OSD kill succeeded for process 0x10f33262f8
Instance Critical Process (pid: 9, ospid: 30813, DBRM) died unexpectedly
Mon Apr 23 02:14:21 2018
System state dump requested by (instance=1, osid=30789 (PMON)), summary=[abnormal instance termination].
System State dumped to trace file /u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_diag_30809_20180423021421.trc
Mon Apr 23 02:14:22 2018
PMON (ospid: 30789): terminating the instance due to error 56710
Mon Apr 23 02:14:22 2018
opiodr aborting process unknown ospid (27086) as a result of ORA-1092
Mon Apr 23 02:14:28 2018
Instance terminated by PMON, pid = 30789

而在类似报错之前,一般有swap不足的报错

Mon Apr 23 02:03:54 2018
WARNING: Heavy swapping observed on system in last 5 mins.
pct of memory swapped in [2.01%] pct of memory swapped out [0.51%].
Please make sure there is no memory pressure and the SGA and PGA
are configured correctly. Look at DBRM trace file for more details.
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_dbrm_30813.trc  (incident=854536):
ORA-00700: soft internal error, arguments: [kskvmstatact: excessive swapping observed], [], [], [], [], [], []
Incident details in: /u01/app/oracle/diag/rdbms/orcl/orcl/incident/incdir_854536/orcl_dbrm_30813_i854536.trc
Mon Apr 23 02:04:02 2018
Dumping diagnostic data in directory=[cdmp_20180423020402], requested by (instance=1, osid=30813 (DBRM))

从这里报错看,由于系统内存不足,导致大量使用swap,从而引起oracle进程被kill

分析系统内存使用情况

[www.xifenfei.com@Oracle ~]$ more /proc/meminfo
MemTotal:       66109924 kB
MemFree:          359848 kB
Buffers:            9308 kB
Cached:          1848504 kB
SwapCached:       172800 kB
Active:          1060368 kB
Inactive:        1156100 kB
Active(anon):     999208 kB
Inactive(anon):  1104860 kB
Active(file):      61160 kB
Inactive(file):    51240 kB
Unevictable:           0 kB
Mlocked:               0 kB
SwapTotal:      33554428 kB
SwapFree:       30516280 kB
Dirty:                68 kB
Writeback:             0 kB
AnonPages:        190936 kB
Mapped:          1152196 kB
Shmem:           1745380 kB
Slab:              70900 kB
SReclaimable:      25640 kB
SUnreclaim:        45260 kB
KernelStack:        5728 kB
PageTables:        92488 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:    35152108 kB
Committed_AS:   70923356 kB
VmallocTotal:   34359738367 kB
VmallocUsed:      267468 kB
VmallocChunk:   34359442996 kB
HardwareCorrupted:     0 kB
AnonHugePages:     18432 kB
HugePages_Total:   30720
HugePages_Free:    30720
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
DirectMap4k:        8192 kB
DirectMap2M:     2088960 kB
DirectMap1G:    65011712 kB
[www.xifenfei.com@Oracle ~]$ free -m
             total       used       free     shared    buffers     cached
Mem:         64560      64200        359       1682          9       1801
-/+ buffers/cache:      62389       2170
Swap:        32767       2977      29790

比较明显系统总共内存64G,配置了60G大页,但是数据库没有使用该大页

数据库使用内存情况

SQL> show sga;
Total System Global Area 7.1672E+10 bytes
Fixed Size                  3719544 bytes
Variable Size            2684358280 bytes
Database Buffers         6.8719E+10 bytes
Redo Buffers              264712192 bytes

比较明显按照上述配置,一共就只有4G的空闲内存,但是oracle sga占用7G,出现大量换页是必然.错误也明显想让数据库使用大页,但是由于配置不当导致数据库无法使用大页而使用系统除大页之外的内存,从而引起系统异常.
这里也说明12c的提示有明显的改善,通过alert的错误提示基本上就可以确定是swap不足导致.

ORA-600 17182导致oracle异常

正常运行的数据库突然爆ORA-00600 17182,然后直接crash,以前遇到过类似的case:分享一例由于主库逻辑坏块导致dataguard容灾失效,这又是一例数据库正常crash之后无法启动成功的case

Tue May 22 08:32:12 2018
Archived Log entry 84344 added for thread 1 sequence 90196 ID 0x103430df dest 1:
Tue May 22 09:05:42 2018
Thread 1 cannot allocate new log, sequence 90198
Private strand flush not complete
  Current log# 4 seq# 90197 mem# 0: +DATA/xifenfei/onlinelog/group_4.279.887464919
Thread 1 advanced to log sequence 90198 (LGWR switch)
  Current log# 2 seq# 90198 mem# 0: +DATA/xifenfei/onlinelog/group_2.263.887465041
Tue May 22 09:05:46 2018
Archived Log entry 84345 added for thread 1 sequence 90197 ID 0x103430df dest 1:
Tue May 22 09:07:42 2018
Errors in file /u01/app/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_ora_16297.trc  (incident=592822):
ORA-00600: 内部错误代码, 参数: [17182], [0x7FE274EADBF8], [], [], [], [], [], [], [], [], [], []
Incident details in: /u01/app/diag/rdbms/xifenfei/xifenfei/incident/incdir_592822/xifenfei_ora_16297_i592822.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Tue May 22 09:07:45 2018
Dumping diagnostic data in directory=[cdmp_20180522090745], requested by (instance=1, osid=16297), summary=[incident=592822].
Tue May 22 09:07:46 2018
Sweep [inc][592822]: completed
Sweep [inc2][592822]: completed
Tue May 22 09:08:29 2018
Errors in file /u01/app/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_ora_16297.trc  (incident=592824):
ORA-07445: 出现异常错误: 核心转储 [kghrcdepth()+168] [SIGSEGV] [ADDR:0x7FE2766ADD04] [PC:0x2C2B886] [Invalid permissions for mapped object] []
ORA-00600: 内部错误代码, 参数: [kghrcdepth:ds], [0x7FE274EADBE8], [], [], [], [], [], [], [], [], [], []
ORA-00600: 内部错误代码, 参数: [17182], [0x7FE274EADBF8], [], [], [], [], [], [], [], [], [], []
Incident details in: /u01/app/diag/rdbms/xifenfei/xifenfei/incident/incdir_592824/xifenfei_ora_16297_i592824.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Tue May 22 09:08:31 2018
Dumping diagnostic data in directory=[cdmp_20180522090831], requested by (instance=1, osid=16297), summary=[incident=592823].
Tue May 22 09:08:44 2018
Block recovery from logseq 90198, block 37639 to scn 161030804187
Recovery of Online Redo Log: Thread 1 Group 2 Seq 90198 Reading mem 0
  Mem# 0: +DATA/xifenfei/onlinelog/group_2.263.887465041
Block recovery completed at rba 90198.97219.16, scn 37.2117014236
Errors in file /u01/app/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_pmon_7690.trc  (incident=592118):
ORA-00600: internal error code, arguments: [17182], [0x7F96BDA2AA70], [], [], [], [], [], [], [], [], [], []
Incident details in: /u01/app/diag/rdbms/xifenfei/xifenfei/incident/incdir_592118/xifenfei_pmon_7690_i592118.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Exception [type: SIGSEGV, SI_KERNEL(general_protection)] [ADDR:0x0] [PC:0x97E6B5C, kghpmfal()+216] [flags: 0x0, count: 1]
Errors in file /u01/app/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_pmon_7690.trc  (incident=592119):
ORA-07445: exception encountered: core dump [kghpmfal()+216] [SIGSEGV] [ADDR:0x0] [PC:0x97E6B5C] [SI_KERNEL(general_protection)] []
ORA-00600: internal error code, arguments: [17182], [0x7F96BDA2AA70], [], [], [], [], [], [], [], [], [], []
Incident details in: /u01/app/diag/rdbms/xifenfei/xifenfei/incident/incdir_592119/xifenfei_pmon_7690_i592119.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Tue May 22 09:08:45 2018
Dumping diagnostic data in directory=[cdmp_20180522090845], requested by (instance=1, osid=7690 (PMON)), summary=[incident=592118].
Tue May 22 09:08:47 2018
Sweep [inc][592824]: completed
Sweep [inc][592823]: completed
Sweep [inc][592119]: completed
Sweep [inc][592118]: completed
Sweep [inc2][592824]: completed
Sweep [inc2][592119]: completed
Sweep [inc2][592118]: completed
Tue May 22 09:08:47 2018
ARC2 (ospid: 7834): terminating the instance due to error 472
Instance terminated by ARC2, pid = 7834

无法正常open

Completed: ALTER DATABASE   MOUNT
Tue May 22 09:26:44 2018
ALTER DATABASE OPEN
Beginning crash recovery of 1 threads
 parallel recovery started with 15 processes
Started redo scan
Completed redo scan
 read 12232 KB redo, 4787 data blocks need recovery
Started redo application at
 Thread 1: logseq 90199, block 233846
Recovery of Online Redo Log: Thread 1 Group 3 Seq 90199 Reading mem 0
  Mem# 0: +DATA/xifenfei/onlinelog/group_3.262.887465049
Completed redo application of 10.34MB
Completed crash recovery at
 Thread 1: logseq 90199, block 258311, scn 161030851622
 4787 data blocks read, 4787 data blocks written, 12232 redo k-bytes read
LGWR: STARTING ARCH PROCESSES
Tue May 22 09:26:45 2018
ARC0 started with pid=48, OS id=18632
Tue May 22 09:26:46 2018
ARC0: Archival started
LGWR: STARTING ARCH PROCESSES COMPLETE
ARC0: STARTING ARCH PROCESSES
Thread 1 advanced to log sequence 90200 (thread open)
Tue May 22 09:26:46 2018
ARC1 started with pid=49, OS id=18636
Tue May 22 09:26:46 2018
ARC2 started with pid=50, OS id=18640
Tue May 22 09:26:46 2018
ARC3 started with pid=51, OS id=18644
ARC1: Archival started
ARC2: Archival started
ARC1: Becoming the 'no FAL' ARCH
ARC1: Becoming the 'no SRL' ARCH
ARC2: Becoming the heartbeat ARCH
Thread 1 opened at log sequence 90200
  Current log# 5 seq# 90200 mem# 0: +DATA/xifenfei/onlinelog/group_5.280.887465135
Successful open of redo thread 1
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Tue May 22 09:26:46 2018
SMON: enabling cache recovery
[18512] Successfully onlined Undo Tablespace 2.
Undo initialization finished serial:0 start:2704839736 end:2704839986 diff:250 (2 seconds)
Verifying file header compatibility for 11g tablespace encryption..
Verifying 11g file header compatibility for tablespace encryption completed
SMON: enabling tx recovery
Database Characterset is AL32UTF8
Archived Log entry 84347 added for thread 1 sequence 90199 ID 0x103430df dest 1:
No Resource Manager plan active
replication_dependency_tracking turned off (no async multimaster replication found)
Tue May 22 09:26:47 2018
Errors in file /u01/app/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_p019_18664.trc  (incident=624628):
ORA-00600: internal error code, arguments: [17182], [0x7F7A4A50DBF8], [], [], [], [], [], [], [], [], [], []
Incident details in: /u01/app/diag/rdbms/xifenfei/xifenfei/incident/incdir_624628/xifenfei_p019_18664_i624628.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
ARC3: Archival started
ARC0: STARTING ARCH PROCESSES COMPLETE
Starting background process QMNC
Tue May 22 09:26:48 2018
QMNC started with pid=71, OS id=18737
Exception [type: SIGSEGV, SI_KERNEL(general_protection)] [ADDR:0x0] [PC:0x97ECF6C, kghalo()+570] [flags: 0x0, count: 1]
Errors in file /u01/app/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_p019_18664.trc  (incident=624629):
ORA-00600: internal error code, arguments: [17182], [0x7F7A4A50DBF8], [], [], [], [], [], [], [], [], [], []
Incident details in: /u01/app/diag/rdbms/xifenfei/xifenfei/incident/incdir_624629/xifenfei_p019_18664_i624629.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Tue May 22 09:26:49 2018
Tue May 22 09:26:50 2018
Starting background process EMNC
Tue May 22 09:26:50 2018
EMNC started with pid=76, OS id=18814
Exception [type: SIGSEGV, SI_KERNEL(general_protection)] [ADDR:0x0] [PC:0x97ECF6C, kghalo()+570] [flags: 0x0, count: 2]
Completed: ALTER DATABASE OPEN
Errors in file /u01/app/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_p019_18664.trc  (incident=624630):
ORA-07445: exception encountered: core dump [kghalo()+570] [SIGSEGV] [ADDR:0x0] [PC:0x97ECF6C] [SI_KERNEL(general_protection)] []
ORA-07445: exception encountered: core dump [kghalo()+570] [SIGSEGV] [ADDR:0x0] [PC:0x97ECF6C] [SI_KERNEL(general_protection)] []
ORA-00600: internal error code, arguments: [17182], [0x7F7A4A50DBF8], [], [], [], [], [], [], [], [], [], []
Incident details in: /u01/app/diag/rdbms/xifenfei/xifenfei/incident/incdir_624630/xifenfei_p019_18664_i624630.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Tue May 22 09:27:01 2018
Block recovery from logseq 90200, block 59 to scn 161030851961
Recovery of Online Redo Log: Thread 1 Group 5 Seq 90200 Reading mem 0
  Mem# 0: +DATA/xifenfei/onlinelog/group_5.280.887465135
Block recovery stopped at EOT rba 90200.935.16
Block recovery completed at rba 90200.935.16, scn 37.2117062009
Starting background process CJQ0
Tue May 22 09:27:01 2018
SMON: slave died unexpectedly, downgrading to serial recovery
Tue May 22 09:27:01 2018
CJQ0 started with pid=56, OS id=18922
Exception [type: SIGSEGV, SI_KERNEL(general_protection)] [ADDR:0x0] [PC:0x9823AA3, kghalo()+567] [flags: 0x0, count: 1]
Errors in file /u01/app/diag/rdbms/posdg/posdg/trace/posdg_p019_11656.trc  (incident=1136658):
ORA-07445: exception encountered: core dump [kghalo()+567] [SIGSEGV] [ADDR:0x0] [PC:0x9823AA3] [SI_KERNEL(general_protection)] []
ORA-00600: internal error code, arguments: [17182], [0x7F813F61DBF8], [], [], [], [], [], [], [], [], [], []
Incident details in: /u01/app/diag/rdbms/posdg/posdg/incident/incdir_1136658/posdg_p019_11656_i1136658.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file /u01/app/diag/rdbms/posdg/posdg/trace/posdg_ora_10925.trc:
ORA-00600: internal error code, arguments: [2252], [49410], [2147581953], [3726], [1009467392], [], [], [], [], [], [], []
Errors in file /u01/app/diag/rdbms/posdg/posdg/trace/posdg_ora_10925.trc:
ORA-00600: internal error code, arguments: [2252], [49410], [2147581953], [3726], [1009467392], [], [], [], [], [], [], []
Errors in file /u01/app/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_smon_18434.trc  (incident=624292):
ORA-00600: internal error code, arguments: [17182], [0x7F7488BDD7A0], [], [], [], [], [], [], [], [], [], []
Incident details in: /u01/app/diag/rdbms/xifenfei/xifenfei/incident/incdir_624292/xifenfei_smon_18434_i624292.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Exception [type: SIGSEGV, SI_KERNEL(general_protection)] [ADDR:0x0] [PC:0x97E64D7, kghalf()+537] [flags: 0x0, count: 1]
Errors in file /u01/app/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_smon_18434.trc  (incident=624293):
ORA-07445: exception encountered: core dump [kghalf()+537] [SIGSEGV] [ADDR:0x0] [PC:0x97E64D7] [SI_KERNEL(general_protection)] []
ORA-00600: internal error code, arguments: [17182], [0x7F7488BDD7A0], [], [], [], [], [], [], [], [], [], []
Incident details in: /u01/app/diag/rdbms/xifenfei/xifenfei/incident/incdir_624293/xifenfei_smon_18434_i624293.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Tue May 22 09:27:03 2018
Dumping diagnostic data in directory=[cdmp_20180522092703], requested by (instance=1, osid=18434 (SMON)), summary=[incident=624292].
PMON (ospid: 18383): terminating the instance due to error 474
Tue May 22 09:27:05 2018
opiodr aborting process unknown ospid (18839) as a result of ORA-1092
System state dump requested by (instance=1, osid=18383 (PMON)), summary=[abnormal instance termination].
System State dumped to trace file /u01/app/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_diag_18402_20180522092705.trc
Tue May 22 09:27:05 2018
ORA-1092 : opitsk aborting process
Instance terminated by PMON, pid = 18383

通过对于启动过程的观察,比较明显,由于数据库无法正常回滚,导致smon进程异常,从而使得数据库无法启动成功.恢复方法比较简单,就是对异常事务进行提交或者跳过即可

truncate IDL_UB1$恢复

世界之大无奇不有,已经记不清这是第几个客户咨询IDL_UB1$ 被truncate之后导致数据库无法启动的case了.idl_ub1$表是用来存储PL/SQL的代码单元的,包括DIANA等,IDL在这里代表Interface Definition Language. 在数据库的启动过程中通过10046跟踪可以知道,有类似:select /*+ index(idl_ub1$ i_idl_ub11) +*/ piece#,length,piece from idl_ub1$ where obj#=:1 and part=:2 and version=:3 order by piece#的查询语句,由于该表被truncate之后,导致数据库启动无法绕过该sql,而hang住无法完全open成功.下午闲着没事通过模拟,可以对该故障实现正常open,并且导出数据
模拟业务数据

create user xff identified by oracle;
grant dba to xff;
conn xff/oracle
create table t_xifenfei as select * from dba_objects;
create index i_xifenfei on t_xifenfei(object_id);
create view v_xifenfei as select * from t_xifenfei;
create or replace procedure proc1(
para1 varchar2,
para2 out varchar2,
para3 in out varchar2
) as
v_name varchar2(20);
begin
 v_name :='xifenfei';
 para3 := v_name;
dbms_output.put_line('para3:'||para3);
end;
/
alter system checkpoint;

创建xff账户,并且创建有代表性的表,索引,存储过程,视图等.

模拟truncate IDL_UB1$表

SQL> conn / as sysdba
Connected.
SQL> truncate table IDL_UB1$;
truncate table IDL_UB1$
*
ERROR at line 1:
ORA-03113: end-of-file on communication channel
Process ID: 31325
Session ID: 177 Serial number: 7

重启数据库

SQL> conn / as sysdba
Connected.
SQL> shutdown immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL>
SQL>
SQL>
SQL>
SQL> startup
ORACLE instance started.
Total System Global Area 7499329536 bytes
Fixed Size                  2267832 bytes
Variable Size            1409287496 bytes
Database Buffers         6073352192 bytes
Redo Buffers               14422016 bytes
Database mounted.

数据库在mount之后,一直处于hang住状态,查看alert日志

Sun May 20 17:02:34 2018
Exception [type: SIGSEGV, Address not mapped to object] [ADDR:0x18] [PC:0x98D94A7, hshuid()+273] [flags: 0x0, count: 1]
Errors in file /home/u01/app/oracle/diag/rdbms/test/test/trace/test_ora_31325.trc  (incident=21781):
ORA-07445: exception encountered: core dump [hshuid()+273] [SIGSEGV] [ADDR:0x18] [PC:0x98D94A7] [Address not mapped to object] []
Incident details in: /home/u01/app/oracle/diag/rdbms/test/test/incident/incdir_21781/test_ora_31325_i21781.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Sun May 20 17:02:37 2018
Sweep [inc][21781]: completed
Sweep [inc2][21781]: completed
Sun May 20 17:02:37 2018
Dumping diagnostic data in directory=[cdmp_20180520170237], requested by (instance=1, osid=31325), summary=[incident=21781].
Sun May 20 17:02:55 2018
Exception [type: SIGSEGV, Address not mapped to object] [ADDR:0x18] [PC:0x98D94A7, hshuid()+273] [flags: 0x0, count: 1]
Errors in file /home/u01/app/oracle/diag/rdbms/test/test/trace/test_m000_31373.trc  (incident=21821):
ORA-07445: exception encountered: core dump [hshuid()+273] [SIGSEGV] [ADDR:0x18] [PC:0x98D94A7] [Address not mapped to object] []
Incident details in: /home/u01/app/oracle/diag/rdbms/test/test/incident/incdir_21821/test_m000_31373_i21821.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Sun May 20 17:02:56 2018
Dumping diagnostic data in directory=[cdmp_20180520170256], requested by (instance=1, osid=31373 (M000)), summary=[incident=21821].

这类问题比较明显,正常方法无法打开,通过工具分析system文件,发现虽然truncate IDL_UB1$操作报错了,但是IDL_UB1$和对应的index I_IDL_UB11 obj#,dataobj#均已经改变,而且相关对象的segment header也变化为新dataobj#(truncate之后的),也就是说truncate在数据库中的主要更改操作已经完成.现在在缺少记录的情况下,数据库执行如下sql无法获取记录,从而无法open

PARSING IN CURSOR #140342551421712 len=132 dep=2 uid=0 oct=3 lid=0 tim=1526843464635335 hv=4260389146 ad='21af73218' sqlid='cvn54b7yz0s8u'
select /*+ index(idl_ub1$ i_idl_ub11) +*/ piece#,length,piece from idl_ub1$ where obj#=:1 and part=:2 and version=:3 order by piece#
END OF STMT
PARSE #140342551421712:c=0,e=14,p=0,cr=0,cu=0,mis=0,r=0,dep=2,og=4,plh=3246118364,tim=1526843464635335
BINDS #140342551421712:
 Bind#0
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=7fa40bec7d80  bln=22  avl=03  flg=05
  value=1310
 Bind#1
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=7fa40bec7d50  bln=24  avl=01  flg=05
  value=0
 Bind#2
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=7fa40bec7d20  bln=24  avl=06  flg=05
  value=184549376
EXEC #140342551421712:c=0,e=76,p=0,cr=0,cu=0,mis=0,r=0,dep=2,og=4,plh=3246118364,tim=1526843464635449
FETCH #140342551421712:c=0,e=4,p=0,cr=1,cu=0,mis=0,r=0,dep=2,og=4,plh=3246118364,tim=1526843464635462
STAT #140342551421712 id=1 cnt=0 pid=0 pos=1 obj=225 op='TABLE ACCESS BY INDEX ROWID IDL_UB1$ (cr=1 pr=0 pw=0 time=4 us cost=3 size=44 card=2)'
STAT #140342551421712 id=2 cnt=0 pid=1 pos=1 obj=236 op='INDEX RANGE SCAN I_IDL_UB11 (cr=1 pr=0 pw=0 time=4 us cost=2 size=0 card=2)'
CLOSE #140342551421712:c=0,e=2,dep=2,type=0,tim=1526843464635496

通过对数据库采用技术欺骗手段,让数据库启动相关sql能够获取到记录(和正常查询的相同),从而实现数据库正常open

SQL> startup mount
ORACLE instance started.
Total System Global Area 7499329536 bytes
Fixed Size                  2267832 bytes
Variable Size            1409287496 bytes
Database Buffers         6073352192 bytes
Redo Buffers               14422016 bytes
Database mounted.
SQL> alter database open;
Database altered.

open成功之后,后台报大量的ORA-08103: object no longer exists,通过分析是由于truncate IDL_UB1$没有完全成功,导致出现该错误.解决方法就是对Oracle数据字典进行人工更新,把没有完成的truncate操作在数据库中给予完成.
导出数据
exp导出数据成功
exp-data


但是expdp无法执行

[oracle@bogon oradata]$ expdp '"/ as sysdba"' schemas=xff file=1.dmp
Export: Release 11.2.0.4.0 - Production on Sun May 20 17:10:53 2018
Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
UDE-31623: operation generated ORACLE error 31623
ORA-31623: a job is not attached to this session via the specified handle
ORA-06512: at "SYS.DBMS_DATAPUMP", line 3326
ORA-06512: at "SYS.DBMS_DATAPUMP", line 4551
ORA-06512: at line 1

暂时未去研究对这个表进行重建,使用exp导出然后再imp导入是比较理想的办法

恶意删除bootstrap$导致数据库无法正常启动

有客户10.2.0.5的数据库关闭之后,无法正常启动报ORA-00704 ORA-00702错误.

Fri May 18 22:42:26  2018
ALTER DATABASE OPEN
Fri May 18 22:42:27  2018
Beginning crash recovery of 1 threads
 parallel recovery started with 7 processes
Fri May 18 22:42:27  2018
Started redo scan
Fri May 18 22:42:27  2018
Completed redo scan
 1 redo blocks read, 0 data blocks need recovery
Fri May 18 22:42:27  2018
Started redo application at
 Thread 1: logseq 2, block 2, scn 8448162573
Fri May 18 22:42:27  2018
Recovery of Online Redo Log: Thread 1 Group 2 Seq 2 Reading mem 0
  Mem# 0: D:\DATABASE\xifenfei\REDO02.LOG
Fri May 18 22:42:27  2018
Completed redo application
Fri May 18 22:42:27  2018
Completed crash recovery at
 Thread 1: logseq 2, block 3, scn 8448182575
 0 data blocks read, 0 data blocks written, 1 redo blocks read
Fri May 18 22:42:28  2018
LGWR: STARTING ARCH PROCESSES
ARC0 started with pid=23, OS id=3188
ARC1 started with pid=24, OS id=3168
ARC2 started with pid=25, OS id=996
ARC3 started with pid=26, OS id=432
ARC4 started with pid=27, OS id=3728
Fri May 18 22:42:28  2018
ARC0: Archival started
ARC1: Archival started
ARC5 started with pid=28, OS id=2876
Fri May 18 22:42:28  2018
ARC2: Archival started
ARC3: Archival started
ARC4: Archival started
ARC5: Archival started
LGWR: STARTING ARCH PROCESSES COMPLETE
Fri May 18 22:42:28  2018
Thread 1 advanced to log sequence 3 (thread open)
Thread 1 opened at log sequence 3
  Current log# 3 seq# 3 mem# 0: D:\DATABASE\xifenfei\REDO03.LOG
Successful open of redo thread 1
Fri May 18 22:42:28  2018
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Fri May 18 22:42:28  2018
ARC0: Becoming the 'no FAL' ARCH
ARC0: Becoming the 'no SRL' ARCH
Fri May 18 22:42:28  2018
ARC2: Becoming the heartbeat ARCH
Fri May 18 22:42:28  2018
SMON: enabling cache recovery
Fri May 18 22:42:28  2018
Errors in file d:\oracle\product\10.2.0\admin\xifenfei\udump\xifenfei_ora_3148.trc:
ORA-00704: 引导程序进程失败
ORA-00702: 引导程序版本 '' 与版本 '8.0.0.0.0' 不一致
Fri May 18 22:42:28  2018
Error 704 happened during db open, shutting down database
USER: terminating instance due to error 704
Instance terminated by USER, pid = 3148
ORA-1092 signalled during: ALTER DATABASE OPEN...

根据以前恢复经验ORA-00702: bootstrap verison ” inconsistent with version ’8.0.0.0.0′,很可能是由于bootstrap$表异常了.
通过dbv检查system文件确认没有坏块
dbv-system


通过bbed分析,确认记录被删除
把数据文件拷贝到本地,通过bbed进行分析,确认记录丢失

BBED> map
 File: d:/system01.dbf (0)
 Block: 379                                   Dba:0x00000000
------------------------------------------------------------
 KTB Data Block (Table/Cluster)
 struct kcbh, 20 bytes                      @0
 struct ktbbh, 48 bytes                     @20
 struct kdbh, 14 bytes                      @68
 struct kdbt[1], 4 bytes                    @82
 sb2 kdbr[24]                               @86
 ub1 freespace[1158]                        @134
 ub1 rowdata[6896]                          @1292
 ub4 tailchk                                @8188
BBED> p *kdbr[0]
rowdata[6875]
-------------
ub1 rowdata[6875]                           @8167     0x3c
BBED> x /rnnc
rowdata[6875]                               @8167
-------------
flag@8167: 0x3c (KDRHFL, KDRHFF, KDRHFD, KDRHFH)
lock@8168: 0x01
cols@8169:    0

故障原因跟踪
有人在数据库中注入了恶意脚本,导致数据库删除了bootstrap$中数据,关闭之后无法正常启动
delete-bootstrap$


处理方法
通过oracle bbed 修复数据字典,正常启动数据库

使用_unnest_subquery优化sql

一个复杂的sql查询,使用了大量EXISTS和NOT EXISTS 关联导致sql执行效率低下,这里挑选出来最核心的部分进行演示

SQL> explain plan for   select
  2   a.aab034, a.aac001
  3    from si_dp.ac01_ac02 a
  4   where exists (select 1
  5            from ic40
  6           where aac001 = a.aac001
  7             and aae045 <= '201803'
  8             and aae120 = '0')
  9     and not exists (select 1
 10            from ic15
 11           where aac001 = a.aac001
 12             and aae002 <= '201803')
 13     and not EXISTS (select aab001
 14            from ab01
 15           where aab019 in ('91', '93')
 16             AND aab001 = a.aab001)
 17    and exists (select 1
 18            from ac13
 19           where aac001 = a.aac001
 20             and aae140 = '11'
 21             and aae114 in ('0', '1')
 22             and aae002 <= '201803')
 23     AND EXISTS (SELECT 1
 24            FROM AC13
 25           WHERE AAC001 = A.AAC001
 26             and aae140 = '11'
 27             AND AAE143 = '02'
 28             AND AAE003 < '201707'
 29             AND AAE002 BETWEEN '201801' AND '201803'
 30             and aae114 = '1')
 31     AND not EXISTS (SELECT 1
 32         FROM AC13
 33           WHERE AAC001 = A.AAC001
 34             and aae140 = '11'
 35          AND AAE002 < '201801')
 36     AND not EXISTS (SELECT 1
 37            FROM ac02
 38           WHERE AAC001 = A.AAC001
 39             and aae140 = '11'
 40             AND AAE036 < date '2018-1-1');
Explained.
Elapsed: 00:00:00.36
SQL> select * from table (dbms_xplan.display);
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------
| Id  | Operation                         | Name               | Rows  | Bytes |TempSpc| Cost (%CPU)|
-----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                  |                    |     1 |   202 |       | 11172   (2)|
|   1 |  NESTED LOOPS SEMI                |                    |     1 |   202 |       | 11172   (2)|
|   2 |   NESTED LOOPS ANTI               |                    |     1 |   175 |       | 11168   (2)|
|   3 |    NESTED LOOPS SEMI              |                    |     1 |   150 |       | 11164   (2)|
|   4 |     NESTED LOOPS ANTI             |                    |     1 |   126 |       | 11160   (2)|
|   5 |      NESTED LOOPS SEMI            |                    |     1 |   104 |       | 11158   (2)|
|   6 |       NESTED LOOPS ANTI           |                    |     1 |    67 |       | 11145   (2)|
|   7 |        HASH JOIN ANTI             |                    |     1 |    50 |  8640K| 11143   (2)|
|   8 |         TABLE ACCESS FULL         | AC01_AC02          |   245K|  5755K|       |   356   (2)|
|   9 |         TABLE ACCESS FULL         | AC02               |   559K|    13M|       |  9346   (2)|
|  10 |        TABLE ACCESS BY INDEX ROWID| AB01               |     2 |    34 |       |     2   (0)|
|  11 |         INDEX UNIQUE SCAN         | PK_AB01            |     1 |       |       |     1   (0)|
|  12 |       TABLE ACCESS BY INDEX ROWID | AC13               |   325K|    11M|       |    13   (0)|
|  13 |        INDEX RANGE SCAN           | I_AC13_AAE143      |   446 |       |       |     4   (0)|
|  14 |      INDEX RANGE SCAN             | PK_IC15            |  1771K|    37M|       |     2   (0)|
|  15 |     TABLE ACCESS BY INDEX ROWID   | IC40               |    17M|   395M|       |     4   (0)|
|  16 |      INDEX RANGE SCAN             | PK_IC40            |     1 |       |       |     3   (0)|
|  17 |    TABLE ACCESS BY INDEX ROWID    | AC13               |    51M|  1236M|       |     4   (0)|
|  18 |     INDEX RANGE SCAN              | RELATION_233112_FK |     3 |       |       |     3   (0)|
|  19 |   TABLE ACCESS BY INDEX ROWID     | AC13               |    52M|  1350M|       |     4   (0)|
|  20 |    INDEX RANGE SCAN               | RELATION_233112_FK |     3 |       |       |     3   (0)|
-----------------------------------------------------------------------------------------------------

这条sql,在一个10.2.0.3的系统中执行了十几个小时无法出结果,开发商反馈,该大部分客户的11.2的环境中,大概十几分钟出结果.从来没有遇到此类情况.让我们给他优化sql.看到这个sql,第一反应就是很可能大量的NESTED LOOPS效率低下,怀疑统计信息错误,结果收集完统计信息之后,执行计划依旧,我就在思考怎么调整sql,让其不这样大量嵌套执行.想起来的_unnest_subquery是控制子查询嵌套转换的,从9i开始默认为true,尝试设置为false测试.

SQL> alter session set "_unnest_subquery"=false;
Session altered.
Elapsed: 00:00:00.00
SQL> explain plan for   select
  2   a.aab034, a.aac001
  3    from si_dp.ac01_ac02 a
  4   where exists (select 1
  5            from ic40
  6           where aac001 = a.aac001
  7             and aae045 <= '201803'
  8             and aae120 = '0')
  9     and not exists (select 1
 10            from ic15
 11           where aac001 = a.aac001
 12             and aae002 <= '201803')
 13     and not EXISTS (select aab001
 14            from ab01
 15           where aab019 in ('91', '93')
 16             AND aab001 = a.aab001)
 17    and exists (select 1
 18            from ac13
 19          where aac001 = a.aac001
 20             and aae140 = '11'
 21             and aae114 in ('0', '1')
 22             and aae002 <= '201803')
 23     AND EXISTS (SELECT 1
 24            FROM AC13
 25           WHERE AAC001 = A.AAC001
 26             and aae140 = '11'
 27             AND AAE143 = '02'
 28             AND AAE003 < '201707'
 29             AND AAE002 BETWEEN '201801' AND '201803'
 30             and aae114 = '1')
 31     AND not EXISTS (SELECT 1
 32            FROM AC13
 33           WHERE AAC001 = A.AAC001
 34             and aae140 = '11'
 35             AND AAE002 < '201801')
 36     AND not EXISTS (SELECT 1
 37            FROM ac02
 38           WHERE AAC001 = A.AAC001
 39             and aae140 = '11'
 40             AND AAE036 < date '2018-1-1');
Explained.
Elapsed: 00:00:00.07
SQL> select * from table (dbms_xplan.display);
PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
| Id  | Operation                     | Name             | Rows  | Bytes |TempSpc| Cost (%CPU)|
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |                  |   185K|    19M|       |  2991K  (2)|
|   1 |  FILTER                       |                  |       |       |       |            |
|   2 |   HASH JOIN RIGHT SEMI        |                  |   185K|    19M|    16M|   758K  (3)|
|   3 |    TABLE ACCESS BY INDEX ROWID| AC13             |   353K|    12M|       |  4556   (1)|
|   4 |     INDEX SKIP SCAN           | I_AC13_AAB001    | 23608 |       |       |  2287   (1)|
|   5 |    HASH JOIN SEMI             |                  |   201K|    14M|    11M|   751K  (3)|
|   6 |     HASH JOIN SEMI            |                  |   201K|  9452K|  8640K|   123K  (3)|
|   7 |      TABLE ACCESS FULL        | AC01_AC02        |   245K|  5755K|       |   357   (2)|
|   8 |      TABLE ACCESS FULL        | IC40             |    21M|   481M|       | 86122   (3)|
|   9 |     TABLE ACCESS FULL         | AC13             |    52M|  1350M|       |   530K  (3)|
|  10 |   INDEX RANGE SCAN            | PK_IC15          |     2 |    44 |       |     3   (0)|
|  11 |   VIEW                        | index$_join$_009 |     1 |    17 |       |     3  (34)|
|  12 |    HASH JOIN                  |                  |       |       |       |            |
|  13 |     INDEX RANGE SCAN          | PK_AB01          |     1 |    17 |       |     2   (0)|
|  14 |     INLIST ITERATOR           |                  |       |       |       |            |
|  15 |      INDEX RANGE SCAN         | IDX_AB01_AAB019  |     1 |    17 |       |     8   (0)|
|  16 |   TABLE ACCESS BY INDEX ROWID | AC13             |     2 |    50 |       |     5   (0)|
|  17 |    INDEX RANGE SCAN           | I_AC13_SEARCH    |   152 |       |       |     4   (0)|
|  18 |   TABLE ACCESS BY INDEX ROWID | AC02             |     1 |    26 |       |     4   (0)|
|  19 |    INDEX RANGE SCAN           | PK_AC02          |     1 |       |       |     3   (0)|
-----------------------------------------------------------------------------------------------

让开发设置该参数,然后执行sql,结果3分钟不到出结果,非常圆满完成任务.该sql还有进一步优化空间,但是考虑到已经满足要求,不再折腾.