完美解决dul处理clob字段乱码问题

使用过dul的人都知道,dul在处理clob字段的时候,经常性出现乱码,而且官方没有提供好的处理方法,通过不断的测试折腾终于在2013年春节解决了2012年困惑在我心中的疑惑,不再因为dul不可以处理clob而觉得odu/aul更加高级。这个算是2013年给自己的第一份ORACLE数据库恢复方面大礼包.
在dul中,虽然提供了LDR_OUTPUT_IN_UTF8选项,让所有的clob变为UTF8,但是在实际测试中,没有成功.这里对于clob字段处理结果对比:
配置init.dul参数

osd_big_endian_flag=false
osd_dba_file_bits=10
osd_c_struct_alignment=32
osd_file_leader_size=1
osd_word_size = 32
feedback = 1000
dc_columns=2000000
dc_tables=10000
dc_objects=1000000
dc_users=400
dc_segments=100000
USE_LOB_FILES =TRUE
control_file = control.dul
LDR_ENCLOSE_CHAR=|
db_block_size=8192
export_mode=FALSE
compatible=11

正常情况下测试clob字段

[oracle@xifenfei dul]$ ./dul
Data UnLoader: 10.2.0.5.20 - Internal Only - on Sat Jan 19 00:19:05 2013
with 64-bit io functions
Copyright (c) 1994 2013 Bernard van Duijnen All rights reserved.
 Strictly Oracle Internal Use Only
DUL> unload table chf.EVT_T_COMMON_LOG;
. unloading (index organized) table     LOB024001aa
DUL: Warning: Recreating file "LOB024001aa.ctl"
       0 rows unloaded
. unloading (index organized) table     LOB024001da
DUL: Warning: Recreating file "LOB024001da.ctl"
       0 rows unloaded
. unloading (index organized) table     LOB0240020a
DUL: Warning: Recreating file "LOB0240020a.ctl"
       0 rows unloaded
. unloading (index organized) table     LOB0240023a
DUL: Warning: Recreating file "LOB0240023a.ctl"
       0 rows unloaded
. unloading (index organized) table     LOB0240026a
DUL: Warning: Recreating file "LOB0240026a.ctl"
       0 rows unloaded
Preparing lob metadata from lob index
Reading LOB024001aa.dat 0 entries loaded and sorted 0 entries
Preparing lob metadata from lob index
Reading LOB024001da.dat 0 entries loaded and sorted 0 entries
Preparing lob metadata from lob index
Reading LOB0240020a.dat 0 entries loaded and sorted 0 entries
Preparing lob metadata from lob index
Reading LOB0240023a.dat 0 entries loaded and sorted 0 entries
Preparing lob metadata from lob index
Reading LOB0240026a.dat 0 entries loaded and sorted 0 entries
. unloading table          EVT_T_COMMON_LOG
DUL: Warning: Recreating file "CHF_EVT_T_COMMON_LOG.ctl"
    1863 rows unloaded
--修改CHF_EVT_T_COMMON_LOG.ctl中的导入表名为TEST.T_TEST
--创建测试表
SQL> create table t_test as
  2   select * from chf.EVT_T_COMMON_LOG where 1=0;
Table created.
--导入数据
[oracle@xifenfei dul]$ sqlldr test/test control=CHF_EVT_T_COMMON_LOG.ctl
SQL*Loader: Release 11.2.0.3.0 - Production on Fri Jan 18 23:50:32 2013
Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
Commit point reached - logical record count 64
Commit point reached - logical record count 128
…………
Commit point reached - logical record count 1856
Commit point reached - logical record count 1863
--测试数据是否乱码
SQL> desc chf.EVT_T_COMMON_LOG
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 ID                                        NOT NULL VARCHAR2(64)
 CASES_OF_STATISTICAL_SITATION                      CLOB
…………
 DEPARTMENT_ID                                      VARCHAR2(40)
 MOTIFY_MAN_ID                                      VARCHAR2(40)
SQL>  select CASES_OF_STATISTICAL_SITATION from t_test where rownum<3;
CASES_OF_STATISTICAL_SITATION
--------------------------------------------------------------------------------
b*kb
     _SfZz

通过试验证明,在dul处理clob字段的时候,很容易出现乱码,因为这里涉及到很多中情况(NLS_LANG,LANG,LDR_OUTPUT_IN_UTF8参数等),经过了多次试验,均不能成功,这里就是为了给出来一个大概的结论:dul在正常情况下不能完美的处理非英文的clob

修改后lob字段文件属性后测试clob字段

--修改CHF_EVT_T_COMMON_LOG.ctl中的导入表名为TEST.EVT_T_COMMON_LOG
--导入数据
[oracle@xifenfei dul]$ sqlldr test/test control=CHF_EVT_T_COMMON_LOG.ctl
SQL*Loader: Release 11.2.0.3.0 - Production on Fri Jan 18 23:50:32 2013
Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
Commit point reached - logical record count 64
Commit point reached - logical record count 128
…………
Commit point reached - logical record count 1856
Commit point reached - logical record count 1863
--测试数据是否乱码
SQL> select CASES_OF_STATISTICAL_SITATION from EVT_T_COMMON_LOG where rownum<3;
CASES_OF_STATISTICAL_SITATION
--------------------------------------------------------------------------------
1、案件统计情况截止至交班时间C时间 0 分),今日立难点问题C7,国庆北路桂门岭社区丽都花园路口,多次上

测试证明:修改了clob文件的相关属性后,完美实现dul处理clob乱码问题

dul恢复drop表测试

dul对被drop对象进行恢复,需要提供两个信息
1.被删除表所属表空间(非必须)
2.被删除表结构(必须)
模拟删除表

--创建测试表
SQL> create table t_dul_drop tablespace czum
  2  as
  3  select * from dba_tables;
Table created.
--备份被删除表数据,便于比较和提供测试表结构
SQL> create table t_dul_drop_bak tablespace users
  2  as select * from t_dul_drop;
Table created.
SQL> alter system switch logfile;
System altered.
SQL> select count(*) from t_dul_drop;
  COUNT(*)
----------
      1785
SQL> drop table chf.t_dul_drop purge;
Table dropped.
SQL> alter system checkpoint;
System altered.

使用logminer找到data_object_id

delete from "SYS"."OBJ$" where "OBJ#" = '68474' and "DATAOBJ#" = '68474'
and "OWNER#" = '61' and "NAME" = 'T_DUL_DROP' and "NAMESPACE" = '1' and
"SUBNAME" IS NULL and "TYPE#" = '2' and "CTIME" = TO_DATE('04-FEB-13', 'DD-MON-RR')
and "MTIME" = TO_DATE('04-FEB-13', 'DD-MON-RR') and "STIME" = TO_DATE('04-FEB-13', 'DD-MON-RR')
and "STATUS" = '1' and "REMOTEOWNER" IS NULL and "LINKNAME" IS NULL and "FLAGS" = '0'
and "OID$" IS NULL and "SPARE1" = '6' and "SPARE2" = '1' and "SPARE3" = '61' and
"SPARE4" IS NULL and "SPARE5" IS NULL and "SPARE6" IS NULL and ROWID = 'AAAAASAABAAAPzCAAV';

这里可以知道,被删除表的data_object_id为68474

DUL恢复被删除表

--dul版本
E:\dul10>dul.exe
Data UnLoader 10.2.4.37 - Oracle Internal Only - on Mon Feb 04 23:49:50 2013
with 64-bit io functions
Copyright (c) 1994 2010 Bernard van Duijnen All rights reserved.
 Strictly Oracle Internal use Only
DUL>  ALTER SESSION SET USE_SCANNED_EXTENT_MAP = TRUE;
Parameter altered
--扫描所属表空间
DUL> scan tablespace 6;
Scanning tablespace 6, data file 6 ...
  13 segment header and 331 data blocks
  tablespace 6, data file 6: 1279 blocks scanned
Reading EXT.dat 13 entries loaded and sorted 13 entries
Reading SEG.dat 13 entries loaded
Reading COMPATSEG.dat 0 entries loaded
Reading SCANNEDLOBPAGE.dat 0 entries loaded and sorted 0 entries
--scan tables得到需求表(可以核对数据样例)
DUL> scan tables;
UNLOAD TABLE OBJNO68474 ( COL001 VARCHAR2(11), COL002 VARCHAR2(30), COL003 VARCHAR2(6)
        , COL004 VARCHAR2(20), COL005 VARCHAR2(30), COL006 VARCHAR2(5), COL007 NUMBER
        , COL008 NUMBER, COL009 NUMBER, COL010 NUMBER, COL011 NUMBER
        , COL012 NUMBER, COL013 NUMBER, COL014 NUMBER, COL015 CHAR
        , COL016 NUMBER, COL017 NUMBER, COL018 VARCHAR2(3), COL019 VARCHAR2(1)
        , COL020 NUMBER, COL021 NUMBER, COL022 NUMBER, COL023 NUMBER
        , COL024 NUMBER, COL025 NUMBER, COL026 NUMBER, COL027 NUMBER
        , COL028 VARCHAR2(10), COL029 VARCHAR2(10), COL030 VARCHAR2(5), COL031 VARCHAR2(7)
        , COL032 NUMBER, COL033 DATE, COL034 VARCHAR2(3), COL035 VARCHAR2(12)
        , COL036 VARCHAR2(1), COL037 VARCHAR2(1), COL038 VARCHAR2(3), COL039 VARCHAR2(7)
        , COL040 VARCHAR2(7), COL041 VARCHAR2(7), COL042 VARCHAR2(8), COL043 VARCHAR2(3)
        , COL044 VARCHAR2(2), COL045 VARCHAR2(15), COL046 VARCHAR2(8), COL047 VARCHAR2(3)
        , COL048 VARCHAR2(3), COL049 VARCHAR2(8), COL050 VARCHAR2(8), COL051 VARCHAR2(5)
        , COL052 VARCHAR2(2), COL053 VARCHAR2(2), COL054 VARCHAR2(3), COL055 VARCHAR2(7) )
    STORAGE( DATAOBJNO 68474 );
--恢复删除表(业务提供表结构)
DUL> unload table t_dul_drop(
  2  OWNER                              VARCHAR2(30),
  3  TABLE_NAME                         VARCHAR2(30),
  4  TABLESPACE_NAME                                    VARCHAR2(30),
  5  CLUSTER_NAME                                       VARCHAR2(30),
  6  IOT_NAME                                           VARCHAR2(30),
  7  STATUS                                             VARCHAR2(8) ,
  8  PCT_FREE                                           NUMBER      ,
  9  PCT_USED                                           NUMBER      ,
 10  INI_TRANS                                          NUMBER      ,
 11  MAX_TRANS                                          NUMBER      ,
 12  INITIAL_EXTENT                                     NUMBER      ,
 13  NEXT_EXTENT                                        NUMBER      ,
 14  MIN_EXTENTS                                        NUMBER      ,
 15  MAX_EXTENTS                                        NUMBER      ,
 16  PCT_INCREASE                                       NUMBER      ,
 17  FREELISTS                                          NUMBER      ,
 18  FREELIST_GROUPS                                    NUMBER      ,
 19  LOGGING                                            VARCHAR2(3) ,
 20  BACKED_UP                                          VARCHAR2(1) ,
 21  NUM_ROWS                                           NUMBER      ,
 22  BLOCKS                                             NUMBER      ,
 23  EMPTY_BLOCKS                                       NUMBER      ,
 24  AVG_SPACE                                          NUMBER      ,
 25  CHAIN_CNT                                          NUMBER      ,
 26  AVG_ROW_LEN                                        NUMBER      ,
 27  AVG_SPACE_FREELIST_BLOCKS                          NUMBER      ,
 28  NUM_FREELIST_BLOCKS                                NUMBER      ,
 29  DEGREE                                             VARCHAR2(20),
 30  INSTANCES                                          VARCHAR2(20),
 31  CACHE                                              VARCHAR2(10),
 32  TABLE_LOCK                                         VARCHAR2(8) ,
 33  SAMPLE_SIZE                                        NUMBER      ,
 34  LAST_ANALYZED                                      DATE        ,
 35  PARTITIONED                                        VARCHAR2(3) ,
 36  IOT_TYPE                                           VARCHAR2(12),
 37  TEMPORARY                                          VARCHAR2(1) ,
 38  SECONDARY                                          VARCHAR2(1) ,
 39  NESTED                                             VARCHAR2(3) ,
 40  BUFFER_POOL                                        VARCHAR2(7) ,
 41  FLASH_CACHE                                        VARCHAR2(7) ,
 42  CELL_FLASH_CACHE                                   VARCHAR2(7) ,
 43  ROW_MOVEMENT                                       VARCHAR2(8) ,
 44  GLOBAL_STATS                                       VARCHAR2(3) ,
 45  USER_STATS                                         VARCHAR2(3) ,
 46  DURATION                                           VARCHAR2(15),
 47  SKIP_CORRUPT                                       VARCHAR2(8) ,
 48  MONITORING                                         VARCHAR2(3) ,
 49  CLUSTER_OWNER                                      VARCHAR2(30),
 50  DEPENDENCIES                                       VARCHAR2(8) ,
 51  COMPRESSION                                        VARCHAR2(8) ,
 52  COMPRESS_FOR                                       VARCHAR2(12),
 53  DROPPED                                            VARCHAR2(3) ,
 54  READ_ONLY                                          VARCHAR2(3) ,
 55  SEGMENT_CREATED                                    VARCHAR2(3) ,
 56  RESULT_CACHE                                       VARCHAR2(7))
 57   STORAGE( DATAOBJNO 68474 );
. unloading table                T_DUL_DROP
DUL: Warning: Recreating file "T_DUL_DROP.ctl"
    1785 rows unloaded

模拟业务规则提供,创建表

SQL> create table t_dul_drop as select * from t_dul_drop_bak where 1=0;
Table created.

导入数据

e:\dul10>sqlldr chf/xifenfei control=T_DUL_DROP.ctl
SQL*Loader: Release 11.2.0.3.0 - Production on Mon Feb 4 23:35:57 2013
Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
Commit point reached - logical record count 64
Commit point reached - logical record count 128
Commit point reached - logical record count 192
Commit point reached - logical record count 256
Commit point reached - logical record count 320
Commit point reached - logical record count 384
Commit point reached - logical record count 448
Commit point reached - logical record count 512
Commit point reached - logical record count 576
Commit point reached - logical record count 640
Commit point reached - logical record count 704
Commit point reached - logical record count 768
Commit point reached - logical record count 832
Commit point reached - logical record count 896
Commit point reached - logical record count 960
Commit point reached - logical record count 1024
Commit point reached - logical record count 1088
Commit point reached - logical record count 1152
Commit point reached - logical record count 1216
Commit point reached - logical record count 1280
Commit point reached - logical record count 1344
Commit point reached - logical record count 1408
Commit point reached - logical record count 1472
Commit point reached - logical record count 1536
Commit point reached - logical record count 1600
Commit point reached - logical record count 1664
Commit point reached - logical record count 1728
Commit point reached - logical record count 1785

恢复数据结果

SQL> select count(*) from t_dul_drop;
  COUNT(*)
----------
      1785
SQL> select owner,table_name from t_dul_drop where rownum<10;
OWNER                          TABLE_NAME
------------------------------ ------------------------------
SYS                            IDL_CHAR$
SYS                            IDL_UB2$
SYS                            IDL_SB4$
SYS                            ERROR$
SYS                            SETTINGS$
SYS                            NCOMP_DLL$
SYS                            PROCEDUREJAVA$
SYS                            PROCEDUREC$
SYS                            PROCEDUREPLSQL$
9 rows selected.

dul恢复truncate表测试

dul 恢复truncate 测试
准备dul测试

SQL> select count(*) from t_xifenfei;
  COUNT(*)
----------
     67854
SQL> desc t_xifenfei
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 OWNER                                              VARCHAR2(30)
 OBJECT_NAME                                        VARCHAR2(128)
 SUBOBJECT_NAME                                     VARCHAR2(30)
 OBJECT_ID                                          NUMBER
 DATA_OBJECT_ID                                     NUMBER
 OBJECT_TYPE                                        VARCHAR2(19)
 CREATED                                            DATE
 LAST_DDL_TIME                                      DATE
 TIMESTAMP                                          VARCHAR2(19)
 STATUS                                             VARCHAR2(7)
 TEMPORARY                                          VARCHAR2(1)
 GENERATED                                          VARCHAR2(1)
 SECONDARY                                          VARCHAR2(1)
 NAMESPACE                                          NUMBER
 EDITION_NAME                                       VARCHAR2(30)
SQL> select object_id,data_object_id from user_objects where object_name='T_XIFENFEI';
 OBJECT_ID DATA_OBJECT_ID
---------- --------------
     68332          68332
SQL> TRUNCATE TABLE T_XIFENFEI;
Table truncated.
SQL> ALTER SYSTEM CHECKPOINT;
System altered.
SQL> select object_id,data_object_id from user_objects where object_name='T_XIFE
NFEI';
 OBJECT_ID DATA_OBJECT_ID
---------- --------------
     68332          68468
SQL> SELECT TABLESPACE_NAME FROM DBA_SEGMENTS WHERE SEGMENT_NAME='T_XIFENFEI';
TABLESPACE_NAME
------------------------------
USERS
SQL> SELECT FILE_ID FROM DBA_DATA_FILES WHERE TABLESPACE_NAME='USERS';
   FILE_ID
----------
         4
SQL> SELECT file#,RFILE#,NAME FROM V$DATAFILE;
     FILE#     RFILE# NAME
---------- ---------- --------------------------------------------------
         1          1 E:\ORACLE\ORADATA\XIFENFEI\SYSTEM01.DBF
         2          2 E:\ORACLE\ORADATA\XIFENFEI\SYSAUX01.DBF
         3          3 E:\ORACLE\ORADATA\XIFENFEI\UNDOTBS01.DBF
         4          4 E:\ORACLE\ORADATA\XIFENFEI\USERS01.DBF
         5          5 E:\ORACLE\ORADATA\XIFENFEI\UNDO01.DBF
         6          6 E:\ORACLE\ORADATA\XIFENFEI\CZUM01.DBF
6 rows selected.

备注说明:因为我们为了测试,所以直接查询出来了data_object_id,在实际的恢复中,我们需要使用logminer来找出来历史dataobj#

dul恢复truncate 表

e:\dul10>dul.exe
Data UnLoader 10.2.4.37 - Oracle Internal Only - on Mon Feb 04 00:20:08 2013
with 64-bit io functions
Copyright (c) 1994 2010 Bernard van Duijnen All rights reserved.
 Strictly Oracle Internal use Only
DUL: Warning: Recreating file "dul.log"
Reading USER.dat 65 entries loaded
Reading OBJ.dat 67944 entries loaded and sorted 67944 entries
Reading SCANNEDLOBPAGE.dat 8 entries loaded and sorted 8 entries
Reading TAB.dat 1869 entries loaded
Reading COL.dat 77409 entries loaded and sorted 77409 entries
Reading SEG.dat 23 entries loaded
Reading EXT.dat 54 entries loaded and sorted 54 entries
Reading TABPART.dat 110 entries loaded and sorted 110 entries
Reading TABCOMPART.dat 1 entries loaded and sorted 1 entries
Reading TABSUBPART.dat 32 entries loaded and sorted 32 entries
Reading INDPART.dat 127 entries loaded and sorted 127 entries
Reading INDCOMPART.dat 0 entries loaded and sorted 0 entries
Reading INDSUBPART.dat 0 entries loaded and sorted 0 entries
Reading IND.dat 3729 entries loaded
Reading LOB.dat
DUL: Warning: Increased the size of DC_LOBS from 1024 to 8192 entries
 1109 entries loaded
Reading ICOL.dat 4868 entries loaded
Reading COLTYPE.dat 2823 entries loaded
Reading TYPE.dat 2082 entries loaded
Reading ATTRIBUTE.dat 8736 entries loaded
Reading COLLECTION.dat 591 entries loaded
Reading COMPATSEG.dat 0 entries loaded
Reading BOOTSTRAP.dat 60 entries loaded
Reading LOBFRAG.dat 1 entries loaded and sorted 1 entries
Reading LOBCOMPPART.dat 0 entries loaded and sorted 0 entries
Reading UNDO.dat 24 entries loaded
Reading TS.dat 7 entries loaded
Reading PROPS.dat 36 entries loaded
Database character set is ZHS16GBK
Database national character set is AL16UTF16
Found db_id = 1422012639
Found db_name = XIFENFEI
DUL> SCAN DATAFILE 4;
DUL: Warning: Recreating file "EXT.dat"
DUL: Warning: Recreating file "SEG.dat"
DUL: Warning: Recreating file "COMPATSEG.dat"
DUL: Warning: Recreating file "SCANNEDLOBPAGE.dat"
Scanning tablespace 4, data file 4 ...
  23 segment header and 1601 data blocks
  tablespace 4, data file 4: 2079 blocks scanned
Reading EXT.dat 54 entries loaded and sorted 54 entries
Reading SEG.dat 23 entries loaded
Reading COMPATSEG.dat 0 entries loaded
Reading SCANNEDLOBPAGE.dat 8 entries loaded and sorted 8 entries
DUL>  ALTER SESSION SET USE_SCANNED_EXTENT_MAP = TRUE;
Parameter altered
DUL> scan extents;
DUL: Warning: Recreating file "seen_tab.dat"
DUL: Warning: Recreating file "seen_col.dat"
Scanning extents without segment header
Scanning extent id (dba 0, obj 68332)
Analyzing segment: data object id 68332
  heap organized table
Col    Seen  Max PCT  PRINT  NUMBERS DATES TIMESTAMP WITH TZ INTRVAL  ROWIDS LOB
 no   count Size NUL 75%100% AnyNice AnyNice AnyNice AnyNice Y2M D2S AnyNice
  1   67854   18   0 100 100   0   0   0   0   0   0   0   0   0   0  44   0   0
…………
 14   67854    2   0   0   0 100 100   0   0   0   0   0   0   0   0   0   0   0
|SYS| |C_FILE#_BLOCK#| || |8| |8| |CLUSTER| |20-JAN-2013 AD 17:13:18| |20-JAN-20
…………
:18| |2013-01-20:17:13:18| |VALID| |N| |N| |N| |1|
UNLOAD TABLE OBJNO68332 ( COL001 VARCHAR2(18), COL002 VARCHAR2(30), COL003 VARCH
AR2(27)
        , COL004 NUMBER, COL005 NUMBER, COL006 VARCHAR2(18), COL007 DATE
        , COL008 DATE, COL009 VARCHAR2(19), COL010 VARCHAR2(7), COL011 VARCHAR2(
1)
        , COL012 VARCHAR2(1), COL013 VARCHAR2(1), COL014 NUMBER )
    STORAGE( DATAOBJNO 68332 );
--自己拼接语句
DUL> UNLOAD TABLE t_xifenfei_1(OWNER VARCHAR2(30),OBJECT_NAME  VARCHAR2(128),SUB
OBJECT_NAME    VARCHAR2(30),OBJECT_ID  NUMBER,DATA_OBJECT_ID    NUMBER,OBJECT_TY
PE  VARCHAR2(19),CREATED    DATE,LAST_DDL_TIME DATE,TIMESTAMP    VARCHAR2(19),ST
ATUS VARCHAR2(7),TEMPORARY    VARCHAR2(1),GENERATED    VARCHAR2(1),SECONDARY
VARCHAR2(1),NAMESPACE    NUMBER,EDITION_NAME VARCHAR2(30))  STORAGE( DATAOBJNO 6
8332 );
. unloading table              T_XIFENFEI_1
DUL: Warning: Recreating file "T_XIFENFEI_1.ctl"
   67854 rows unloaded

补充说明:
1.在最新的dul 10.2.0.5.20中,不支持scan extents命令
2.在最新的dul 10.2.0.5.20中,挖出来的无数据字典数据,字符串转换为16进制值

table中各种类型block坏块是否能被跳过

在table遇到的各种坏块中,大部分情况,我们都可以通过设置event 10231或者dbms_repair来跳过坏块,抢救其他数据;但是在部分情况下,我们设置了他们依然不能跳过坏块,数据库依然报ORA-01578,本文测试了table中各种类型的block,证明在哪些blog出现异常之后不能被跳过.
如果是事务数据块出现坏块可以通过dbms_repair.skip_corrupt_blocks来标记坏块(也可以使用event 10231)来实现;对于BITMAP BLOCK如果出现坏块不会对于读取数据无影响(至于其他操作,请见后续blog);SEGMENT HEADER如果出现坏块,无法通过跳过坏块来实现获取其他数据(正常block)
创建测试表

SQL> select * from v$version;
BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - Production
PL/SQL Release 11.2.0.3.0 - Production
CORE    11.2.0.3.0      Production
TNS for Linux: Version 11.2.0.3.0 - Production
NLSRTL Version 11.2.0.3.0 - Production
SQL> create table t_xifenfei
  2  tablespace users
  3  as
  4  select * from dba_objects;
Table created.
SQL>  select count(*) from chf.t_xifenfei;
  COUNT(*)
----------
     74663

查询相关block信息

SQL> select SEGMENT_NAME,HEADER_FILE,HEADER_BLOCK,blocks,extents from DBA_SEGMENTS
  2  WHERE OWNER='CHF' AND SEGMENT_NAME='T_XIFENFEI';
SEGMENT_NAME    HEADER_FILE HEADER_BLOCK     BLOCKS    EXTENTS
--------------- ----------- ------------ ---------- ----------
T_XIFENFEI                4          378       1152         24
SQL>   select
  2    dbms_rowid.rowid_relative_fno(rowid)rel_fno,
  3    max(dbms_rowid.rowid_block_number(rowid)) max_block,
  4    min(dbms_rowid.rowid_block_number(rowid)) min_block
  5    from chf.t_xifenfei
  6    group by dbms_rowid.rowid_relative_fno(rowid);
   REL_FNO  MAX_BLOCK  MIN_BLOCK
---------- ---------- ----------
         4       1728        379
SQL> select EXTENT_ID,FILE_ID,BLOCK_ID,blocks from dba_extents where owner='CHF'
   2  AND SEGMENT_NAME='T_XIFENFEI';
 EXTENT_ID    FILE_ID   BLOCK_ID     BLOCKS
---------- ---------- ---------- ----------
         0          4        376          8
         1          4        640          8
         2          4        648          8
         3          4        656          8
         4          4        664          8
         5          4        672          8
         6          4        680          8
         7          4        688          8
         8          4        696          8
         9          4        704          8
        10          4        712          8
        11          4        720          8
        12          4        728          8
        13          4        736          8
        14          4        744          8
        15          4        752          8
        16          4        768        128
        17          4        896        128
        18          4       1024        128
        19          4       1152        128
        20          4       1280        128
        21          4       1408        128
        22          4       1536        128
        23          4       1664        128

通过这里可以知道:真正的存储数据是从block 379开始,至于block 376、377、378是什么,使用dump block分析

验证block类型

SQL> alter system dump datafile 4 block 376;
System altered.
SQL> alter system dump datafile 4 block 377;
System altered.
SQL> alter system dump datafile 4 block 378;
System altered.
SQL> alter system dump datafile 4 block 379;
System altered.
--该block是另外extent的开始,所以也尝试分析是否有特殊之处
SQL> alter system dump datafile 4 block 640;
System altered.
--dump 文件header信息
Start dump data blocks tsn: 4 file#:4 minblk 376 maxblk 376
Block dump from cache:
Dump of buffer cache at level 4 for tsn=4 rdba=16777592
Block dump from disk:
buffer tsn: 4 rdba: 0x01000178 (4/376)
scn: 0x0b8c.3bfc6517 seq: 0x04 flg: 0x04 tail: 0x65172004
frmt: 0x02 chkval: 0xc8b3 type: 0x20=FIRST LEVEL BITMAP BLOCK
Start dump data blocks tsn: 4 file#:4 minblk 377 maxblk 377
Block dump from cache:
Dump of buffer cache at level 4 for tsn=4 rdba=16777593
Block dump from disk:
buffer tsn: 4 rdba: 0x01000179 (4/377)
scn: 0x0b8c.3bfc6517 seq: 0x18 flg: 0x04 tail: 0x65172118
frmt: 0x02 chkval: 0x9e8c type: 0x21=SECOND LEVEL BITMAP BLOCK
Start dump data blocks tsn: 4 file#:4 minblk 378 maxblk 378
Block dump from cache:
Dump of buffer cache at level 4 for tsn=4 rdba=16777594
BH (0x2a7f7f0c) file#: 4 rdba: 0x0100017a (4/378) class: 4 ba: 0x2a742000
  set: 5 pool: 3 bsz: 8192 bsi: 0 sflg: 1 pwc: 0,0
  dbwrid: 0 obj: 76372 objn: 76372 tsn: 4 afn: 4 hint: f
  hash: [0x3150a748,0x3150a748] lru: [0x2a7f8094,0x2a7f7ee4]
  lru-flags: hot_buffer
  ckptq: [NULL] fileq: [NULL] objq: [0x2f72dc34,0x2f72dc34] objaq: [0x2f72dc2c,0x2f72dc2c]
  st: XCURRENT md: NULL fpin: 'ktewh25: kteinicnt' tch: 1
  flags:
  LRBA: [0x0.0.0] LSCN: [0x0.0] HSCN: [0xffff.ffffffff] HSUB: [65535]
Block dump from disk:
buffer tsn: 4 rdba: 0x0100017a (4/378)
scn: 0x0b8c.3bfc651b seq: 0x01 flg: 0x04 tail: 0x651b2301
frmt: 0x02 chkval: 0xb2ae type: 0x23=PAGETABLE SEGMENT HEADER
Start dump data blocks tsn: 4 file#:4 minblk 379 maxblk 379
Block dump from cache:
Dump of buffer cache at level 4 for tsn=4 rdba=16777595
Block dump from disk:
buffer tsn: 4 rdba: 0x0100017b (4/379)
scn: 0x0b8c.3bfc6494 seq: 0x01 flg: 0x04 tail: 0x64940601
frmt: 0x02 chkval: 0x0567 type: 0x06=trans data
Start dump data blocks tsn: 4 file#:4 minblk 640 maxblk 640
Block dump from cache:
Dump of buffer cache at level 4 for tsn=4 rdba=16777856
Block dump from disk:
buffer tsn: 4 rdba: 0x01000280 (4/640)
scn: 0x0b8c.3bfc6496 seq: 0x01 flg: 0x04 tail: 0x64960601
frmt: 0x02 chkval: 0x0efe type: 0x06=trans data

这里可以知道:
1.block 376、377为BITMAP BLOCK
2.block 378为SEGMENT HEADER(和dba_segments视图中一致)
3.除extent 0中有特殊(含BITMAP BLOCK和SEGMENT HEADER)block,其他extent只包含事务数据

测试block 640

--block 640包含条数
SQL> select   count(rowid)
  2    from chf.t_xifenfei
  3     where dbms_rowid.rowid_block_number(rowid)=640
  4  and dbms_rowid.rowid_relative_fno(rowid)=4;
COUNT(ROWID)
------------
          79
--bbed修改tailchk
BBED> set filename '/u01/oracle/oradata/ora11g/users01.dbf'
        FILENAME        /u01/oracle/oradata/ora11g/users01.dbf
BBED> set block 640
        BLOCK#          640
BBED> set mode edit
        MODE            Edit
BBED> p tailchk
ub4 tailchk                                 @8188     0x64960601
BBED> m /x 64960602
Warning: contents of previous BIFILE will be lost. Proceed? (Y/N) y
 File: /u01/oracle/oradata/ora11g/users01.dbf (0)
 Block: 640              Offsets: 8188 to 8191           Dba:0x00000000
------------------------------------------------------------------------
 64960602
 <32 bytes per line>
BBED> sum apply
Check value for File 0, Block 640:
current = 0xf80b, required = 0xf80b
BBED> verify
DBVERIFY - Verification starting
FILE = /u01/oracle/oradata/ora11g/users01.dbf
BLOCK = 640
Block 640 is corrupt
Corrupt block relative dba: 0x01000280 (file 0, block 640)
Fractured block found during verification
Data in bad block:
 type: 6 format: 2 rdba: 0x01000280
 last change scn: 0x0b8c.3bfc6496 seq: 0x1 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x02069664
 check value in block header: 0xf80b
 computed block checksum: 0x0
DBVERIFY - Verification complete
Total Blocks Examined         : 1
Total Blocks Processed (Data) : 0
Total Blocks Failing   (Data) : 0
Total Blocks Processed (Index): 0
Total Blocks Failing   (Index): 0
Total Blocks Empty            : 0
Total Blocks Marked Corrupt   : 1
Total Blocks Influx           : 2
Message 531 not found;  product=RDBMS; facility=BBED
--查询坏块
SQL> select count(*) from chf.t_xifenfei;
select count(*) from chf.t_xifenfei
                         *
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 4, block # 640)
ORA-01110: data file 4: '/u01/oracle/oradata/ora11g/users01.dbf'
--跳过坏块
SQL> exec dbms_repair.skip_corrupt_blocks('CHF','T_XIFENFEI');
PL/SQL procedure successfully completed.
SQL>  select skip_corrupt from dba_tables where table_name='T_XIFENFEI' AND OWNER='CHF';
SKIP_COR
--------
ENABLED
SQL>  select count(*) from chf.t_xifenfei;
  COUNT(*)
----------
     74584
--修复坏块
BBED> m /x 01069664
Warning: contents of previous BIFILE will be lost. Proceed? (Y/N) y
 File: /u01/oracle/oradata/ora11g/users01.dbf (0)
 Block: 640              Offsets: 8188 to 8191           Dba:0x00000000
------------------------------------------------------------------------
 01069664
 <32 bytes per line>
BBED> p tailchk
ub4 tailchk                                 @8188     0x64960601
BBED> sum apply
Check value for File 0, Block 640:
current = 0x0efe, required = 0x0efe
BBED> verify
DBVERIFY - Verification starting
FILE = /u01/oracle/oradata/ora11g/users01.dbf
BLOCK = 640
DBVERIFY - Verification complete
Total Blocks Examined         : 1
Total Blocks Processed (Data) : 1
Total Blocks Failing   (Data) : 0
Total Blocks Processed (Index): 0
Total Blocks Failing   (Index): 0
Total Blocks Empty            : 0
Total Blocks Marked Corrupt   : 0
Total Blocks Influx           : 0
Message 531 not found;  product=RDBMS; facility=BBED
--除掉标记表坏块
SQL> BEGIN
  2  DBMS_REPAIR.SKIP_CORRUPT_BLOCKS (
  3  SCHEMA_NAME => 'CHF',
  4  OBJECT_NAME => 'T_XIFENFEI',
  5  OBJECT_TYPE => dbms_repair.table_object,
  6  FLAGS => dbms_repair.NOSKIP_FLAG);
  7  END;
  8  /
PL/SQL procedure successfully completed.
SQL> select skip_corrupt from dba_tables where table_name='T_XIFENFEI' AND OWNER='CHF';
SKIP_COR
--------
DISABLED
--查询表记录正常
SQL>  select count(*) from chf.t_xifenfei;
  COUNT(*)
----------
     74663

在后续的操作中,也是按照类似步骤操作,考虑到篇幅有限,部分过程不再贴出来

测试block 379

SQL>  select count(*) from chf.t_xifenfei;
 select count(*) from chf.t_xifenfei
                          *
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 4, block # 379)
ORA-01110: data file 4: '/u01/oracle/oradata/ora11g/users01.dbf'
SQL> exec dbms_repair.skip_corrupt_blocks('CHF','T_XIFENFEI');
PL/SQL procedure successfully completed.
SQL> select count(*) from chf.t_xifenfei;
  COUNT(*)
----------
     74575

测试block 378

BBED> set block 378
        BLOCK#          378
--segment header 不支持bbed查看结构
BBED> p tailchk
BBED-00400: invalid blocktype (35)
BBED> map
 File: /u01/oracle/oradata/ora11g/users01.dbf (0)
 Block: 378                                   Dba:0x00000000
------------------------------------------------------------
BBED-00400: invalid blocktype (35)
BBED> set offset 8188
        OFFSET          8188
BBED> d
 File: /u01/oracle/oradata/ora11g/users01.dbf (0)
 Block: 378              Offsets: 8188 to 8191           Dba:0x00000000
------------------------------------------------------------------------
 01231b65
 <32 bytes per line>
BBED> m /x 651b2302
Warning: contents of previous BIFILE will be lost. Proceed? (Y/N) y
 File: /u01/oracle/oradata/ora11g/users01.dbf (0)
 Block: 378              Offsets: 8188 to 8191           Dba:0x00000000
------------------------------------------------------------------------
 651b2302
 <32 bytes per line>
BBED> sum apply
Check value for File 0, Block 378:
current = 0xedf2, required = 0xedf2
--验证坏块
BBED> verify
DBVERIFY - Verification starting
FILE = /u01/oracle/oradata/ora11g/users01.dbf
BLOCK = 378
Block 378 is corrupt
Corrupt block relative dba: 0x0100017a (file 0, block 378)
Fractured block found during verification
Data in bad block:
 type: 35 format: 2 rdba: 0x0100017a
 last change scn: 0x0b8c.3bfc651b seq: 0x1 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x02231b65
 check value in block header: 0xedf2
 computed block checksum: 0x0
DBVERIFY - Verification complete
Total Blocks Examined         : 1
Total Blocks Processed (Data) : 0
Total Blocks Failing   (Data) : 0
Total Blocks Processed (Index): 0
Total Blocks Failing   (Index): 0
Total Blocks Empty            : 0
Total Blocks Marked Corrupt   : 1
Total Blocks Influx           : 2
Message 531 not found;  product=RDBMS; facility=BBED
SQL>  select count(*) from chf.t_xifenfei;
 select count(*) from chf.t_xifenfei
                          *
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 4, block # 378)
ORA-01110: data file 4: '/u01/oracle/oradata/ora11g/users01.dbf'
--标记跳过坏块
SQL> exec dbms_repair.skip_corrupt_blocks('CHF','T_XIFENFEI');
PL/SQL procedure successfully completed.
--查询依然失败
SQL> select count(*) from chf.t_xifenfei;
select count(*) from chf.t_xifenfei
                         *
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 4, block # 378)
ORA-01110: data file 4: '/u01/oracle/oradata/ora11g/users01.dbf'

测试block 377

BBED> m /x 18211766
Warning: contents of previous BIFILE will be lost. Proceed? (Y/N) y
 File: /u01/oracle/oradata/ora11g/users01.dbf (0)
 Block: 377              Offsets: 8188 to 8191           Dba:0x00000000
------------------------------------------------------------------------
 18211766
 <32 bytes per line>
BBED> sum apply
Check value for File 0, Block 377:
current = 0x9d8c, required = 0x9d8c
--bbed验证为坏块
BBED> verify
DBVERIFY - Verification starting
FILE = /u01/oracle/oradata/ora11g/users01.dbf
BLOCK = 377
Block 377 is corrupt
Corrupt block relative dba: 0x01000179 (file 0, block 377)
Fractured block found during verification
Data in bad block:
 type: 33 format: 2 rdba: 0x01000179
 last change scn: 0x0b8c.3bfc6517 seq: 0x18 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x66172118
 check value in block header: 0x9d8c
 computed block checksum: 0x0
DBVERIFY - Verification complete
Total Blocks Examined         : 1
Total Blocks Processed (Data) : 0
Total Blocks Failing   (Data) : 0
Total Blocks Processed (Index): 0
Total Blocks Failing   (Index): 0
Total Blocks Empty            : 0
Total Blocks Marked Corrupt   : 1
Total Blocks Influx           : 2
Message 531 not found;  product=RDBMS; facility=BBED
--dbv验证为坏块
[oracle@xifenfei ~]$ dbv file=/u01/oracle/oradata/ora11g/users01.dbf
DBVERIFY: Release 11.2.0.3.0 - Production on Fri Jan 18 03:32:18 2013
Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
DBVERIFY - Verification starting : FILE = /u01/oracle/oradata/ora11g/users01.dbf
Page 377 is influx - most likely media corrupt
Corrupt block relative dba: 0x01000179 (file 4, block 377)
Fractured block found during dbv:
Data in bad block:
 type: 33 format: 2 rdba: 0x01000179
 last change scn: 0x0b8c.3bfc6517 seq: 0x18 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x66172118
 check value in block header: 0x9d8c
 computed block checksum: 0x0
DBVERIFY - Verification complete
Total Pages Examined         : 2560
Total Pages Processed (Data) : 1434
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 10
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 213
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 902
Total Pages Marked Corrupt   : 1
Total Pages Influx           : 1
Total Pages Encrypted        : 0
Highest block SCN            : 1006486374 (2956.1006486374)
--查询表记录
SQL> select skip_corrupt from dba_tables where table_name='T_XIFENFEI' AND OWNER='CHF';
SKIP_COR
--------
DISABLED
SQL> select count(*) from chf.t_xifenfei;
  COUNT(*)
----------
     74663

测试block 376

BBED> m /x 04201766
 File: /u01/oracle/oradata/ora11g/users01.dbf (0)
 Block: 376              Offsets: 8188 to 8191           Dba:0x00000000
------------------------------------------------------------------------
 04201766
 <32 bytes per line>
BBED> sum apply
Check value for File 0, Block 376:
current = 0xcbb3, required = 0xcbb3
BBED> verify
DBVERIFY - Verification starting
FILE = /u01/oracle/oradata/ora11g/users01.dbf
BLOCK = 376
Block 376 is corrupt
Corrupt block relative dba: 0x01000178 (file 0, block 376)
Fractured block found during verification
Data in bad block:
 type: 32 format: 2 rdba: 0x01000178
 last change scn: 0x0b8c.3bfc6517 seq: 0x4 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x66172004
 check value in block header: 0xcbb3
 computed block checksum: 0x0
DBVERIFY - Verification complete
Total Blocks Examined         : 1
Total Blocks Processed (Data) : 0
Total Blocks Failing   (Data) : 0
Total Blocks Processed (Index): 0
Total Blocks Failing   (Index): 0
Total Blocks Empty            : 0
Total Blocks Marked Corrupt   : 1
Total Blocks Influx           : 2
Message 531 not found;  product=RDBMS; facility=BBED
SQL>  select count(*) from chf.t_xifenfei;
  COUNT(*)
----------
     74663

通过测试证明,如果是事务数据块出现坏块可以通过dbms_repair.skip_corrupt_blocks来标记坏块(也可以使用event 10231)来实现;对于BITMAP BLOCK如果出现坏块不会对于读取数据无影响(至于其他操作,请见table中各种坏块对select/dml操作影响);SEGMENT HEADER如果出现坏块,无法通过跳过坏块来实现获取其他数据(正常block)

操作系统级别做systemstate

在有些时候,我们需要通过systemstate或者hanganalyze来收集数据库hang信息,但是当我们的数据库不能登录的时候,无法通过常规手段来使用该方法来收集信息。ORACLE的开发者也考虑到了类似情况,让我们可以通过dbx或者gdb来收集数据库信息.这里我通过在linux平台上模拟一个新会话,然后通过gdb直接对该会话进行做systemstate,然后操作该会话(证明gdb操作后正常),来实现类似系统hang住,无法登陆的时候怎么做systemstate.
模拟会话
就是假设我们hang住的系统中的数据库进程(为了后面gdb和验证对会话影响而使用,不然可以直接使用oracle 后台进程来完成该操作)

[oracle@xifenfei ~]$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.3.0 Production on Thu Jan 17 18:28:30 2013
Copyright (c) 1982, 2011, Oracle.  All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
SQL>

查找数据库进程
在另外一个shell窗口找出数据库进程

[root@xifenfei trace]# ps -ef|grep LOCAL
oracle    7476  7473  0 18:28 ?        00:00:00 oracleora11g (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
root      7487  7326  0 18:28 pts/2    00:00:00 grep LOCAL

gdb做systemstate

[oracle@xifenfei trace]$ gdb $ORACLE_HOME/bin/oracle 7476
GNU gdb (GDB) Red Hat Enterprise Linux (7.0.1-37.el5)
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "i386-redhat-linux-gnu".
…………
Reading symbols from /u01/oracle/oracle/product/11.2.0/db_1/lib/libnque11.so...
Loaded symbols for /u01/oracle/oracle/product/11.2.0/db_1/lib/libnque11.so
0xb789c424 in __kernel_vsyscall ()
(gdb)  print ksudss(10)  <--输入print ksudss(10)
$1 = 0
(gdb) quit
A debugging session is active.
        Inferior 1 [process 7476] will be detached.
Quit anyway? (y or n) y
Detaching from program: /u01/oracle/oracle/product/11.2.0/db_1/bin/oracle, process 7476

查看trace文件

[oracle@xifenfei trace]$ ls -l *7476*.trc
-rw-r----- 1 oracle oinstall 742438 Jan 17 18:28 ora11g_ora_7476.trc
[oracle@xifenfei trace]$ more ora11g_ora_7476.trc
Trace file /u01/oracle/diag/rdbms/ora11g/ora11g/trace/ora11g_ora_7476.trc
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
ORACLE_HOME = /u01/oracle/oracle/product/11.2.0/db_1
System name:    Linux
Node name:      xifenfei
Release:        2.6.32-200.13.1.el5uek
Version:        #1 SMP Wed Jul 27 20:21:26 EDT 2011
Machine:        i686
VM name:        VMWare Version: 6
Instance name: ora11g
Redo thread mounted by this instance: 1
Oracle process number: 21
Unix process pid: 7476, image: oracle@xifenfei (TNS V1-V3)
*** 2013-01-17 18:28:59.225
*** SESSION ID:(143.23) 2013-01-17 18:28:59.225
*** CLIENT ID:() 2013-01-17 18:28:59.225
*** SERVICE NAME:(SYS$USERS) 2013-01-17 18:28:59.225
*** MODULE NAME:(sqlplus@xifenfei (TNS V1-V3)) 2013-01-17 18:28:59.225
*** ACTION NAME:() 2013-01-17 18:28:59.225
===================================================
SYSTEM STATE (level=10)  <---systemstate levle 10
------------
System global information:
     processes: base 0x32bc5b38, size 150, cleanup 0x32bd5990
     allocation: free sessions 0x32085b84, free calls (nil)
     control alloc errors: 0 (process), 0 (session), 0 (call)
     PMON latch cleanup depth: 0
     seconds since PMON's last scan for dead processes: 693
     system statistics:

这里证明使用gdb–>print ksudss(10)对数据库做的是systemstate level 10

验证做gdb的进程

SQL> select * from dual;
D
-
X
SQL> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

证明被gdb处理过的进程还是正常(未收到额外破坏),所以我们可以确定在系统hang住,而且新会话不能正常建立连接的时候,我们可以尝试着使用ksudss(10)来对系统做systemstate然后分析

aix中procmap 查看oracle进程占用系统内存

procmap是用来显示进程地址空间,通过这个命令找出来的“read/write”表示为进程的私有内存,如果对应到oracle 进程的LOCAL中来,也就是对应了是oracle 会话进程占用的操作系统内存,和sga与pga无关,即ORACLE数据库进程占用的额外的系统内存,在计算oracle数据库消耗内存的时候,要考虑sga+pga+process占用的内存
procmap命令使用

$procmap 7931354
7931354 : oracleccicdx (LOCAL=NO)
100000000            95504K  read/exec         oracle
110000035             2399K  read/write        oracle
9fffffff0000000         51K  read/exec         /usr/ccs/bin/usla64
9fffffff000cfe2          0K  read/write        /usr/ccs/bin/usla64
900000000b05930          2K  read/exec         /usr/lib/libC.a[shr3_64.o]
9001000a0122930          0K  read/write        /usr/lib/libC.a[shr3_64.o]
900000000ae6b00        118K  read/exec         /usr/lib/libC.a[shrcore_64.o]
9001000a030a100         12K  read/write        /usr/lib/libC.a[shrcore_64.o]
900000000ac8000        118K  read/exec         /usr/lib/libC.a[ansicore_64.o]
9001000a0300e00         36K  read/write        /usr/lib/libC.a[ansicore_64.o]
900000000411468          0K  read/exec         /usr/lib/libicudata.a[shr_64.o]
9001000a0121468          0K  read/write        /usr/lib/libicudata.a[shr_64.o]
90000000040f738          2K  read/exec         /usr/lib/libC.a[shr2_64.o]
9001000a0314738          0K  read/write        /usr/lib/libC.a[shr2_64.o]
9000000008dd800       1699K  read/exec         /usr/lib/libC.a[ansi_64.o]
9001000a0315a00        277K  read/write        /usr/lib/libC.a[ansi_64.o]
9000000008bab00        135K  read/exec         /usr/lib/libC.a[shr_64.o]
9001000a030eb00         19K  read/write        /usr/lib/libC.a[shr_64.o]
900000000708180       1732K  read/exec         /usr/lib/libicuuc.a[shr_64.o]
9001000a035cdac        180K  read/write        /usr/lib/libicuuc.a[shr_64.o]
900000000493d80       2510K  read/exec         /usr/lib/libicui18n.a[shr_64.o]
9001000a038a148        270K  read/write        /usr/lib/libicui18n.a[shr_64.o]
900000000473200         91K  read/exec         /usr/lib/libsrc.a[shr_64.o]
9001000a01127a8         55K  read/write        /usr/lib/libsrc.a[shr_64.o]
90000000045a300         98K  read/exec         /usr/lib/libcorcfg.a[shr_64.o]
9001000a04147c8         18K  read/write        /usr/lib/libcorcfg.a[shr_64.o]
900000000b16200        750K  read/exec         /usr/lib/liblvm.a[shr_64.o]
9001000a03dd028        219K  read/write        /usr/lib/liblvm.a[shr_64.o]
900000000444f00         82K  read/exec         /usr/lib/libcfg.a[shr_64.o]
9001000a03d58f0         26K  read/write        /usr/lib/libcfg.a[shr_64.o]
90000000040e3a0          2K  read/exec         /usr/lib/libcrypt.a[shr_64.o]
9001000a0106948          0K  read/write        /usr/lib/libcrypt.a[shr_64.o]
90000001615d860          5K  read/exec         /usr/lib/libc.a[aio_64.o]
9001000a3aed568          0K  read/write        /usr/lib/libc.a[aio_64.o]
9000000003efc00        120K  read/exec         /usr/lib/libodm.a[shr_64.o]
9001000a0107cc8         40K  read/write        /usr/lib/libodm.a[shr_64.o]
900000000bd2c80        147K  read/exec         /usr/lib/libperfstat.a[shr_64.o]
9001000a041a960         14K  read/write        /usr/lib/libperfstat.a[shr_64.o]
9000000017d7000          0K  read/exec         /usr/lib/libdl.a[shr_64.o]
9001000a0517000          0K  read/write        /usr/lib/libdl.a[shr_64.o]
9000000158ed100       8636K  read/exec         /oracle/product/db10gr2/lib/libjox10.a[shr.o]
8001000a0000b78        587K  read/write        /oracle/product/db10gr2/lib/libjox10.a[shr.o]
900000000a87000        257K  read/exec         /usr/lib/libpthreads.a[shr_xpg5_64.o]
9001000a0274000        559K  read/write        /usr/lib/libpthreads.a[shr_xpg5_64.o]
900000000000800       4025K  read/exec         /usr/lib/libc.a[shr_64.o]
9001000a0000020       1047K  read/write        /usr/lib/libc.a[shr_64.o]
         Total      121863K

简化命令,统计私有内存,procmap 7931354|grep “read/write” |awk -F ” ” ‘{print $2}’,通过相关计算的出来,在当前的操作系统和数据库版本中,一个LOCAL=NO进程占用系统内存为:5758KB

补充说明
1.操作系统版本

$oslevel -r
6100-06

2.数据库版本

SQL> select * from v$version;
BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bi
PL/SQL Release 10.2.0.4.0 - Production
CORE    10.2.0.4.0      Production
TNS for IBM/AIX RISC System/6000: Version 10.2.0.4.0 - Productio
NLSRTL Version 10.2.0.4.0 - Production

3.通过跟踪多个LOCAL=NO进程,发现类似进程占用的系统内存相同,估算给系统oracle进程占用的内存,可以通过该值进行大概估算
4.确认ORACLE使用的内存量不是以往认识的sga+pga,实际上应该是sga+pga+所有oracle进程占用
5.在linux中使用pmap来查看

设置_smu_debug_mode实现指定session级别使用特定回滚段

通过设置_smu_debug_mode值来实现指定session级别使用特定的回滚段
_smu_debug_mode为默认值

--测试数据库版本
SQL> select * from v$version;
BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
PL/SQL Release 11.2.0.3.0 - Production
CORE    11.2.0.3.0      Production
TNS for Solaris: Version 11.2.0.3.0 - Production
NLSRTL Version 11.2.0.3.0 - Production
--_smu_debug_mode值
SQL> select a.ksppinm name,b.ksppstvl value
  2    from x$ksppi a,x$ksppcv b
  3   where a.inst_id = USERENV ('Instance')
  4     and b.inst_id = USERENV ('Instance')
  5     and a.indx = b.indx
  6     and upper(a.ksppinm) LIKE upper('%&param%')
  7  order by name;
Enter value for param: _smu_debug_mode
old   6:    and upper(a.ksppinm) LIKE upper('%&param%')
new   6:    and upper(a.ksppinm) LIKE upper('%_smu_debug_mode%')
NAME                             VALUE
-------------------------------- ------------------------
_smu_debug_mode                  0
--undo管理模式
SQL> show parameter undo;
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
undo_management                      string      AUTO
undo_retention                       integer     900
undo_tablespace                      string      UNDOTBS1
--指定回滚段(查询dba_rollback_segs得到回滚段名称)
SQL>  set transaction use rollback segment "_SYSSMU7_1887299474$";
Transaction set.
SQL> delete from t where rownum<10;
9 rows deleted.
--查询使用回滚段
SQL> select XIDUSN from V$TRANSACTION;
    XIDUSN
----------
         9

这里可以看到在undo自动管理模式下,我们手工指定了回滚段但是被数据库给忽略,还是使用了系统自动分配的回滚段。例如这里我指定的回滚段7,但是使用了系统自动分配的回滚段9

_smu_debug_mode=45

SQL> select * from v$version;
BANNER
-------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - Production
PL/SQL Release 11.2.0.3.0 - Production
CORE    11.2.0.3.0      Production
TNS for 32-bit Windows: Version 11.2.0.3.0 - Production
NLSRTL Version 11.2.0.3.0 - Production
SQL> show parameter undo;
NAME                                 TYPE                VALUE
------------------------------------ -------    -----------------
undo_management                      string       AUTO
undo_retention                       integer      900
undo_tablespace                      string       undo_new
SQL> select a.ksppinm name,b.ksppstvl value
  2    from x$ksppi a,x$ksppcv b
  3   where a.inst_id = USERENV ('Instance')
  4     and b.inst_id = USERENV ('Instance')
  5     and a.indx = b.indx
  6     and upper(a.ksppinm) LIKE upper('%&param%')
  7  order by name;
Enter value for param: _smu_debug_mode
old   6:    and upper(a.ksppinm) LIKE upper('%&param%')
new   6:    and upper(a.ksppinm) LIKE upper('%_smu_debug_mode%')
NAME                             VALUE
-------------------------------- ------------------------
_smu_debug_mode                  45
/*
使用alter system set "_smu_debug_mode" = 45;配置
注意:该参数只能在system级别配置
*/
--测试表
SQL> create table t_xifenfei
  2  as
  3  select * from dba_objects;
Table created.
--指定回滚段
SQL> set transaction use rollback segment "_SYSSMU15_1680736333$";
Transaction set.
SQL> delete from t_xifenfei where rownum<10;
9 rows deleted.
--查询事务回滚段
SQL> select XIDUSN from V$TRANSACTION;
    XIDUSN
----------
        15
SQL> commit;
Commit complete.
--再次指定回滚段
SQL> set transaction use rollback segment "_SYSSMU17_527554872$";
Transaction set.
SQL> delete from t_xifenfei where rownum<10;
9 rows deleted.
--查询事务回滚段
SQL> select XIDUSN from V$TRANSACTION;
    XIDUSN
----------
        17

这里可以看出来通过设置”_smu_debug_mode” = 45可以很好的实现在undo自动管理模式下,指定事务在特定的回滚段,在某些极限情况下,可以通过该操作来减少回滚段争用.

asmlib异常报ORA-00600[kfklLibFetchNext00]

一个朋友的历史库出现故障,在linux 4的平台上asm的10.2.0.1的单库,asm使用asmlib来处理。
asm不能正常mount磁盘组,可以看到asmdisk,alert日志报ORA-00600[kfklLibFetchNext00]
操作系统内核是:2.6.9-78
oracleasmlib是:2.0.2-1
asm磁盘组mount失败

--以前故障
SQL> ALTER DISKGROUP ALL MOUNT
Thu Sep  6 14:23:16 2012
NOTE: cache registered group DGARC number=1 incarn=0x2bf96274
NOTE: cache registered group DGDATA number=2 incarn=0x2c196275
NOTE: cache registered group DGSYS number=3 incarn=0x2c196276
Thu Sep  6 14:23:16 2012
Errors in file /opt/app/oracle/admin/+ASM/bdump/+asm_rbal_10204.trc:
ORA-15183: ASMLIB initialization error [driver/agent not installed]
Thu Sep  6 14:23:16 2012
Errors in file /opt/app/oracle/admin/+ASM/bdump/+asm_rbal_10204.trc:
ORA-15183: ASMLIB initialization error [/opt/oracle/extapi/64/asm/orcl/1/libasm.so]
ORA-15183: ASMLIB initialization error [driver/agent not installed]
Thu Sep  6 14:23:16 2012
ERROR: no PST quorum in group 1: required 2, found 0
Thu Sep  6 14:23:16 2012
NOTE: cache dismounting group 1/0x2BF96274 (DGARC)
NOTE: dbwr not being msg'd to dismount
ERROR: diskgroup DGARC was not mounted
Thu Sep  6 14:23:16 2012
ERROR: no PST quorum in group 2: required 2, found 0
Thu Sep  6 14:23:16 2012
NOTE: cache dismounting group 2/0x2C196275 (DGDATA)
NOTE: dbwr not being msg'd to dismount
ERROR: diskgroup DGDATA was not mounted
Thu Sep  6 14:23:16 2012
ERROR: no PST quorum in group 3: required 2, found 0
Thu Sep  6 14:23:16 2012
NOTE: cache dismounting group 3/0x2C196276 (DGSYS)
NOTE: dbwr not being msg'd to dismount
ERROR: diskgroup DGSYS was not mounted
--现在故障
Thu Jan 24 13:49:45 2013
SQL> ALTER DISKGROUP ALL MOUNT
Thu Jan 24 13:49:45 2013
NOTE: cache registered group DGARC number=1 incarn=0xf388cee9
NOTE: cache registered group DGDATA number=2 incarn=0xf3a8ceea
NOTE: cache registered group DGSYS number=3 incarn=0xf3a8ceeb
Thu Jan 24 13:49:45 2013
Errors in file /opt/app/oracle/admin/+ASM/bdump/+asm_rbal_13449.trc:
ORA-00600: internal error code, arguments: [kfklLibFetchNext00],
[18446744073709551614], [0], [], [], [], [], []
Thu Jan 24 13:49:46 2013
Errors in file /opt/app/oracle/admin/+ASM/bdump/+asm_rbal_13449.trc:
ORA-00600: internal error code, arguments: [kfklLibFetchNext00],
[18446744073709551614], [0], [], [], [], [], []
Thu Jan 24 13:49:46 2013
ERROR: no PST quorum in group 1: required 2, found 0
Thu Jan 24 13:49:46 2013
NOTE: cache dismounting group 1/0xF388CEE9 (DGARC)
NOTE: dbwr not being msg'd to dismount
ERROR: diskgroup DGARC was not mounted
Thu Jan 24 13:49:46 2013
ERROR: no PST quorum in group 2: required 2, found 0
Thu Jan 24 13:49:46 2013
NOTE: cache dismounting group 2/0xF3A8CEEA (DGDATA)
NOTE: dbwr not being msg'd to dismount
ERROR: diskgroup DGDATA was not mounted
Thu Jan 24 13:49:46 2013
ERROR: no PST quorum in group 3: required 2, found 0
Thu Jan 24 13:49:46 2013
NOTE: cache dismounting group 3/0xF3A8CEEB (DGSYS)
NOTE: dbwr not being msg'd to dismount
ERROR: diskgroup DGSYS was not mounted
Shutting down instance: further logons disabled

trace文件信息

----- Call Stack Trace -----
calling              call     entry                argument values in hex
location             type     point                (? means dubious value)
-------------------- -------- -------------------- ----------------------------
ksedst()+31          call     ksedst1()            000000000 ? 000000001 ?
                                                   000000000 ? 000000000 ?
                                                   000000000 ? 000000001 ?
ksedmp()+610         call     ksedst()             000000000 ? 000000001 ?
                                                   000000000 ? 000000000 ?
                                                   000000000 ? 000000001 ?
ksfdmp()+21          call     ksedmp()             000000003 ? 000000001 ?
                                                   000000000 ? 000000000 ?
                                                   000000000 ? 000000001 ?
kgerinv()+161        call     ksfdmp()             000000003 ? 000000001 ?
                                                   000000000 ? 000000000 ?
                                                   000000000 ? 000000001 ?
kgesinv()+33         call     kgerinv()            006469D40 ? 0064E1C58 ?
                                                   000000000 ? 000000000 ?
                                                   000000001 ? 000000001 ?
kgesinw()+166        call     kgesinv()            006469D40 ? 0064E1C58 ?
                                                   000000000 ? 000000000 ?
                                                   000000001 ? 000000001 ?
kfklLibScanNext()+2  call     kgesinw()            006469D40 ? 000000000 ?
39                                                 000000001 ? 000000000 ?
                                                   FFFFFFFFFFFFFFFE ?
                                                   000000000 ?
kfkLibFetchNext()+3  call     kfklLibScanNext()    0064DDD70 ? 7FBFFFDCD0 ?
43                                                 000000001 ? 000000000 ?
                                                   FFFFFFFFFFFFFFFE ?
                                                   000000000 ?
kfuitrnInit()+524    call     kfkLibFetchNext()    006469D40 ? 2A971DFF90 ?
                                                   000000001 ? 000000000 ?
                                                   FFFFFFFFFFFFFFFE ?
                                                   000000000 ?
kfkLibIterInit()+18  call     kfuitrnInit()        006469D40 ? 2A971DFCB0 ?
0                                                  2A971DFF90 ? 000000009 ?
                                                   000000009 ? 000000000 ?
kfkLoadAllLibs()+36  call     kfkLibIterInit()     000000000 ? 00646C7E0 ?
3                                                  2A971DFF90 ? 000000009 ?
                                                   000000009 ? 000000000 ?
kfkDiscoverString()  call     kfkLoadAllLibs()     000000000 ? 00646C7E0 ?
+107                                               2A971DFF90 ? 000000009 ?
                                                   000000009 ? 000000000 ?
Cannot find symbol
Cannot find symbol
Cannot find symbol
kfdDiscoverString()  call     kfkDiscoverString()  067A53768 ? 00646C7E0 ?
+28                                                2A971DFF90 ? 000000009 ?
                                                   000000009 ? 000000000 ?
kfdDiscoverShallow(  call     kfdDiscoverString()  067A53768 ? 000000000 ?
)+315                                              2A971DFF90 ? 000000009 ?
                                                   000000009 ? 000000000 ?
kfgbDriver()+1174    call     kfdDiscoverShallow(  000000180 ? 000000000 ?
                              )                    2A971DFF90 ? 000000009 ?
                                                   000000009 ? 000000000 ?
ksbabs()+564         call     kfgbDriver()         7FBFFFE5C0 ? 000000048 ?
                                                   000000000 ? 000000009 ?
                                                   000000009 ? 000000000 ?
ksbrdp()+727         call     ksbabs()             7FBFFFE5C0 ? 000000048 ?
                                                   000000000 ? 000000009 ?
                                                   000000009 ? 000000000 ?
opirip()+616         call     ksbrdp()             7FBFFFE5C0 ? 000000048 ?
                                                   000000001 ? 06002C770 ?
                                                   000000009 ? 000000000 ?
opidrv()+582         call     opirip()             000000032 ? 000000004 ?
                                                   7FBFFFF6C8 ? 06002C770 ?
                                                   000000009 ? 000000000 ?
sou2o()+114          call     opidrv()             000000032 ? 000000004 ?
                                                   7FBFFFF6C8 ? 06002C770 ?
                                                   000000009 ? 000000000 ?
opimai_real()+317    call     sou2o()              7FBFFFF6A0 ? 000000032 ?
                                                   000000004 ? 7FBFFFF6C8 ?
                                                   000000009 ? 000000000 ?
main()+116           call     opimai_real()        000000003 ? 7FBFFFF730 ?
                                                   000000004 ? 7FBFFFF6C8 ?
                                                   000000009 ? 000000000 ?
<0x3c9fb1c40b>       call     main()               000000003 ? 7FBFFFF730 ?
                                                   000000004 ? 7FBFFFF6C8 ?
                                                   000000009 ? 000000000 ?
--------------------- Binary Stack Dump ---------------------

因为客户的库是一个历史库,基本上不怎么使用,在2012年启动asm就出现了ORA-15183错误,然后在2013年重启机器后,再次启动asm就出现了ORA-00600[kfklLibFetchNext00]错误,通过2012年的错误提示,我们大概可以判断出来该问题和ASMLIB有关系,查询mos发现429945.1,发现Call Stack Trace完全一致,可以定位是该问题(如果想深入分析,可以通过strace继续分析)

ORA-600: [kfklLibFetchNext00], [18446744073709551614], [0] when mounting diskgroup in ASM

Applies to:
Linux OS - Version: 2.0.1-1 and later   [Release: RHEL4 and later ]
Information in this document applies to any platform.
Linux Kernel - Version: 2.0.1
Symptoms
 3 RAC db.
2 nodes are up and functioning except for 1 node - ASM did not come back up after
the reboot eventhough all disks show available from asmlib's perspective:
Changes
 All that was done with resources were stopped on Node1 and an extra LUN added.
 A reboot was then performed.
Cause
 The cause of the issue is libasm.o corruption
Ran the following to confirm that disks are ok:
/dev/oracleasm listdisks
/usr/sbin/asmtool -I -l /dev/oracleasm -n /dev/sdg1 -a label
/usr/sbin/oracleasm-discover 'ORCL:*'
dd if=/dev/sdg1 bs=8192 count=1 | od -c
==> output checked out fine
.
kfod asm_diskstring='ORCL:*'
==> this failed on Node1
KFOD-00600: file not found; argument [610][kfklLibFetchNext00] even though libasm.o exists
You might see the following call stack as well
----- Call Stack Trace -----
kfklLibScanNext
kfkLibFetchNext
kfuitrnInit
kfkLibIterInit
kfkLoadAllLibs
kfkDiscoverString
kfdDiscoverString
kfdDiscoverShallow
kfgbDriver
strace showed
 Node1-failing
-------
stat("/opt/oracle/extapi/64/asm/orcl/1/libasm.so", {st_mode=S_IFREG|0777, st_size=19344, ...}) = 0
 getdents64(4, /* 0 entries */, 4096) = 0 <<<<
 close(4) = 0
 open("/opt/oracle/product/10.2.0/db_1/rdbms/mesg/kfodus.msb", O_RDONLY) = -1
 ENOENT (No such file or directory)
 open("/opt/oracle/product/10.2.0/db_1/rdbms/mesg/kfodus.msb", O_RDONLY) = -1
 ENOENT (No such file or directory)
 fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 2), ...}) = 0
 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2a9750d000
write(1, "KFOD-00600: file not found; argu"..., 69) = 69
Node2-working
 -----
 stat("/opt/oracle/extapi/64/asm/orcl/1/libasm.so", {st_mode=S_IFREG|0755, st_size=19344, ...}) = 0
 open("/opt/oracle/extapi/64/asm/orcl/1/libasm.so", O_RDONLY) = 4
read(4, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\20\23\0"..., 832) = 832
fstat(4, {st_mode=S_IFREG|0755, st_size=19344, ...}) = 0
mmap(NULL, 1066104, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 4, 0) 0x2a9750d000

通过MOS的描述,可以明确定位到问题是:libasm.o异常导致

解决方案

To implement the solution, reinstall the ASMlib RPM
>rpm -Uvh oracleasmlib-2.0.0-1
This replaces the /opt/oracle/extapi/64/asm/orcl/1/libasm.so

aix使用太多内存导致shared pool 相关latch异常

某客户有一服务器,shared pool 相关latch出现异常等待,影响系统性能.分析结果:因为系统空闲内存太少,使用太多Paging Space导致该异常;解决办法:1.增加内存,2.在业务接受范围内减小sga等其他和内存消耗相关参数
nmon查看剩余内存

x          Physical  PageSpace |        pages/sec  In     Out | FileSystemCache
x% Used       99.8%     34.9%  | to Paging Space   0.0    0.0 | (numperm) 14.8%
x% Free        0.2%     65.1%  | to File System    0.0   33.0 | Process   63.9%
xMB Used   21452.8MB 11446.1MB | Page Scans        0.0        | System    21.1%
xMB Free (少)-->51.2MB 21321.9MB | Page Cycles       0.0      | Free       0.2%
xTotal(MB) 21504.0MB 32768.0MB | Page Steals       0.0        |           ------

topas查看内存配置

Disk    Busy%     KBPS     TPS KB-Read KB-Writ                   MEMORY
Topas Monitor for host:    p570b03              EVENTS/QUEUES    FILE/TTY
Wed Jan  9 13:30:30 2013   Interval:  2         Cswitch     785  Readch   173.1K
                                                Syscall   54407  Writech  213.1K
CPU  User%  Kern%  Wait%  Idle%                 Reads       118  Rawin         0
ALL   43.6    1.7    0.0   54.8                 Writes      110  Ttyout      352
                                                Forks         0  Igets         0
Network  KBPS   I-Pack  O-Pack   KB-In  KB-Out  Execs         0  Namei         5
Total    84.5    146.0   200.6    26.4    58.1  Runqueue    0.5  Dirblk        0
                                                Waitqueue   0.0
Disk    Busy%     KBPS     TPS KB-Read KB-Writ                   MEMORY
Total     0.0    164.6    17.0     0.0   164.6  PAGING           Real,MB   21504
                                                Faults    12408  % Comp     86 <---大部分计算内存
FileSystem        KBPS     TPS KB-Read KB-Writ  Steals        0  % Noncomp  13 <---fs cache较少
Total            316.3    17.9  151.5  164.9    PgspIn        0  % Client   13
                                                PgspOut       0
Name            PID  CPU%  PgSp Owner           PageIn        0  PAGING SPACE
oracle      6357252  16.7   8.4 oracle          PageOut      42  Size,MB   32768
oracle     15401474   8.3  70.8 oracle          Sios         42  % Used     35  <---使用比较多
oracle     12714542   8.3   8.3 oracle                           % Free     65
oracle      5767556   8.3   8.3 oracle          NFS (calls/sec)
oracle      5898996   8.3 134.9 oracle          SerV2         0  WPAR Activ    0
oracle     17629634   8.3 134.9 oracle          CliV2         0  WPAR Total    0
oracle     13959694   0.0   8.4 oracle          SerV3         0  Press: "h"-help
oracle      5439860   0.0 134.3 oracle          CliV3         0         "q"-quit

内存参数配置

vmo -F -a
--数据库相关参数
minperm% = 3
v_pinshm = 0
lru_file_repage = 0
maxclient% = 90
maxperm% = 90
strict_maxclient = 1
strict_maxperm = 0
page_steal_method = 1

因为是AIX 6.1,这里的vmo配置基本上是oracle 推荐值(大页没有配置,非必须选项)

会话进程占用内存

procmap 15466998
15466998 : oraclewasudb (LOCAL=NO)
100000000            97466K  read/exec         oracle
11000088d             2430K  read/write        oracle
9fffffff0000000         51K  read/exec         /usr/ccs/bin/usla64
9fffffff000cfe2          0K  read/write        /usr/ccs/bin/usla64
900000000b14930          2K  read/exec         /usr/lib/libC.a[shr3_64.o]
9001000a0122930          0K  read/write        /usr/lib/libC.a[shr3_64.o]
900000000af5b00        118K  read/exec         /usr/lib/libC.a[shrcore_64.o]
9001000a0319100         12K  read/write        /usr/lib/libC.a[shrcore_64.o]
900000000ad7000        118K  read/exec         /usr/lib/libC.a[ansicore_64.o]
9001000a030fe00         36K  read/write        /usr/lib/libC.a[ansicore_64.o]
900000000411468          0K  read/exec         /usr/lib/libicudata.a[shr_64.o]
9001000a0121468          0K  read/write        /usr/lib/libicudata.a[shr_64.o]
90000000040f738          2K  read/exec         /usr/lib/libC.a[shr2_64.o]
9001000a0323738          0K  read/write        /usr/lib/libC.a[shr2_64.o]
9000000008ec800       1699K  read/exec         /usr/lib/libC.a[ansi_64.o]
9001000a0324a00        277K  read/write        /usr/lib/libC.a[ansi_64.o]
9000000008c9b00        135K  read/exec         /usr/lib/libC.a[shr_64.o]
9001000a031db00         19K  read/write        /usr/lib/libC.a[shr_64.o]
900000000708180       1732K  read/exec         /usr/lib/libicuuc.a[shr_64.o]
9001000a036bdac        180K  read/write        /usr/lib/libicuuc.a[shr_64.o]
900000000493d80       2510K  read/exec         /usr/lib/libicui18n.a[shr_64.o]
9001000a0399148        270K  read/write        /usr/lib/libicui18n.a[shr_64.o]
900000000473200         91K  read/exec         /usr/lib/libsrc.a[shr_64.o]
9001000a01127a8         55K  read/write        /usr/lib/libsrc.a[shr_64.o]
90000000045a300         98K  read/exec         /usr/lib/libcorcfg.a[shr_64.o]
9001000a04147c8         18K  read/write        /usr/lib/libcorcfg.a[shr_64.o]
900000000b16200        750K  read/exec         /usr/lib/liblvm.a[shr_64.o]
9001000a03dd028        219K  read/write        /usr/lib/liblvm.a[shr_64.o]
900000000444f00         82K  read/exec         /usr/lib/libcfg.a[shr_64.o]
9001000a027b8f0         26K  read/write        /usr/lib/libcfg.a[shr_64.o]
90000000040e3a0          2K  read/exec         /usr/lib/libcrypt.a[shr_64.o]
9001000a0106948          0K  read/write        /usr/lib/libcrypt.a[shr_64.o]
90000000233c860          5K  read/exec         /usr/lib/libc.a[aio_64.o]
9001000a0437568          0K  read/write        /usr/lib/libc.a[aio_64.o]
9000000003efc00        120K  read/exec         /usr/lib/libodm.a[shr_64.o]
9001000a0107cc8         40K  read/write        /usr/lib/libodm.a[shr_64.o]
900000000bd2c80        147K  read/exec         /usr/lib/libperfstat.a[shr_64.o]
9001000a041a960         14K  read/write        /usr/lib/libperfstat.a[shr_64.o]
900000000bf8000          0K  read/exec         /usr/lib/libdl.a[shr_64.o]
9001000a041f000          0K  read/write        /usr/lib/libdl.a[shr_64.o]
9000000024ac100       8680K  read/exec         /oracle/product/10g/lib/libjox10.a[shr.o]
8001000a0000ca0        588K  read/write        /oracle/product/10g/lib/libjox10.a[shr.o]
900000000a96000        257K  read/exec         /usr/lib/libpthreads.a[shr_xpg5_64.o]
9001000a0283000        559K  read/write        /usr/lib/libpthreads.a[shr_xpg5_64.o]
900000000000800       4025K  read/exec         /usr/lib/libc.a[shr_64.o]
9001000a0000020       1047K  read/write        /usr/lib/libc.a[shr_64.o]
         Total      123902K

在上表中,标记为read/write的内存即是进程的私有内存,每个会话大概占用内存近6M,数据库大概有80多个会话,占用内存大概,占用内存大概500M左右.

数据库参数配置

SQL> select sum(PGA_ALLOC_MEM)/1024/1024/1024,count(*) from v$process;
SUM(PGA_ALLOC_MEM)/1024/1024/1024   COUNT(*)
--------------------------------- ----------
                       2.46758329         84
SQL> show parameter pga;
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
_pga_max_size                        big integer 500M
pga_aggregate_target                 big integer 2000M
SQL> show parameter sga
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
lock_sga                             boolean     FALSE
pre_page_sga                         boolean     FALSE
sga_max_size                         big integer 12000M
sga_target                           big integer 10000M
SQL> show sga;
Total System Global Area 1.2583E+10 bytes
Fixed Size                  2117744 bytes
Variable Size            7600082832 bytes
Database Buffers         4966055936 bytes
Redo Buffers               14655488 bytes

当前系统整体ORACLE使用内存汇总:sga 12G+pga 2.5G+process 0.5G,大概占用内存15G,留给系统内存6G左右,系统使用大量交换分区,导致系统性能下降,最明显的为:shared pool相关latch等待异常,具体awr为:


因为系统因为个别session需要大量内存设置_pga_max_size参数,导致部分会话系统占用2.5g内存,建议设置该参数为默认值,并对个别会话独立设置,设置pga_aggregate_target=1.5G,sga_target=sga_max_size=8.5G,awr结果为:

SQL TUNING导致ORA-07445[qsmmixComputeClusteringFactor()+386]

在11.2.0.2版本中sql tuning可能导致ORA-07445[qsmmixComputeClusteringFactor()+386]错误
系统版本和平台信息

Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
ORACLE_HOME = /oracle/product/11.2.0/dbhome_1
System name:	Linux
Node name:	FPMS01DB
Release:	2.6.18-238.el5
Version:	#1 SMP Sun Dec 19 14:22:44 EST 2010
Machine:	x86_64
VM name:	VMWare Version: 6
Instance name: tis
Redo thread mounted by this instance: 1
Oracle process number: 41
Unix process pid: 16822, image: oracle@FPMS01DB (J003)

alert日志信息

Thu Jan 10 22:00:02 2013
Begin automatic SQL Tuning Advisor run for special tuning task  "SYS_AUTO_SQL_TUNING_TASK"
Thu Jan 10 22:02:39 2013
Exception [type: SIGSEGV, Address not mapped to object] [ADDR:0x4]
[PC:0x597FEB4, qsmmixComputeClusteringFactor()+386] [flags: 0x0, count: 1]
Errors in file /oracle/diag/rdbms/tis/tis/trace/tis_j001_11073.trc  (incident=80033):
ORA-07445: exception encountered: core dump [qsmmixComputeClusteringFactor()+386]
[SIGSEGV] [ADDR:0x4] [PC:0x597FEB4] [Address not mapped to object] []
Incident details in: /oracle/diag/rdbms/tis/tis/incident/incdir_80033/tis_j001_11073_i80033.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Thu Jan 10 22:02:51 2013
Dumping diagnostic data in directory=[cdmp_20130110220251],
requested by (instance=1, osid=11073 (J001)), summary=[incident=80033].

通过这里可以看出来,出现ORA-07445错误的原因很可能和SQL TUNING相关

trace文件信息

Dump continued from file: /oracle/diag/rdbms/tis/tis/trace/tis_j003_16822.trc
ORA-07445: exception encountered: core dump [qsmmixComputeClusteringFactor()+386]
[SIGSEGV] [ADDR:0x4] [PC:0x597FEB4] [Address not mapped to object] []
========= Dump for incident 80008 (ORA 7445 [qsmmixComputeClusteringFactor()+386]) ========
----- Beginning of Customized Incident Dump(s) -----
Exception [type: SIGSEGV, Address not mapped to object] [ADDR:0x4]
[PC:0x597FEB4, qsmmixComputeClusteringFactor()+386] [flags: 0x0, count: 1]
Registers:
%rax: 0x00002b0b684593d0 %rbx: 0x0000000000000003 %rcx: 0x0000000000000000
%rdx: 0x00002b0b684593d0 %rdi: 0x00007fff758ceff8 %rsi: 0x0000000000000000
%rsp: 0x00007fff758ced00 %rbp: 0x00007fff758cf5d0  %r8: 0x00002b0b684593a8
 %r9: 0x00002b0b683c8e60 %r10: 0x0000000000000017 %r11: 0x0000000000000000
%r12: 0x00002b0b684197a0 %r13: 0x00002b0b68419928 %r14: 0x00002b0b66ad2430
%r15: 0x00002b0b66d95590 %rip: 0x000000000597feb4 %efl: 0x0000000000010202
  qsmmixComputeClusteringFactor()+366 (0x597fea0) call 0x597fc62
  qsmmixComputeClusteringFactor()+371 (0x597fea5) lea -0x5d8(%rbp),%rdi
  qsmmixComputeClusteringFactor()+378 (0x597feac) mov 0x60(%rax),%rsi
  qsmmixComputeClusteringFactor()+382 (0x597feb0) mov 0x60(%rax),%rcx
> qsmmixComputeClusteringFactor()+386 (0x597feb4) movzwl 0x4(%rcx),%edx
  qsmmixComputeClusteringFactor()+390 (0x597feb8) add $6,%rsi
  qsmmixComputeClusteringFactor()+394 (0x597febc) mov %rax,-0x20(%rbp)
  qsmmixComputeClusteringFactor()+398 (0x597fec0) call 0xa056f0
  qsmmixComputeClusteringFactor()+403 (0x597fec5) mov -0x8(%rbp),%edx
*** 2013-01-09 22:50:24.205
dbkedDefDump(): Starting a non-incident diagnostic dump (flags=0x3, level=3, mask=0x0)
----- Current SQL Statement for this session (sql_id=durgh9p25f6bb) -----
/* SQL Analyze(788,1) */ SELECT  distinct
			                   null  as isdistribution,
			                   null as seqno,
			                   to_char(T1.SEQNO) as TRADEID
…………
from T_XIFENFEI
----- PL/SQL Stack -----
----- PL/SQL Call Stack -----
  object      line  object
  handle    number  name
0xc5b644e8     11816  package body SYS.DBMS_SQLTUNE_INTERNAL
0xc54ca918         7  SYS.WRI$_ADV_SQLTUNE
0xd55aec98       587  package body SYS.PRVT_ADVISOR
0xd55aec98      2655  package body SYS.PRVT_ADVISOR
0xc5f66c70       241  package body SYS.DBMS_ADVISOR
0xc5418dc8       821  package body SYS.DBMS_SQLTUNE
0xd50d38c0         4  anonymous block
----- Call Stack Trace -----
calling              call     entry                argument values in hex
location             type     point                (? means dubious value)
-------------------- -------- -------------------- ----------------------------
skdstdst()+36        call     kgdsdst()            000000000 ? 000000000 ?
                                                   2B0B65B6C098 ? 000000001 ?
                                                   000000001 ? 000000003 ?
ksedst1()+98         call     skdstdst()           000000000 ? 000000000 ?
                                                   2B0B65B6C098 ? 000000001 ?
                                                   000000000 ? 000000003 ?
ksedst()+34          call     ksedst1()            000000001 ? 000000001 ?
                                                   2B0B65B6C098 ? 000000001 ?
                                                   000000000 ? 000000003 ?
dbkedDefDump()+2741  call     ksedst()             000000001 ? 000000001 ?
                                                   2B0B65B6C098 ? 000000001 ?
                                                   000000000 ? 000000003 ?
ksedmp()+36          call     dbkedDefDump()       000000003 ? 000000003 ?
                                                   2B0B65B6C098 ? 000000001 ?
                                                   000000000 ? 000000003 ?
ssexhd()+2366        call     ksedmp()             000000003 ? 000000003 ?
                                                   2B0B65B6C098 ? 000000001 ?
                                                   000000000 ? 000000003 ?
__sighandler()       call     ssexhd()             00000000B ? 2B0B65B74D70 ?
                                                   2B0B65B74C68 ? 000000001 ?
                                                   000000000 ? 000000003 ?
qsmmixComputeCluste  signal   __sighandler()       7FFF758CEFF8 ? 000000000 ?
ringFactor()+386                                   2B0B684593D0 ? 000000000 ?
                                                   2B0B684593A8 ? 2B0B683C8E60 ?
qsmmixGenFakeIdxSta  call     qsmmixComputeCluste  2B0B68419928 ? 7FFF758CEFF8 ?
ts()+1025                     ringFactor()         2B0B66D95590 ?
                                                   BFF0000000000000 ?
                                                   000000000 ? 2B0B683C8E60 ?
qsmmixPopulateIdxSt  call     qsmmixGenFakeIdxSta  2B0B66AD2430 ? 2B0B68419928 ?
ats()+71                      ts()                 2B0B684197A0 ? 2B0B66B92630 ?
                                                   40BAD30000000000 ?
                                                   2B0B683C8E60 ?
qsmmixSetKkotbixt()  call     qsmmixPopulateIdxSt  2B0B66B92630 ? 2B0B66AD2430 ?
+479                          ats()                2B0B68419928 ? 2B0B66B92630 ?
                                                   40BAD30000000000 ?
                                                   2B0B683C8E60 ?
qsmmixReturnCandToO  call     qsmmixSetKkotbixt()  2B0B66B92630 ? 2B0B66AD2430 ?
pt()+656                                           2B0B683F0090 ? 2B0B6845DCC8 ?
                                                   2B0B6845C2B8 ? 2B0B68462F08 ?
qsmmixOptimizerGenI  call     qsmmixReturnCandToO  2B0B66B92630 ? 2B0B66AD2430 ?
dxCand()+601                  pt()                 2B0B68462F60 ? 2B0B6845DCC8 ?
                                                   2B0B6845C2B8 ? 2B0B68462F08 ?
qsmmixOptimizerSetu  call     qsmmixOptimizerGenI  0913F73A0 ? 2B0B6845C2B8 ?
pIdxCand()+351                dxCand()             000000009 ? 000000001 ?
                                                   2B0B6845C2B8 ? 2B0B68462F08 ?
kkoiqb()+13730       call     qsmmixOptimizerSetu  0913F73A0 ? 2B0B683EEFD8 ?
                              pIdxCand()           000000009 ? 000000001 ?
                                                   2B0B6845C2B8 ? 2B0B68462F08 ?
kkooqb()+632         call     kkoiqb()             2B0B66B6FF88 ? 000000000 ?
                                                   000000000 ? 000000001 ?
                                                   2B0B6845C2B8 ? 2B0B68462F08 ?
kkoqbc()+2359        call     kkooqb()             2B0B66B6FF88 ? 000000000 ?
                                                   000000000 ? 000000004 ?
                                                   2B0B00000000 ? 2B0B00000000 ?
apakkoqb()+166       call     kkoqbc()             7FFF758D2DF0 ? 2B0B66B6FF88 ?
                                                   000000000 ? 000000004 ?
                                                   2B0B00000000 ? 2B0B00000000 ?
apaqbdDescendents()  call     apakkoqb()           7FFF758D2DF0 ? 2B0B66B6FF88 ?
+457                                               0913F73A0 ? 000000004 ?
                                                   2B0B00000000 ? 2B0B00000000 ?
apaqbdList()+71      call     apaqbdDescendents()  7FFF758D2DF0 ? 2B0B66B6FF88 ?
                                                   0913F73A0 ? 000000004 ?
                                                   2B0B00000000 ? 2B0B00000000 ?
apaqbdDescendents()  call     apaqbdList()         7FFF758D2DF0 ? 2B0B66B6FF88 ?
+710                                               0913F73A0 ? 000000004 ?
                                                   2B0B00000000 ? 2B0B00000000 ?
apaqbdList()+71      call     apaqbdDescendents()  0913F73A0 ? 2B0B66C975C8 ?
                                                   0913F73A0 ? 000000004 ?
                                                   2B0B00000000 ? 2B0B00000000 ?
apaqbd()+9           call     apaqbdList()         0913F73A0 ? 2B0B66C975C8 ?
                                                   0913F73A0 ? 000000004 ?
                                                   2B0B00000000 ? 2B0B00000000 ?
apadrv()+861         call     apaqbd()             0913F73A0 ? 2B0B66C975C8 ?
                                                   0913F73A0 ? 000000004 ?
                                                   2B0B00000000 ? 2B0B00000000 ?
opitca()+1971        call     apadrv()             0913F73A0 ? 2B0B66C975C8 ?
                                                   0913F73A0 ? 000000004 ?
                                                   2B0B00000000 ? 2B0B00000000 ?
kksSetBindType()+76  call     opitca()             2B0B671A77F8 ? 0913F73A0 ?
05                                                 7FFF758D4D10 ? 000000004 ?
                                                   100000000 ? 2B0B00000000 ?
kksfbc()+10664       call     kksSetBindType()     7FFF758D4D10 ? 2B0B671A77F8 ?
                                                   7FFF758D4DA0 ? 000000102 ?
                                                   7FFF0000001F ? 000000000 ?
opiexe()+2268        call     kksfbc()             2B0B671A77F8 ? 000000003 ?
                                                   000000102 ? 000000000 ?
                                                   000000000 ? 7FFF758D5D70 ?
kpoal8()+2226        call     opiexe()             000000049 ? 000000003 ?
                                                   7FFF758D6310 ? 000000000 ?
                                                   000000000 ? 7FFF758D5D70 ?
opiodr()+910         call     kpoal8()             00000005E ? 000000000 ?
                                                   7FFF758DA288 ? 000000000 ?
                                                   000000000 ? 000000001 ?
kpoodrc()+31         call     opiodr()             00000005E ? 000000000 ?
                                                   7FFF758DA288 ? 000000000 ?
                                                   00989C970 ? 000000001 ?
rpiswu2()+1618       call     kpoodrc()            7FFF758D75C0 ? 000000000 ?
                                                   7FFF758DA288 ? 000000000 ?
                                                   00989C970 ? 000000001 ?
kpoodr()+617         call     rpiswu2()            0D3C52C80 ? 000000023 ?
                                                   2B0B65EDFAAC ? 000000004 ?
                                                   2B0B6C6B7340 ? 000000023 ?
upirtrc()+2417       call     kpoodr()             2B0B65EBF170 ? 00000005E ?
                                                   7FFF758DA288 ? 000000000 ?
                                                   2B0B6C6B7340 ? 000000023 ?
kpurcsc()+93         call     upirtrc()            2B0B65EBF170 ? 00000005E ?
                                                   7FFF758DA288 ? 7FFF758DA400 ?
                                                   7FFF758DB420 ? 2B0B65C7D308 ?
kpuexec()+10804      call     kpurcsc()            2B0B65EBF170 ? 00000005E ?
                                                   7FFF758DA288 ? 7FFF758DA400 ?
                                                   7FFF758DB420 ? 2B0B65C7D308 ?
OCIStmtExecute()+34  call     kpuexec()            2B0B65E3E5B8 ? 2B0B65E3E538 ?
                                                   7FFF758DA280 ? 000000000 ?
                                                   E0C28C4F00000000 ?
                                                   000000000 ?
qksanExecSql()+743   call     OCIStmtExecute()     2B0B65E3E5B8 ? 2B0B65E3E538 ?
                                                   7FFF758DA280 ? 000000000 ?
                                                   E0C28C4F00000000 ?
                                                   000000000 ?
qksanAnalyzeSql()+2  call     qksanExecSql()       7FFF758DDCA8 ? 2B0B65C7D308 ?
363                                                2B0B65C7D3E8 ? 2B0B65E3E538 ?
                                                   7FFF758DDC28 ? 7FFF758DE3A9 ?
qsmmixProcessQuery(  call     qksanAnalyzeSql()    7FFF758DDCA8 ? 2B0B67E90DA8 ?
)+1089                                             00000B3AB ? 2B0B65FBD218 ?
                                                   100000023 ? 7FFF758DE3A9 ?
qsmmixSqlTuneAnalyz  call     qsmmixProcessQuery(  2B0B65FBD218 ? 000000005 ?
eIdx()+449                    )                    2B0B67E90DA8 ? 2B0B65F37BF4 ?
                                                   2B0B65FBD218 ? 7FFF00000023 ?
kestsiIndexAnalyzeD  call     qsmmixSqlTuneAnalyz  7FFF758DE370 ? 000000005 ?
rv()+794                      eIdx()               2B0B67E90DA8 ? 2B0B65F37BF4 ?
                                                   2B0B65FBD218 ? 7FFF00000023 ?
kestsTuneSqlDrv()+3  call     kestsiIndexAnalyzeD  7FFF758E0548 ? 7FFF758E08C0 ?
83                            rv()                 7FFF758DFC78 ? 7FFF758E09B0 ?
                                                   7FFF758DED50 ? 7FFF758DED58 ?
kesaiExecAction()+9  call     kestsTuneSqlDrv()    7FFF758E0540 ? 7FFF758E08C0 ?
81                                                 7FFF758DFC78 ? 7FFF758E09B0 ?
                                                   7FFF758DED50 ? 7FFF758DED58 ?
kesaiTuneSqlDrv()+6  call     kesaiExecAction()    7FFF758DF420 ? 7FFF758E08C0 ?
258                                                7FFF758E09B0 ? 7FFF758E05E0 ?
                                                   7FFF758E0A68 ? 7FFF758E0A60 ?
spefcifa()+225       call     kesaiTuneSqlDrv()    7FFF758E21B8 ? 2B0B65CA3180 ?
                                                   000000000 ? 7FFF758E05E0 ?
                                                   0C5FB0033 ? 2B0B65C362B8 ?
spefmccallstd()+421  call     spefcifa()           7FFF758E1CE0 ? 000000004 ?
                                                   2B0B65CA3108 ? 7FFF758E0F30 ?
                                                   0C5FB0033 ? 2B0B65C362B8 ?
pextproc()+36        call     spefmccallstd()      7FFF758E20E0 ? 7FFF758E1D88 ?
                                                   7FFF758E1AA0 ? 7FFF758E1CE0 ?
                                                   000000000 ? 2B0B65C362B8 ?
__PGOSF589_peftrust  call     pextproc()           7FFF758E20E0 ? 7FFF758E1D88 ?
ed()+145                                           7FFF758E1AA0 ? 7FFF758E1CE0 ?
                                                   000000000 ? 2B0B65C362B8 ?
__PGOSF633_psdexsp(  call     __PGOSF589_peftrust  7FFF758E20E0 ? 7FFF758E1D88 ?
)+255                         ed()                 7FFF758E1AA0 ? 7FFF758E1CE0 ?
                                                   000000000 ? 2B0B65C362B8 ?
rpiswu2()+1618       call     __PGOSF633_psdexsp(  7FFF758E18D0 ? 7FFF758E20E0 ?
                              )                    7FFF758E1AA0 ? 000020003 ?
                                                   0037500E0 ? 7FFF758E3100 ?
psdextp()+695        call     rpiswu2()            0D3C52C80 ? 000000000 ?
                                                   7FFF758E1910 ? 000000002 ?
                                                   7FFF758E1950 ? 000000000 ?
pefccal()+726        call     psdextp()            7FFF758E3100 ? 7FFF758E1D88 ?
                                                   7FFF758E1AA0 ? 000000000 ?
                                                   000020003 ? 7FFF758E20E0 ?
pefcal()+219         call     pefccal()            7FFF758E20E0 ? 000A324C2 ?
                                                   00B7C8EA0 ? 000000000 ?
                                                   000020003 ? 7FFF758E20E0 ?
pevm_FCAL()+164      call     pefcal()             7FFF758E20E0 ? 00B7C9050 ?
                                                   2B0B65C835B8 ? 000000000 ?
                                                   000020003 ? 7FFF758E20E0 ?
pfrinstr_FCAL()+70   call     pevm_FCAL()          2B0B65C835B8 ? 0A6FB80D8 ?
                                                   2B0B65C835B8 ? 0A6FB8030 ?
                                                   000020003 ? 7FFF758E20E0 ?
pfrrun_no_tool()+63  call     pfrinstr_FCAL()      2B0B65C835B8 ? 0A5405930 ?
                                                   2B0B65C83628 ? 0A6FB8030 ?
                                                   000020003 ? 7FFF758E20E0 ?
pfrrun()+622         call     pfrrun_no_tool()     2B0B65C835B8 ? 0A5405930 ?
                                                   2B0B65C83628 ? 0A6FB8030 ?
                                                   000020003 ? 7FFF758E20E0 ?
plsql_run()+644      call     pfrrun()             2B0B65C835B8 ? 000000000 ?
                                                   2B0B65C83628 ? 7FFF758E3100 ?
                                                   000020003 ? 0928C7098 ?
peicnt()+296         call     plsql_run()          2B0B65C835B8 ? 000000001 ?
                                                   000000000 ? 7FFF758E3100 ?
                                                   000020003 ? 000000000 ?
kkxexe()+521         call     peicnt()             7FFF758E3100 ? 2B0B65C835B8 ?
                                                   2B0B65CA4FD8 ? 7FFF758E3100 ?
                                                   2B0B65CA2F30 ? 000000000 ?
opiexe()+17478       call     kkxexe()             2B0B65C8BD58 ? 2B0B65C835B8 ?
                                                   000000000 ? 7FFF758E3100 ?
                                                   2B0B65CA2F30 ? 000000000 ?
kpoal8()+2226        call     opiexe()             000000049 ? 000000003 ?
                                                   7FFF758E4730 ? 7FFF758E3100 ?
                                                   2B0B65CA2F30 ? 000000000 ?
opiodr()+910         call     kpoal8()             00000005E ? 000000000 ?
                                                   7FFF758E81A8 ? 7FFF758E3100 ?
                                                   2B0B65CA2F30 ? 7FFF00000001 ?
kpoodr()+648         call     opiodr()             00000005E ? 000000000 ?
                                                   7FFF758E81A8 ? 000000000 ?
                                                   00989C970 ? 7FFF00000001 ?
upirtrc()+2417       call     kpoodr()             2B0B65C796A8 ? 00000005E ?
                                                   7FFF758E81A8 ? 000000000 ?
                                                   00989C970 ? 7FFF00000001 ?
kpurcsc()+93         call     upirtrc()            2B0B65C796A8 ? 00000005E ?
                                                   7FFF758E81A8 ? 7FFF758E8320 ?
                                                   7FFF758E9340 ? 2B0B65C7D308 ?
kpuexec()+11692      call     kpurcsc()            2B0B65C796A8 ? 00000005E ?
                                                   7FFF758E81A8 ? 7FFF758E8320 ?
                                                   7FFF758E9340 ? 2B0B65C7D308 ?
OCIStmtExecute()+34  call     kpuexec()            2B0B65C88290 ? 2B0B65C88210 ?
                                                   7FFF758E81A0 ? 000000001 ?
                                                   7FFF00000000 ? 000000000 ?
__PGOSF529_jslvec_e  call     OCIStmtExecute()     2B0B65C88290 ? 2B0B65C88210 ?
xeccb()+2207                                       7FFF758E81A0 ? 000000001 ?
                                                   7FFF00000000 ? 000000000 ?
jslvswu()+54         call     __PGOSF529_jslvec_e  7FFF758EC39C ? 2B0B65C88210 ?
                              xeccb()              2B0B65C88210 ? 000000001 ?
                                                   7FFF00000000 ? 000000000 ?
jslve_execute0()+22  call     jslvswu()            000000000 ? 7FFF00000000 ?
17                                                 000000000 ? 000000001 ?
                                                   7FFF00000000 ? 000000000 ?
jslve_execute()+327  call     jslve_execute0()     7FFF758EE2B4 ? 000005946 ?
                                                   000000002 ? 7FFF758EE2A0 ?
                                                   000000000 ? 0FFFFFFFF ?
rpiswu2()+1618       call     jslve_execute()      7FFF758EE150 ? 000000002 ?
                                                   7FFF758EE2B4 ? 000005946 ?
                                                   7FFF758EE2A0 ? 0FFFFFFFF ?
kkjex1e()+374        call     rpiswu2()            0D3C52C80 ? 000000000 ?
                                                   7FFF758EE170 ? 000000002 ?
                                                   7FFF758EE190 ? 000000000 ?
kkjsexe()+705        call     kkjex1e()            7FFF758EE2B4 ? 000005946 ?
                                                   000000002 ? 7FFF758EE2A0 ?
                                                   0D57A5F08 ? 7FFF758EE208 ?
kkjrdp()+689         call     kkjsexe()            7FFF758EE2B4 ? 000005946 ?
                                                   000000001 ? 7FFF758EE2A0 ?
                                                   0D57A5F08 ? 7FFF758EE208 ?
opirip()+953         call     kkjrdp()             7FFF758EE2B4 ? 000005946 ?
                                                   000000001 ? 7FFF758EE2A0 ?
                                                   0D57A5F08 ? 7FFF758EE208 ?
opidrv()+598         call     opirip()             000000032 ? 000000004 ?
                                                   7FFF758EFA28 ? 7FFF758EE2A0 ?
                                                   0D57A5F08 ? 7FFF758EE208 ?
sou2o()+98           call     opidrv()             000000032 ? 000000004 ?
                                                   7FFF758EFA28 ? 7FFF758EE2A0 ?
                                                   0D57A5F08 ? 7FFF758EE208 ?
opimai_real()+261    call     sou2o()              7FFF758EFA00 ? 000000032 ?
                                                   000000004 ? 7FFF758EFA28 ?
                                                   0D57A5F08 ? 7FFF758EE208 ?
ssthrdmain()+252     call     opimai_real()        000000000 ? 7FFF758EFBF0 ?
                                                   000000004 ? 7FFF758EFA28 ?
                                                   0D57A5F08 ? 7FFF758EE208 ?
main()+196           call     ssthrdmain()         000000003 ? 7FFF758EFBF0 ?
                                                   000000001 ? 000000000 ?
                                                   0D57A5F08 ? 7FFF758EE208 ?
__libc_start_main()  call     main()               000000003 ? 7FFF758EFD90 ?
+244                                               000000001 ? 000000000 ?
                                                   0D57A5F08 ? 7FFF758EE208 ?
_start()+36          call     __libc_start_main()  000A077C8 ? 000000001 ?
                                                   7FFF758EFD88 ? 000000000 ?
                                                   0D57A5F08 ? 000000003 ?
--------------------- Binary Stack Dump ---------------------

分析Stack可以发现,他们和MOS[]中的非常类此,可以断定是该Bug 9746210
skdstdst <- ksedst1 <- ksedst <- dbkedDefDump <- ksedmp<- ssexhd <- sighandler <- qsmmixComputeClusteringFactor <- qsmmixGenFakeIdxStats <- qsmmixPopulateIdxStats <- qsmmixSetKkotbixtqsmmixReturnCandToOpt <- qsmmixOptimizerGenIdxCand <- qsmmixOptimizerSetupIdxCandResponse
处理建议

The is fixed in following versions:
 12.1 (Future Release)
 11.2.0.3 (Server Patch Set)
 11.2.0.2.4 Patch Set Update
 11.2.0.2 Bundle Patch 12 for Exadata Database
 11.2.0.2 Patch 11 on Windows Platforms
To resolve the issue:
Either
Upgrade to the over versions
Or
Apply Patch 9746210

补充说明
对于SQL TUNING功能,在觉得多少人的环境中都不需要这个,可以考虑禁用该功能来屏蔽该错误

BEGIN
dbms_auto_task_admin.disable(
    client_name => 'sql tuning advisor',
    operation   => NULL,
    window_name => NULL);
END;

参考文档
SQL Tuning query fails with ORA-7445 [qsmmixComputeClusteringFactor] [ID 1483654.1]
Error ORA-07445 Qsmmixcomputeclusteringfactor From SQL Tuning [ID 1359148.1]
Bug 9746210 – ORA-7445 [qsmmixComputeClusteringFactor] from SQL tuning [ID 9746210.8]