Database SOS – 第111页 – Professional Database Recovery Support@+8617813235971

操作系统级别做systemstate

在有些时候，我们需要通过systemstate或者hanganalyze来收集数据库hang信息，但是当我们的数据库不能登录的时候，无法通过常规手段来使用该方法来收集信息。ORACLE的开发者也考虑到了类似情况，让我们可以通过dbx或者gdb来收集数据库信息.这里我通过在linux平台上模拟一个新会话，然后通过gdb直接对该会话进行做systemstate,然后操作该会话(证明gdb操作后正常),来实现类似系统hang住，无法登陆的时候怎么做systemstate.
模拟会话
就是假设我们hang住的系统中的数据库进程(为了后面gdb和验证对会话影响而使用,不然可以直接使用oracle 后台进程来完成该操作)

[oracle@xifenfei ~]$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.3.0 Production on Thu Jan 17 18:28:30 2013
Copyright (c) 1982, 2011, Oracle.  All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
SQL>

查找数据库进程
在另外一个shell窗口找出数据库进程

[root@xifenfei trace]# ps -ef|grep LOCAL
oracle    7476  7473  0 18:28 ?        00:00:00 oracleora11g (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
root      7487  7326  0 18:28 pts/2    00:00:00 grep LOCAL

gdb做systemstate

[oracle@xifenfei trace]$ gdb $ORACLE_HOME/bin/oracle 7476
GNU gdb (GDB) Red Hat Enterprise Linux (7.0.1-37.el5)
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "i386-redhat-linux-gnu".
…………
Reading symbols from /u01/oracle/oracle/product/11.2.0/db_1/lib/libnque11.so...
Loaded symbols for /u01/oracle/oracle/product/11.2.0/db_1/lib/libnque11.so
0xb789c424 in __kernel_vsyscall ()
(gdb)  print ksudss(10)  <--输入print ksudss(10)
$1 = 0
(gdb) quit
A debugging session is active.
        Inferior 1 [process 7476] will be detached.
Quit anyway? (y or n) y
Detaching from program: /u01/oracle/oracle/product/11.2.0/db_1/bin/oracle, process 7476

查看trace文件

[oracle@xifenfei trace]$ ls -l *7476*.trc
-rw-r----- 1 oracle oinstall 742438 Jan 17 18:28 ora11g_ora_7476.trc
[oracle@xifenfei trace]$ more ora11g_ora_7476.trc
Trace file /u01/oracle/diag/rdbms/ora11g/ora11g/trace/ora11g_ora_7476.trc
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
ORACLE_HOME = /u01/oracle/oracle/product/11.2.0/db_1
System name:    Linux
Node name:      xifenfei
Release:        2.6.32-200.13.1.el5uek
Version:        #1 SMP Wed Jul 27 20:21:26 EDT 2011
Machine:        i686
VM name:        VMWare Version: 6
Instance name: ora11g
Redo thread mounted by this instance: 1
Oracle process number: 21
Unix process pid: 7476, image: oracle@xifenfei (TNS V1-V3)
*** 2013-01-17 18:28:59.225
*** SESSION ID:(143.23) 2013-01-17 18:28:59.225
*** CLIENT ID:() 2013-01-17 18:28:59.225
*** SERVICE NAME:(SYS$USERS) 2013-01-17 18:28:59.225
*** MODULE NAME:(sqlplus@xifenfei (TNS V1-V3)) 2013-01-17 18:28:59.225
*** ACTION NAME:() 2013-01-17 18:28:59.225
===================================================
SYSTEM STATE (level=10)  <---systemstate levle 10
------------
System global information:
     processes: base 0x32bc5b38, size 150, cleanup 0x32bd5990
     allocation: free sessions 0x32085b84, free calls (nil)
     control alloc errors: 0 (process), 0 (session), 0 (call)
     PMON latch cleanup depth: 0
     seconds since PMON's last scan for dead processes: 693
     system statistics:

这里证明使用gdb–>print ksudss(10)对数据库做的是systemstate level 10

验证做gdb的进程

SQL> select * from dual;
D
-
X
SQL> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

证明被gdb处理过的进程还是正常(未收到额外破坏),所以我们可以确定在系统hang住，而且新会话不能正常建立连接的时候，我们可以尝试着使用ksudss(10)来对系统做systemstate然后分析

aix中procmap 查看oracle进程占用系统内存

procmap是用来显示进程地址空间，通过这个命令找出来的“read/write”表示为进程的私有内存,如果对应到oracle 进程的LOCAL中来,也就是对应了是oracle 会话进程占用的操作系统内存,和sga与pga无关，即ORACLE数据库进程占用的额外的系统内存,在计算oracle数据库消耗内存的时候，要考虑sga+pga+process占用的内存
procmap命令使用

$procmap 7931354
7931354 : oracleccicdx (LOCAL=NO)
100000000            95504K  read/exec         oracle
110000035             2399K  read/write        oracle
9fffffff0000000         51K  read/exec         /usr/ccs/bin/usla64
9fffffff000cfe2          0K  read/write        /usr/ccs/bin/usla64
900000000b05930          2K  read/exec         /usr/lib/libC.a[shr3_64.o]
9001000a0122930          0K  read/write        /usr/lib/libC.a[shr3_64.o]
900000000ae6b00        118K  read/exec         /usr/lib/libC.a[shrcore_64.o]
9001000a030a100         12K  read/write        /usr/lib/libC.a[shrcore_64.o]
900000000ac8000        118K  read/exec         /usr/lib/libC.a[ansicore_64.o]
9001000a0300e00         36K  read/write        /usr/lib/libC.a[ansicore_64.o]
900000000411468          0K  read/exec         /usr/lib/libicudata.a[shr_64.o]
9001000a0121468          0K  read/write        /usr/lib/libicudata.a[shr_64.o]
90000000040f738          2K  read/exec         /usr/lib/libC.a[shr2_64.o]
9001000a0314738          0K  read/write        /usr/lib/libC.a[shr2_64.o]
9000000008dd800       1699K  read/exec         /usr/lib/libC.a[ansi_64.o]
9001000a0315a00        277K  read/write        /usr/lib/libC.a[ansi_64.o]
9000000008bab00        135K  read/exec         /usr/lib/libC.a[shr_64.o]
9001000a030eb00         19K  read/write        /usr/lib/libC.a[shr_64.o]
900000000708180       1732K  read/exec         /usr/lib/libicuuc.a[shr_64.o]
9001000a035cdac        180K  read/write        /usr/lib/libicuuc.a[shr_64.o]
900000000493d80       2510K  read/exec         /usr/lib/libicui18n.a[shr_64.o]
9001000a038a148        270K  read/write        /usr/lib/libicui18n.a[shr_64.o]
900000000473200         91K  read/exec         /usr/lib/libsrc.a[shr_64.o]
9001000a01127a8         55K  read/write        /usr/lib/libsrc.a[shr_64.o]
90000000045a300         98K  read/exec         /usr/lib/libcorcfg.a[shr_64.o]
9001000a04147c8         18K  read/write        /usr/lib/libcorcfg.a[shr_64.o]
900000000b16200        750K  read/exec         /usr/lib/liblvm.a[shr_64.o]
9001000a03dd028        219K  read/write        /usr/lib/liblvm.a[shr_64.o]
900000000444f00         82K  read/exec         /usr/lib/libcfg.a[shr_64.o]
9001000a03d58f0         26K  read/write        /usr/lib/libcfg.a[shr_64.o]
90000000040e3a0          2K  read/exec         /usr/lib/libcrypt.a[shr_64.o]
9001000a0106948          0K  read/write        /usr/lib/libcrypt.a[shr_64.o]
90000001615d860          5K  read/exec         /usr/lib/libc.a[aio_64.o]
9001000a3aed568          0K  read/write        /usr/lib/libc.a[aio_64.o]
9000000003efc00        120K  read/exec         /usr/lib/libodm.a[shr_64.o]
9001000a0107cc8         40K  read/write        /usr/lib/libodm.a[shr_64.o]
900000000bd2c80        147K  read/exec         /usr/lib/libperfstat.a[shr_64.o]
9001000a041a960         14K  read/write        /usr/lib/libperfstat.a[shr_64.o]
9000000017d7000          0K  read/exec         /usr/lib/libdl.a[shr_64.o]
9001000a0517000          0K  read/write        /usr/lib/libdl.a[shr_64.o]
9000000158ed100       8636K  read/exec         /oracle/product/db10gr2/lib/libjox10.a[shr.o]
8001000a0000b78        587K  read/write        /oracle/product/db10gr2/lib/libjox10.a[shr.o]
900000000a87000        257K  read/exec         /usr/lib/libpthreads.a[shr_xpg5_64.o]
9001000a0274000        559K  read/write        /usr/lib/libpthreads.a[shr_xpg5_64.o]
900000000000800       4025K  read/exec         /usr/lib/libc.a[shr_64.o]
9001000a0000020       1047K  read/write        /usr/lib/libc.a[shr_64.o]
         Total      121863K

简化命令，统计私有内存,procmap 7931354|grep “read/write” |awk -F ” ” ‘{print $2}’,通过相关计算的出来,在当前的操作系统和数据库版本中，一个LOCAL=NO进程占用系统内存为:5758KB

补充说明
1.操作系统版本

$oslevel -r
6100-06

2.数据库版本

SQL> select * from v$version;
BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bi
PL/SQL Release 10.2.0.4.0 - Production
CORE    10.2.0.4.0      Production
TNS for IBM/AIX RISC System/6000: Version 10.2.0.4.0 - Productio
NLSRTL Version 10.2.0.4.0 - Production

3.通过跟踪多个LOCAL=NO进程,发现类似进程占用的系统内存相同,估算给系统oracle进程占用的内存，可以通过该值进行大概估算
4.确认ORACLE使用的内存量不是以往认识的sga+pga，实际上应该是sga+pga+所有oracle进程占用
5.在linux中使用pmap来查看

设置_smu_debug_mode实现指定session级别使用特定回滚段

通过设置_smu_debug_mode值来实现指定session级别使用特定的回滚段
_smu_debug_mode为默认值

--测试数据库版本
SQL> select * from v$version;
BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
PL/SQL Release 11.2.0.3.0 - Production
CORE    11.2.0.3.0      Production
TNS for Solaris: Version 11.2.0.3.0 - Production
NLSRTL Version 11.2.0.3.0 - Production
--_smu_debug_mode值
SQL> select a.ksppinm name,b.ksppstvl value
  2    from x$ksppi a,x$ksppcv b
  3   where a.inst_id = USERENV ('Instance')
  4     and b.inst_id = USERENV ('Instance')
  5     and a.indx = b.indx
  6     and upper(a.ksppinm) LIKE upper('%&param%')
  7  order by name;
Enter value for param: _smu_debug_mode
old   6:    and upper(a.ksppinm) LIKE upper('%&param%')
new   6:    and upper(a.ksppinm) LIKE upper('%_smu_debug_mode%')
NAME                             VALUE
-------------------------------- ------------------------
_smu_debug_mode                  0
--undo管理模式
SQL> show parameter undo;
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
undo_management                      string      AUTO
undo_retention                       integer     900
undo_tablespace                      string      UNDOTBS1
--指定回滚段(查询dba_rollback_segs得到回滚段名称)
SQL>  set transaction use rollback segment "_SYSSMU7_1887299474$";
Transaction set.
SQL> delete from t where rownum<10;
9 rows deleted.
--查询使用回滚段
SQL> select XIDUSN from V$TRANSACTION;
    XIDUSN
----------
         9

这里可以看到在undo自动管理模式下,我们手工指定了回滚段但是被数据库给忽略,还是使用了系统自动分配的回滚段。例如这里我指定的回滚段7,但是使用了系统自动分配的回滚段9

_smu_debug_mode=45

SQL> select * from v$version;
BANNER
-------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - Production
PL/SQL Release 11.2.0.3.0 - Production
CORE    11.2.0.3.0      Production
TNS for 32-bit Windows: Version 11.2.0.3.0 - Production
NLSRTL Version 11.2.0.3.0 - Production
SQL> show parameter undo;
NAME                                 TYPE                VALUE
------------------------------------ -------    -----------------
undo_management                      string       AUTO
undo_retention                       integer      900
undo_tablespace                      string       undo_new
SQL> select a.ksppinm name,b.ksppstvl value
  2    from x$ksppi a,x$ksppcv b
  3   where a.inst_id = USERENV ('Instance')
  4     and b.inst_id = USERENV ('Instance')
  5     and a.indx = b.indx
  6     and upper(a.ksppinm) LIKE upper('%&param%')
  7  order by name;
Enter value for param: _smu_debug_mode
old   6:    and upper(a.ksppinm) LIKE upper('%&param%')
new   6:    and upper(a.ksppinm) LIKE upper('%_smu_debug_mode%')
NAME                             VALUE
-------------------------------- ------------------------
_smu_debug_mode                  45
/*
使用alter system set "_smu_debug_mode" = 45;配置
注意:该参数只能在system级别配置
*/
--测试表
SQL> create table t_xifenfei
  2  as
  3  select * from dba_objects;
Table created.
--指定回滚段
SQL> set transaction use rollback segment "_SYSSMU15_1680736333$";
Transaction set.
SQL> delete from t_xifenfei where rownum<10;
9 rows deleted.
--查询事务回滚段
SQL> select XIDUSN from V$TRANSACTION;
    XIDUSN
----------
        15
SQL> commit;
Commit complete.
--再次指定回滚段
SQL> set transaction use rollback segment "_SYSSMU17_527554872$";
Transaction set.
SQL> delete from t_xifenfei where rownum<10;
9 rows deleted.
--查询事务回滚段
SQL> select XIDUSN from V$TRANSACTION;
    XIDUSN
----------
        17

这里可以看出来通过设置”_smu_debug_mode” = 45可以很好的实现在undo自动管理模式下,指定事务在特定的回滚段,在某些极限情况下,可以通过该操作来减少回滚段争用.

asmlib异常报ORA-00600[kfklLibFetchNext00]

一个朋友的历史库出现故障,在linux 4的平台上asm的10.2.0.1的单库，asm使用asmlib来处理。
asm不能正常mount磁盘组，可以看到asmdisk，alert日志报ORA-00600[kfklLibFetchNext00]
操作系统内核是：2.6.9-78
oracleasmlib是：2.0.2-1
asm磁盘组mount失败

--以前故障
SQL> ALTER DISKGROUP ALL MOUNT
Thu Sep  6 14:23:16 2012
NOTE: cache registered group DGARC number=1 incarn=0x2bf96274
NOTE: cache registered group DGDATA number=2 incarn=0x2c196275
NOTE: cache registered group DGSYS number=3 incarn=0x2c196276
Thu Sep  6 14:23:16 2012
Errors in file /opt/app/oracle/admin/+ASM/bdump/+asm_rbal_10204.trc:
ORA-15183: ASMLIB initialization error [driver/agent not installed]
Thu Sep  6 14:23:16 2012
Errors in file /opt/app/oracle/admin/+ASM/bdump/+asm_rbal_10204.trc:
ORA-15183: ASMLIB initialization error [/opt/oracle/extapi/64/asm/orcl/1/libasm.so]
ORA-15183: ASMLIB initialization error [driver/agent not installed]
Thu Sep  6 14:23:16 2012
ERROR: no PST quorum in group 1: required 2, found 0
Thu Sep  6 14:23:16 2012
NOTE: cache dismounting group 1/0x2BF96274 (DGARC)
NOTE: dbwr not being msg'd to dismount
ERROR: diskgroup DGARC was not mounted
Thu Sep  6 14:23:16 2012
ERROR: no PST quorum in group 2: required 2, found 0
Thu Sep  6 14:23:16 2012
NOTE: cache dismounting group 2/0x2C196275 (DGDATA)
NOTE: dbwr not being msg'd to dismount
ERROR: diskgroup DGDATA was not mounted
Thu Sep  6 14:23:16 2012
ERROR: no PST quorum in group 3: required 2, found 0
Thu Sep  6 14:23:16 2012
NOTE: cache dismounting group 3/0x2C196276 (DGSYS)
NOTE: dbwr not being msg'd to dismount
ERROR: diskgroup DGSYS was not mounted
--现在故障
Thu Jan 24 13:49:45 2013
SQL> ALTER DISKGROUP ALL MOUNT
Thu Jan 24 13:49:45 2013
NOTE: cache registered group DGARC number=1 incarn=0xf388cee9
NOTE: cache registered group DGDATA number=2 incarn=0xf3a8ceea
NOTE: cache registered group DGSYS number=3 incarn=0xf3a8ceeb
Thu Jan 24 13:49:45 2013
Errors in file /opt/app/oracle/admin/+ASM/bdump/+asm_rbal_13449.trc:
ORA-00600: internal error code, arguments: [kfklLibFetchNext00],
[18446744073709551614], [0], [], [], [], [], []
Thu Jan 24 13:49:46 2013
Errors in file /opt/app/oracle/admin/+ASM/bdump/+asm_rbal_13449.trc:
ORA-00600: internal error code, arguments: [kfklLibFetchNext00],
[18446744073709551614], [0], [], [], [], [], []
Thu Jan 24 13:49:46 2013
ERROR: no PST quorum in group 1: required 2, found 0
Thu Jan 24 13:49:46 2013
NOTE: cache dismounting group 1/0xF388CEE9 (DGARC)
NOTE: dbwr not being msg'd to dismount
ERROR: diskgroup DGARC was not mounted
Thu Jan 24 13:49:46 2013
ERROR: no PST quorum in group 2: required 2, found 0
Thu Jan 24 13:49:46 2013
NOTE: cache dismounting group 2/0xF3A8CEEA (DGDATA)
NOTE: dbwr not being msg'd to dismount
ERROR: diskgroup DGDATA was not mounted
Thu Jan 24 13:49:46 2013
ERROR: no PST quorum in group 3: required 2, found 0
Thu Jan 24 13:49:46 2013
NOTE: cache dismounting group 3/0xF3A8CEEB (DGSYS)
NOTE: dbwr not being msg'd to dismount
ERROR: diskgroup DGSYS was not mounted
Shutting down instance: further logons disabled

trace文件信息

----- Call Stack Trace -----
calling              call     entry                argument values in hex
location             type     point                (? means dubious value)
-------------------- -------- -------------------- ----------------------------
ksedst()+31          call     ksedst1()            000000000 ? 000000001 ?
                                                   000000000 ? 000000000 ?
                                                   000000000 ? 000000001 ?
ksedmp()+610         call     ksedst()             000000000 ? 000000001 ?
                                                   000000000 ? 000000000 ?
                                                   000000000 ? 000000001 ?
ksfdmp()+21          call     ksedmp()             000000003 ? 000000001 ?
                                                   000000000 ? 000000000 ?
                                                   000000000 ? 000000001 ?
kgerinv()+161        call     ksfdmp()             000000003 ? 000000001 ?
                                                   000000000 ? 000000000 ?
                                                   000000000 ? 000000001 ?
kgesinv()+33         call     kgerinv()            006469D40 ? 0064E1C58 ?
                                                   000000000 ? 000000000 ?
                                                   000000001 ? 000000001 ?
kgesinw()+166        call     kgesinv()            006469D40 ? 0064E1C58 ?
                                                   000000000 ? 000000000 ?
                                                   000000001 ? 000000001 ?
kfklLibScanNext()+2  call     kgesinw()            006469D40 ? 000000000 ?
39                                                 000000001 ? 000000000 ?
                                                   FFFFFFFFFFFFFFFE ?
                                                   000000000 ?
kfkLibFetchNext()+3  call     kfklLibScanNext()    0064DDD70 ? 7FBFFFDCD0 ?
43                                                 000000001 ? 000000000 ?
                                                   FFFFFFFFFFFFFFFE ?
                                                   000000000 ?
kfuitrnInit()+524    call     kfkLibFetchNext()    006469D40 ? 2A971DFF90 ?
                                                   000000001 ? 000000000 ?
                                                   FFFFFFFFFFFFFFFE ?
                                                   000000000 ?
kfkLibIterInit()+18  call     kfuitrnInit()        006469D40 ? 2A971DFCB0 ?
0                                                  2A971DFF90 ? 000000009 ?
                                                   000000009 ? 000000000 ?
kfkLoadAllLibs()+36  call     kfkLibIterInit()     000000000 ? 00646C7E0 ?
3                                                  2A971DFF90 ? 000000009 ?
                                                   000000009 ? 000000000 ?
kfkDiscoverString()  call     kfkLoadAllLibs()     000000000 ? 00646C7E0 ?
+107                                               2A971DFF90 ? 000000009 ?
                                                   000000009 ? 000000000 ?
Cannot find symbol
Cannot find symbol
Cannot find symbol
kfdDiscoverString()  call     kfkDiscoverString()  067A53768 ? 00646C7E0 ?
+28                                                2A971DFF90 ? 000000009 ?
                                                   000000009 ? 000000000 ?
kfdDiscoverShallow(  call     kfdDiscoverString()  067A53768 ? 000000000 ?
)+315                                              2A971DFF90 ? 000000009 ?
                                                   000000009 ? 000000000 ?
kfgbDriver()+1174    call     kfdDiscoverShallow(  000000180 ? 000000000 ?
                              )                    2A971DFF90 ? 000000009 ?
                                                   000000009 ? 000000000 ?
ksbabs()+564         call     kfgbDriver()         7FBFFFE5C0 ? 000000048 ?
                                                   000000000 ? 000000009 ?
                                                   000000009 ? 000000000 ?
ksbrdp()+727         call     ksbabs()             7FBFFFE5C0 ? 000000048 ?
                                                   000000000 ? 000000009 ?
                                                   000000009 ? 000000000 ?
opirip()+616         call     ksbrdp()             7FBFFFE5C0 ? 000000048 ?
                                                   000000001 ? 06002C770 ?
                                                   000000009 ? 000000000 ?
opidrv()+582         call     opirip()             000000032 ? 000000004 ?
                                                   7FBFFFF6C8 ? 06002C770 ?
                                                   000000009 ? 000000000 ?
sou2o()+114          call     opidrv()             000000032 ? 000000004 ?
                                                   7FBFFFF6C8 ? 06002C770 ?
                                                   000000009 ? 000000000 ?
opimai_real()+317    call     sou2o()              7FBFFFF6A0 ? 000000032 ?
                                                   000000004 ? 7FBFFFF6C8 ?
                                                   000000009 ? 000000000 ?
main()+116           call     opimai_real()        000000003 ? 7FBFFFF730 ?
                                                   000000004 ? 7FBFFFF6C8 ?
                                                   000000009 ? 000000000 ?
<0x3c9fb1c40b>       call     main()               000000003 ? 7FBFFFF730 ?
                                                   000000004 ? 7FBFFFF6C8 ?
                                                   000000009 ? 000000000 ?
--------------------- Binary Stack Dump ---------------------

因为客户的库是一个历史库，基本上不怎么使用，在2012年启动asm就出现了ORA-15183错误，然后在2013年重启机器后，再次启动asm就出现了ORA-00600[kfklLibFetchNext00]错误，通过2012年的错误提示，我们大概可以判断出来该问题和ASMLIB有关系,查询mos发现429945.1,发现Call Stack Trace完全一致，可以定位是该问题(如果想深入分析，可以通过strace继续分析)

ORA-600: [kfklLibFetchNext00], [18446744073709551614], [0] when mounting diskgroup in ASM

Applies to:
Linux OS - Version: 2.0.1-1 and later   [Release: RHEL4 and later ]
Information in this document applies to any platform.
Linux Kernel - Version: 2.0.1
Symptoms
 3 RAC db.
2 nodes are up and functioning except for 1 node - ASM did not come back up after
the reboot eventhough all disks show available from asmlib's perspective:
Changes
 All that was done with resources were stopped on Node1 and an extra LUN added.
 A reboot was then performed.
Cause
 The cause of the issue is libasm.o corruption
Ran the following to confirm that disks are ok:
/dev/oracleasm listdisks
/usr/sbin/asmtool -I -l /dev/oracleasm -n /dev/sdg1 -a label
/usr/sbin/oracleasm-discover 'ORCL:*'
dd if=/dev/sdg1 bs=8192 count=1 | od -c
==> output checked out fine
.
kfod asm_diskstring='ORCL:*'
==> this failed on Node1
KFOD-00600: file not found; argument [610][kfklLibFetchNext00] even though libasm.o exists
You might see the following call stack as well
----- Call Stack Trace -----
kfklLibScanNext
kfkLibFetchNext
kfuitrnInit
kfkLibIterInit
kfkLoadAllLibs
kfkDiscoverString
kfdDiscoverString
kfdDiscoverShallow
kfgbDriver
strace showed
 Node1-failing
-------
stat("/opt/oracle/extapi/64/asm/orcl/1/libasm.so", {st_mode=S_IFREG|0777, st_size=19344, ...}) = 0
 getdents64(4, /* 0 entries */, 4096) = 0 <<<<
 close(4) = 0
 open("/opt/oracle/product/10.2.0/db_1/rdbms/mesg/kfodus.msb", O_RDONLY) = -1
 ENOENT (No such file or directory)
 open("/opt/oracle/product/10.2.0/db_1/rdbms/mesg/kfodus.msb", O_RDONLY) = -1
 ENOENT (No such file or directory)
 fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 2), ...}) = 0
 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2a9750d000
write(1, "KFOD-00600: file not found; argu"..., 69) = 69
Node2-working
 -----
 stat("/opt/oracle/extapi/64/asm/orcl/1/libasm.so", {st_mode=S_IFREG|0755, st_size=19344, ...}) = 0
 open("/opt/oracle/extapi/64/asm/orcl/1/libasm.so", O_RDONLY) = 4
read(4, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\20\23\0"..., 832) = 832
fstat(4, {st_mode=S_IFREG|0755, st_size=19344, ...}) = 0
mmap(NULL, 1066104, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 4, 0) 0x2a9750d000

通过MOS的描述，可以明确定位到问题是:libasm.o异常导致

解决方案

To implement the solution, reinstall the ASMlib RPM
>rpm -Uvh oracleasmlib-2.0.0-1
This replaces the /opt/oracle/extapi/64/asm/orcl/1/libasm.so

aix使用太多内存导致shared pool 相关latch异常

某客户有一服务器,shared pool 相关latch出现异常等待,影响系统性能.分析结果:因为系统空闲内存太少,使用太多Paging Space导致该异常;解决办法:1.增加内存，2.在业务接受范围内减小sga等其他和内存消耗相关参数
nmon查看剩余内存

x          Physical  PageSpace |        pages/sec  In     Out | FileSystemCache
x% Used       99.8%     34.9%  | to Paging Space   0.0    0.0 | (numperm) 14.8%
x% Free        0.2%     65.1%  | to File System    0.0   33.0 | Process   63.9%
xMB Used   21452.8MB 11446.1MB | Page Scans        0.0        | System    21.1%
xMB Free (少)-->51.2MB 21321.9MB | Page Cycles       0.0      | Free       0.2%
xTotal(MB) 21504.0MB 32768.0MB | Page Steals       0.0        |           ------

topas查看内存配置

Disk    Busy%     KBPS     TPS KB-Read KB-Writ                   MEMORY
Topas Monitor for host:    p570b03              EVENTS/QUEUES    FILE/TTY
Wed Jan  9 13:30:30 2013   Interval:  2         Cswitch     785  Readch   173.1K
                                                Syscall   54407  Writech  213.1K
CPU  User%  Kern%  Wait%  Idle%                 Reads       118  Rawin         0
ALL   43.6    1.7    0.0   54.8                 Writes      110  Ttyout      352
                                                Forks         0  Igets         0
Network  KBPS   I-Pack  O-Pack   KB-In  KB-Out  Execs         0  Namei         5
Total    84.5    146.0   200.6    26.4    58.1  Runqueue    0.5  Dirblk        0
                                                Waitqueue   0.0
Disk    Busy%     KBPS     TPS KB-Read KB-Writ                   MEMORY
Total     0.0    164.6    17.0     0.0   164.6  PAGING           Real,MB   21504
                                                Faults    12408  % Comp     86 <---大部分计算内存
FileSystem        KBPS     TPS KB-Read KB-Writ  Steals        0  % Noncomp  13 <---fs cache较少
Total            316.3    17.9  151.5  164.9    PgspIn        0  % Client   13
                                                PgspOut       0
Name            PID  CPU%  PgSp Owner           PageIn        0  PAGING SPACE
oracle      6357252  16.7   8.4 oracle          PageOut      42  Size,MB   32768
oracle     15401474   8.3  70.8 oracle          Sios         42  % Used     35  <---使用比较多
oracle     12714542   8.3   8.3 oracle                           % Free     65
oracle      5767556   8.3   8.3 oracle          NFS (calls/sec)
oracle      5898996   8.3 134.9 oracle          SerV2         0  WPAR Activ    0
oracle     17629634   8.3 134.9 oracle          CliV2         0  WPAR Total    0
oracle     13959694   0.0   8.4 oracle          SerV3         0  Press: "h"-help
oracle      5439860   0.0 134.3 oracle          CliV3         0         "q"-quit

内存参数配置

vmo -F -a
--数据库相关参数
minperm% = 3
v_pinshm = 0
lru_file_repage = 0
maxclient% = 90
maxperm% = 90
strict_maxclient = 1
strict_maxperm = 0
page_steal_method = 1

因为是AIX 6.1,这里的vmo配置基本上是oracle 推荐值(大页没有配置,非必须选项)

会话进程占用内存

procmap 15466998
15466998 : oraclewasudb (LOCAL=NO)
100000000            97466K  read/exec         oracle
11000088d             2430K  read/write        oracle
9fffffff0000000         51K  read/exec         /usr/ccs/bin/usla64
9fffffff000cfe2          0K  read/write        /usr/ccs/bin/usla64
900000000b14930          2K  read/exec         /usr/lib/libC.a[shr3_64.o]
9001000a0122930          0K  read/write        /usr/lib/libC.a[shr3_64.o]
900000000af5b00        118K  read/exec         /usr/lib/libC.a[shrcore_64.o]
9001000a0319100         12K  read/write        /usr/lib/libC.a[shrcore_64.o]
900000000ad7000        118K  read/exec         /usr/lib/libC.a[ansicore_64.o]
9001000a030fe00         36K  read/write        /usr/lib/libC.a[ansicore_64.o]
900000000411468          0K  read/exec         /usr/lib/libicudata.a[shr_64.o]
9001000a0121468          0K  read/write        /usr/lib/libicudata.a[shr_64.o]
90000000040f738          2K  read/exec         /usr/lib/libC.a[shr2_64.o]
9001000a0323738          0K  read/write        /usr/lib/libC.a[shr2_64.o]
9000000008ec800       1699K  read/exec         /usr/lib/libC.a[ansi_64.o]
9001000a0324a00        277K  read/write        /usr/lib/libC.a[ansi_64.o]
9000000008c9b00        135K  read/exec         /usr/lib/libC.a[shr_64.o]
9001000a031db00         19K  read/write        /usr/lib/libC.a[shr_64.o]
900000000708180       1732K  read/exec         /usr/lib/libicuuc.a[shr_64.o]
9001000a036bdac        180K  read/write        /usr/lib/libicuuc.a[shr_64.o]
900000000493d80       2510K  read/exec         /usr/lib/libicui18n.a[shr_64.o]
9001000a0399148        270K  read/write        /usr/lib/libicui18n.a[shr_64.o]
900000000473200         91K  read/exec         /usr/lib/libsrc.a[shr_64.o]
9001000a01127a8         55K  read/write        /usr/lib/libsrc.a[shr_64.o]
90000000045a300         98K  read/exec         /usr/lib/libcorcfg.a[shr_64.o]
9001000a04147c8         18K  read/write        /usr/lib/libcorcfg.a[shr_64.o]
900000000b16200        750K  read/exec         /usr/lib/liblvm.a[shr_64.o]
9001000a03dd028        219K  read/write        /usr/lib/liblvm.a[shr_64.o]
900000000444f00         82K  read/exec         /usr/lib/libcfg.a[shr_64.o]
9001000a027b8f0         26K  read/write        /usr/lib/libcfg.a[shr_64.o]
90000000040e3a0          2K  read/exec         /usr/lib/libcrypt.a[shr_64.o]
9001000a0106948          0K  read/write        /usr/lib/libcrypt.a[shr_64.o]
90000000233c860          5K  read/exec         /usr/lib/libc.a[aio_64.o]
9001000a0437568          0K  read/write        /usr/lib/libc.a[aio_64.o]
9000000003efc00        120K  read/exec         /usr/lib/libodm.a[shr_64.o]
9001000a0107cc8         40K  read/write        /usr/lib/libodm.a[shr_64.o]
900000000bd2c80        147K  read/exec         /usr/lib/libperfstat.a[shr_64.o]
9001000a041a960         14K  read/write        /usr/lib/libperfstat.a[shr_64.o]
900000000bf8000          0K  read/exec         /usr/lib/libdl.a[shr_64.o]
9001000a041f000          0K  read/write        /usr/lib/libdl.a[shr_64.o]
9000000024ac100       8680K  read/exec         /oracle/product/10g/lib/libjox10.a[shr.o]
8001000a0000ca0        588K  read/write        /oracle/product/10g/lib/libjox10.a[shr.o]
900000000a96000        257K  read/exec         /usr/lib/libpthreads.a[shr_xpg5_64.o]
9001000a0283000        559K  read/write        /usr/lib/libpthreads.a[shr_xpg5_64.o]
900000000000800       4025K  read/exec         /usr/lib/libc.a[shr_64.o]
9001000a0000020       1047K  read/write        /usr/lib/libc.a[shr_64.o]
         Total      123902K

在上表中，标记为read/write的内存即是进程的私有内存，每个会话大概占用内存近6M,数据库大概有80多个会话,占用内存大概,占用内存大概500M左右.

数据库参数配置

SQL> select sum(PGA_ALLOC_MEM)/1024/1024/1024,count(*) from v$process;
SUM(PGA_ALLOC_MEM)/1024/1024/1024   COUNT(*)
--------------------------------- ----------
                       2.46758329         84
SQL> show parameter pga;
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
_pga_max_size                        big integer 500M
pga_aggregate_target                 big integer 2000M
SQL> show parameter sga
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
lock_sga                             boolean     FALSE
pre_page_sga                         boolean     FALSE
sga_max_size                         big integer 12000M
sga_target                           big integer 10000M
SQL> show sga;
Total System Global Area 1.2583E+10 bytes
Fixed Size                  2117744 bytes
Variable Size            7600082832 bytes
Database Buffers         4966055936 bytes
Redo Buffers               14655488 bytes

当前系统整体ORACLE使用内存汇总：sga 12G+pga 2.5G+process 0.5G,大概占用内存15G,留给系统内存6G左右,系统使用大量交换分区,导致系统性能下降,最明显的为:shared pool相关latch等待异常,具体awr为:

因为系统因为个别session需要大量内存设置_pga_max_size参数,导致部分会话系统占用2.5g内存,建议设置该参数为默认值,并对个别会话独立设置,设置pga_aggregate_target=1.5G,sga_target=sga_max_size=8.5G,awr结果为:

SQL TUNING导致ORA-07445[qsmmixComputeClusteringFactor()+386]

在11.2.0.2版本中sql tuning可能导致ORA-07445[qsmmixComputeClusteringFactor()+386]错误
系统版本和平台信息

Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
ORACLE_HOME = /oracle/product/11.2.0/dbhome_1
System name:	Linux
Node name:	FPMS01DB
Release:	2.6.18-238.el5
Version:	#1 SMP Sun Dec 19 14:22:44 EST 2010
Machine:	x86_64
VM name:	VMWare Version: 6
Instance name: tis
Redo thread mounted by this instance: 1
Oracle process number: 41
Unix process pid: 16822, image: oracle@FPMS01DB (J003)

alert日志信息

Thu Jan 10 22:00:02 2013
Begin automatic SQL Tuning Advisor run for special tuning task  "SYS_AUTO_SQL_TUNING_TASK"
Thu Jan 10 22:02:39 2013
Exception [type: SIGSEGV, Address not mapped to object] [ADDR:0x4]
[PC:0x597FEB4, qsmmixComputeClusteringFactor()+386] [flags: 0x0, count: 1]
Errors in file /oracle/diag/rdbms/tis/tis/trace/tis_j001_11073.trc  (incident=80033):
ORA-07445: exception encountered: core dump [qsmmixComputeClusteringFactor()+386]
[SIGSEGV] [ADDR:0x4] [PC:0x597FEB4] [Address not mapped to object] []
Incident details in: /oracle/diag/rdbms/tis/tis/incident/incdir_80033/tis_j001_11073_i80033.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Thu Jan 10 22:02:51 2013
Dumping diagnostic data in directory=[cdmp_20130110220251],
requested by (instance=1, osid=11073 (J001)), summary=[incident=80033].

通过这里可以看出来，出现ORA-07445错误的原因很可能和SQL TUNING相关

trace文件信息

Dump continued from file: /oracle/diag/rdbms/tis/tis/trace/tis_j003_16822.trc
ORA-07445: exception encountered: core dump [qsmmixComputeClusteringFactor()+386]
[SIGSEGV] [ADDR:0x4] [PC:0x597FEB4] [Address not mapped to object] []
========= Dump for incident 80008 (ORA 7445 [qsmmixComputeClusteringFactor()+386]) ========
----- Beginning of Customized Incident Dump(s) -----
Exception [type: SIGSEGV, Address not mapped to object] [ADDR:0x4]
[PC:0x597FEB4, qsmmixComputeClusteringFactor()+386] [flags: 0x0, count: 1]
Registers:
%rax: 0x00002b0b684593d0 %rbx: 0x0000000000000003 %rcx: 0x0000000000000000
%rdx: 0x00002b0b684593d0 %rdi: 0x00007fff758ceff8 %rsi: 0x0000000000000000
%rsp: 0x00007fff758ced00 %rbp: 0x00007fff758cf5d0  %r8: 0x00002b0b684593a8
 %r9: 0x00002b0b683c8e60 %r10: 0x0000000000000017 %r11: 0x0000000000000000
%r12: 0x00002b0b684197a0 %r13: 0x00002b0b68419928 %r14: 0x00002b0b66ad2430
%r15: 0x00002b0b66d95590 %rip: 0x000000000597feb4 %efl: 0x0000000000010202
  qsmmixComputeClusteringFactor()+366 (0x597fea0) call 0x597fc62
  qsmmixComputeClusteringFactor()+371 (0x597fea5) lea -0x5d8(%rbp),%rdi
  qsmmixComputeClusteringFactor()+378 (0x597feac) mov 0x60(%rax),%rsi
  qsmmixComputeClusteringFactor()+382 (0x597feb0) mov 0x60(%rax),%rcx
> qsmmixComputeClusteringFactor()+386 (0x597feb4) movzwl 0x4(%rcx),%edx
  qsmmixComputeClusteringFactor()+390 (0x597feb8) add $6,%rsi
  qsmmixComputeClusteringFactor()+394 (0x597febc) mov %rax,-0x20(%rbp)
  qsmmixComputeClusteringFactor()+398 (0x597fec0) call 0xa056f0
  qsmmixComputeClusteringFactor()+403 (0x597fec5) mov -0x8(%rbp),%edx
*** 2013-01-09 22:50:24.205
dbkedDefDump(): Starting a non-incident diagnostic dump (flags=0x3, level=3, mask=0x0)
----- Current SQL Statement for this session (sql_id=durgh9p25f6bb) -----
/* SQL Analyze(788,1) */ SELECT  distinct
			                   null  as isdistribution,
			                   null as seqno,
			                   to_char(T1.SEQNO) as TRADEID
…………
from T_XIFENFEI
----- PL/SQL Stack -----
----- PL/SQL Call Stack -----
  object      line  object
  handle    number  name
0xc5b644e8     11816  package body SYS.DBMS_SQLTUNE_INTERNAL
0xc54ca918         7  SYS.WRI$_ADV_SQLTUNE
0xd55aec98       587  package body SYS.PRVT_ADVISOR
0xd55aec98      2655  package body SYS.PRVT_ADVISOR
0xc5f66c70       241  package body SYS.DBMS_ADVISOR
0xc5418dc8       821  package body SYS.DBMS_SQLTUNE
0xd50d38c0         4  anonymous block
----- Call Stack Trace -----
calling              call     entry                argument values in hex
location             type     point                (? means dubious value)
-------------------- -------- -------------------- ----------------------------
skdstdst()+36        call     kgdsdst()            000000000 ? 000000000 ?
                                                   2B0B65B6C098 ? 000000001 ?
                                                   000000001 ? 000000003 ?
ksedst1()+98         call     skdstdst()           000000000 ? 000000000 ?
                                                   2B0B65B6C098 ? 000000001 ?
                                                   000000000 ? 000000003 ?
ksedst()+34          call     ksedst1()            000000001 ? 000000001 ?
                                                   2B0B65B6C098 ? 000000001 ?
                                                   000000000 ? 000000003 ?
dbkedDefDump()+2741  call     ksedst()             000000001 ? 000000001 ?
                                                   2B0B65B6C098 ? 000000001 ?
                                                   000000000 ? 000000003 ?
ksedmp()+36          call     dbkedDefDump()       000000003 ? 000000003 ?
                                                   2B0B65B6C098 ? 000000001 ?
                                                   000000000 ? 000000003 ?
ssexhd()+2366        call     ksedmp()             000000003 ? 000000003 ?
                                                   2B0B65B6C098 ? 000000001 ?
                                                   000000000 ? 000000003 ?
__sighandler()       call     ssexhd()             00000000B ? 2B0B65B74D70 ?
                                                   2B0B65B74C68 ? 000000001 ?
                                                   000000000 ? 000000003 ?
qsmmixComputeCluste  signal   __sighandler()       7FFF758CEFF8 ? 000000000 ?
ringFactor()+386                                   2B0B684593D0 ? 000000000 ?
                                                   2B0B684593A8 ? 2B0B683C8E60 ?
qsmmixGenFakeIdxSta  call     qsmmixComputeCluste  2B0B68419928 ? 7FFF758CEFF8 ?
ts()+1025                     ringFactor()         2B0B66D95590 ?
                                                   BFF0000000000000 ?
                                                   000000000 ? 2B0B683C8E60 ?
qsmmixPopulateIdxSt  call     qsmmixGenFakeIdxSta  2B0B66AD2430 ? 2B0B68419928 ?
ats()+71                      ts()                 2B0B684197A0 ? 2B0B66B92630 ?
                                                   40BAD30000000000 ?
                                                   2B0B683C8E60 ?
qsmmixSetKkotbixt()  call     qsmmixPopulateIdxSt  2B0B66B92630 ? 2B0B66AD2430 ?
+479                          ats()                2B0B68419928 ? 2B0B66B92630 ?
                                                   40BAD30000000000 ?
                                                   2B0B683C8E60 ?
qsmmixReturnCandToO  call     qsmmixSetKkotbixt()  2B0B66B92630 ? 2B0B66AD2430 ?
pt()+656                                           2B0B683F0090 ? 2B0B6845DCC8 ?
                                                   2B0B6845C2B8 ? 2B0B68462F08 ?
qsmmixOptimizerGenI  call     qsmmixReturnCandToO  2B0B66B92630 ? 2B0B66AD2430 ?
dxCand()+601                  pt()                 2B0B68462F60 ? 2B0B6845DCC8 ?
                                                   2B0B6845C2B8 ? 2B0B68462F08 ?
qsmmixOptimizerSetu  call     qsmmixOptimizerGenI  0913F73A0 ? 2B0B6845C2B8 ?
pIdxCand()+351                dxCand()             000000009 ? 000000001 ?
                                                   2B0B6845C2B8 ? 2B0B68462F08 ?
kkoiqb()+13730       call     qsmmixOptimizerSetu  0913F73A0 ? 2B0B683EEFD8 ?
                              pIdxCand()           000000009 ? 000000001 ?
                                                   2B0B6845C2B8 ? 2B0B68462F08 ?
kkooqb()+632         call     kkoiqb()             2B0B66B6FF88 ? 000000000 ?
                                                   000000000 ? 000000001 ?
                                                   2B0B6845C2B8 ? 2B0B68462F08 ?
kkoqbc()+2359        call     kkooqb()             2B0B66B6FF88 ? 000000000 ?
                                                   000000000 ? 000000004 ?
                                                   2B0B00000000 ? 2B0B00000000 ?
apakkoqb()+166       call     kkoqbc()             7FFF758D2DF0 ? 2B0B66B6FF88 ?
                                                   000000000 ? 000000004 ?
                                                   2B0B00000000 ? 2B0B00000000 ?
apaqbdDescendents()  call     apakkoqb()           7FFF758D2DF0 ? 2B0B66B6FF88 ?
+457                                               0913F73A0 ? 000000004 ?
                                                   2B0B00000000 ? 2B0B00000000 ?
apaqbdList()+71      call     apaqbdDescendents()  7FFF758D2DF0 ? 2B0B66B6FF88 ?
                                                   0913F73A0 ? 000000004 ?
                                                   2B0B00000000 ? 2B0B00000000 ?
apaqbdDescendents()  call     apaqbdList()         7FFF758D2DF0 ? 2B0B66B6FF88 ?
+710                                               0913F73A0 ? 000000004 ?
                                                   2B0B00000000 ? 2B0B00000000 ?
apaqbdList()+71      call     apaqbdDescendents()  0913F73A0 ? 2B0B66C975C8 ?
                                                   0913F73A0 ? 000000004 ?
                                                   2B0B00000000 ? 2B0B00000000 ?
apaqbd()+9           call     apaqbdList()         0913F73A0 ? 2B0B66C975C8 ?
                                                   0913F73A0 ? 000000004 ?
                                                   2B0B00000000 ? 2B0B00000000 ?
apadrv()+861         call     apaqbd()             0913F73A0 ? 2B0B66C975C8 ?
                                                   0913F73A0 ? 000000004 ?
                                                   2B0B00000000 ? 2B0B00000000 ?
opitca()+1971        call     apadrv()             0913F73A0 ? 2B0B66C975C8 ?
                                                   0913F73A0 ? 000000004 ?
                                                   2B0B00000000 ? 2B0B00000000 ?
kksSetBindType()+76  call     opitca()             2B0B671A77F8 ? 0913F73A0 ?
05                                                 7FFF758D4D10 ? 000000004 ?
                                                   100000000 ? 2B0B00000000 ?
kksfbc()+10664       call     kksSetBindType()     7FFF758D4D10 ? 2B0B671A77F8 ?
                                                   7FFF758D4DA0 ? 000000102 ?
                                                   7FFF0000001F ? 000000000 ?
opiexe()+2268        call     kksfbc()             2B0B671A77F8 ? 000000003 ?
                                                   000000102 ? 000000000 ?
                                                   000000000 ? 7FFF758D5D70 ?
kpoal8()+2226        call     opiexe()             000000049 ? 000000003 ?
                                                   7FFF758D6310 ? 000000000 ?
                                                   000000000 ? 7FFF758D5D70 ?
opiodr()+910         call     kpoal8()             00000005E ? 000000000 ?
                                                   7FFF758DA288 ? 000000000 ?
                                                   000000000 ? 000000001 ?
kpoodrc()+31         call     opiodr()             00000005E ? 000000000 ?
                                                   7FFF758DA288 ? 000000000 ?
                                                   00989C970 ? 000000001 ?
rpiswu2()+1618       call     kpoodrc()            7FFF758D75C0 ? 000000000 ?
                                                   7FFF758DA288 ? 000000000 ?
                                                   00989C970 ? 000000001 ?
kpoodr()+617         call     rpiswu2()            0D3C52C80 ? 000000023 ?
                                                   2B0B65EDFAAC ? 000000004 ?
                                                   2B0B6C6B7340 ? 000000023 ?
upirtrc()+2417       call     kpoodr()             2B0B65EBF170 ? 00000005E ?
                                                   7FFF758DA288 ? 000000000 ?
                                                   2B0B6C6B7340 ? 000000023 ?
kpurcsc()+93         call     upirtrc()            2B0B65EBF170 ? 00000005E ?
                                                   7FFF758DA288 ? 7FFF758DA400 ?
                                                   7FFF758DB420 ? 2B0B65C7D308 ?
kpuexec()+10804      call     kpurcsc()            2B0B65EBF170 ? 00000005E ?
                                                   7FFF758DA288 ? 7FFF758DA400 ?
                                                   7FFF758DB420 ? 2B0B65C7D308 ?
OCIStmtExecute()+34  call     kpuexec()            2B0B65E3E5B8 ? 2B0B65E3E538 ?
                                                   7FFF758DA280 ? 000000000 ?
                                                   E0C28C4F00000000 ?
                                                   000000000 ?
qksanExecSql()+743   call     OCIStmtExecute()     2B0B65E3E5B8 ? 2B0B65E3E538 ?
                                                   7FFF758DA280 ? 000000000 ?
                                                   E0C28C4F00000000 ?
                                                   000000000 ?
qksanAnalyzeSql()+2  call     qksanExecSql()       7FFF758DDCA8 ? 2B0B65C7D308 ?
363                                                2B0B65C7D3E8 ? 2B0B65E3E538 ?
                                                   7FFF758DDC28 ? 7FFF758DE3A9 ?
qsmmixProcessQuery(  call     qksanAnalyzeSql()    7FFF758DDCA8 ? 2B0B67E90DA8 ?
)+1089                                             00000B3AB ? 2B0B65FBD218 ?
                                                   100000023 ? 7FFF758DE3A9 ?
qsmmixSqlTuneAnalyz  call     qsmmixProcessQuery(  2B0B65FBD218 ? 000000005 ?
eIdx()+449                    )                    2B0B67E90DA8 ? 2B0B65F37BF4 ?
                                                   2B0B65FBD218 ? 7FFF00000023 ?
kestsiIndexAnalyzeD  call     qsmmixSqlTuneAnalyz  7FFF758DE370 ? 000000005 ?
rv()+794                      eIdx()               2B0B67E90DA8 ? 2B0B65F37BF4 ?
                                                   2B0B65FBD218 ? 7FFF00000023 ?
kestsTuneSqlDrv()+3  call     kestsiIndexAnalyzeD  7FFF758E0548 ? 7FFF758E08C0 ?
83                            rv()                 7FFF758DFC78 ? 7FFF758E09B0 ?
                                                   7FFF758DED50 ? 7FFF758DED58 ?
kesaiExecAction()+9  call     kestsTuneSqlDrv()    7FFF758E0540 ? 7FFF758E08C0 ?
81                                                 7FFF758DFC78 ? 7FFF758E09B0 ?
                                                   7FFF758DED50 ? 7FFF758DED58 ?
kesaiTuneSqlDrv()+6  call     kesaiExecAction()    7FFF758DF420 ? 7FFF758E08C0 ?
258                                                7FFF758E09B0 ? 7FFF758E05E0 ?
                                                   7FFF758E0A68 ? 7FFF758E0A60 ?
spefcifa()+225       call     kesaiTuneSqlDrv()    7FFF758E21B8 ? 2B0B65CA3180 ?
                                                   000000000 ? 7FFF758E05E0 ?
                                                   0C5FB0033 ? 2B0B65C362B8 ?
spefmccallstd()+421  call     spefcifa()           7FFF758E1CE0 ? 000000004 ?
                                                   2B0B65CA3108 ? 7FFF758E0F30 ?
                                                   0C5FB0033 ? 2B0B65C362B8 ?
pextproc()+36        call     spefmccallstd()      7FFF758E20E0 ? 7FFF758E1D88 ?
                                                   7FFF758E1AA0 ? 7FFF758E1CE0 ?
                                                   000000000 ? 2B0B65C362B8 ?
__PGOSF589_peftrust  call     pextproc()           7FFF758E20E0 ? 7FFF758E1D88 ?
ed()+145                                           7FFF758E1AA0 ? 7FFF758E1CE0 ?
                                                   000000000 ? 2B0B65C362B8 ?
__PGOSF633_psdexsp(  call     __PGOSF589_peftrust  7FFF758E20E0 ? 7FFF758E1D88 ?
)+255                         ed()                 7FFF758E1AA0 ? 7FFF758E1CE0 ?
                                                   000000000 ? 2B0B65C362B8 ?
rpiswu2()+1618       call     __PGOSF633_psdexsp(  7FFF758E18D0 ? 7FFF758E20E0 ?
                              )                    7FFF758E1AA0 ? 000020003 ?
                                                   0037500E0 ? 7FFF758E3100 ?
psdextp()+695        call     rpiswu2()            0D3C52C80 ? 000000000 ?
                                                   7FFF758E1910 ? 000000002 ?
                                                   7FFF758E1950 ? 000000000 ?
pefccal()+726        call     psdextp()            7FFF758E3100 ? 7FFF758E1D88 ?
                                                   7FFF758E1AA0 ? 000000000 ?
                                                   000020003 ? 7FFF758E20E0 ?
pefcal()+219         call     pefccal()            7FFF758E20E0 ? 000A324C2 ?
                                                   00B7C8EA0 ? 000000000 ?
                                                   000020003 ? 7FFF758E20E0 ?
pevm_FCAL()+164      call     pefcal()             7FFF758E20E0 ? 00B7C9050 ?
                                                   2B0B65C835B8 ? 000000000 ?
                                                   000020003 ? 7FFF758E20E0 ?
pfrinstr_FCAL()+70   call     pevm_FCAL()          2B0B65C835B8 ? 0A6FB80D8 ?
                                                   2B0B65C835B8 ? 0A6FB8030 ?
                                                   000020003 ? 7FFF758E20E0 ?
pfrrun_no_tool()+63  call     pfrinstr_FCAL()      2B0B65C835B8 ? 0A5405930 ?
                                                   2B0B65C83628 ? 0A6FB8030 ?
                                                   000020003 ? 7FFF758E20E0 ?
pfrrun()+622         call     pfrrun_no_tool()     2B0B65C835B8 ? 0A5405930 ?
                                                   2B0B65C83628 ? 0A6FB8030 ?
                                                   000020003 ? 7FFF758E20E0 ?
plsql_run()+644      call     pfrrun()             2B0B65C835B8 ? 000000000 ?
                                                   2B0B65C83628 ? 7FFF758E3100 ?
                                                   000020003 ? 0928C7098 ?
peicnt()+296         call     plsql_run()          2B0B65C835B8 ? 000000001 ?
                                                   000000000 ? 7FFF758E3100 ?
                                                   000020003 ? 000000000 ?
kkxexe()+521         call     peicnt()             7FFF758E3100 ? 2B0B65C835B8 ?
                                                   2B0B65CA4FD8 ? 7FFF758E3100 ?
                                                   2B0B65CA2F30 ? 000000000 ?
opiexe()+17478       call     kkxexe()             2B0B65C8BD58 ? 2B0B65C835B8 ?
                                                   000000000 ? 7FFF758E3100 ?
                                                   2B0B65CA2F30 ? 000000000 ?
kpoal8()+2226        call     opiexe()             000000049 ? 000000003 ?
                                                   7FFF758E4730 ? 7FFF758E3100 ?
                                                   2B0B65CA2F30 ? 000000000 ?
opiodr()+910         call     kpoal8()             00000005E ? 000000000 ?
                                                   7FFF758E81A8 ? 7FFF758E3100 ?
                                                   2B0B65CA2F30 ? 7FFF00000001 ?
kpoodr()+648         call     opiodr()             00000005E ? 000000000 ?
                                                   7FFF758E81A8 ? 000000000 ?
                                                   00989C970 ? 7FFF00000001 ?
upirtrc()+2417       call     kpoodr()             2B0B65C796A8 ? 00000005E ?
                                                   7FFF758E81A8 ? 000000000 ?
                                                   00989C970 ? 7FFF00000001 ?
kpurcsc()+93         call     upirtrc()            2B0B65C796A8 ? 00000005E ?
                                                   7FFF758E81A8 ? 7FFF758E8320 ?
                                                   7FFF758E9340 ? 2B0B65C7D308 ?
kpuexec()+11692      call     kpurcsc()            2B0B65C796A8 ? 00000005E ?
                                                   7FFF758E81A8 ? 7FFF758E8320 ?
                                                   7FFF758E9340 ? 2B0B65C7D308 ?
OCIStmtExecute()+34  call     kpuexec()            2B0B65C88290 ? 2B0B65C88210 ?
                                                   7FFF758E81A0 ? 000000001 ?
                                                   7FFF00000000 ? 000000000 ?
__PGOSF529_jslvec_e  call     OCIStmtExecute()     2B0B65C88290 ? 2B0B65C88210 ?
xeccb()+2207                                       7FFF758E81A0 ? 000000001 ?
                                                   7FFF00000000 ? 000000000 ?
jslvswu()+54         call     __PGOSF529_jslvec_e  7FFF758EC39C ? 2B0B65C88210 ?
                              xeccb()              2B0B65C88210 ? 000000001 ?
                                                   7FFF00000000 ? 000000000 ?
jslve_execute0()+22  call     jslvswu()            000000000 ? 7FFF00000000 ?
17                                                 000000000 ? 000000001 ?
                                                   7FFF00000000 ? 000000000 ?
jslve_execute()+327  call     jslve_execute0()     7FFF758EE2B4 ? 000005946 ?
                                                   000000002 ? 7FFF758EE2A0 ?
                                                   000000000 ? 0FFFFFFFF ?
rpiswu2()+1618       call     jslve_execute()      7FFF758EE150 ? 000000002 ?
                                                   7FFF758EE2B4 ? 000005946 ?
                                                   7FFF758EE2A0 ? 0FFFFFFFF ?
kkjex1e()+374        call     rpiswu2()            0D3C52C80 ? 000000000 ?
                                                   7FFF758EE170 ? 000000002 ?
                                                   7FFF758EE190 ? 000000000 ?
kkjsexe()+705        call     kkjex1e()            7FFF758EE2B4 ? 000005946 ?
                                                   000000002 ? 7FFF758EE2A0 ?
                                                   0D57A5F08 ? 7FFF758EE208 ?
kkjrdp()+689         call     kkjsexe()            7FFF758EE2B4 ? 000005946 ?
                                                   000000001 ? 7FFF758EE2A0 ?
                                                   0D57A5F08 ? 7FFF758EE208 ?
opirip()+953         call     kkjrdp()             7FFF758EE2B4 ? 000005946 ?
                                                   000000001 ? 7FFF758EE2A0 ?
                                                   0D57A5F08 ? 7FFF758EE208 ?
opidrv()+598         call     opirip()             000000032 ? 000000004 ?
                                                   7FFF758EFA28 ? 7FFF758EE2A0 ?
                                                   0D57A5F08 ? 7FFF758EE208 ?
sou2o()+98           call     opidrv()             000000032 ? 000000004 ?
                                                   7FFF758EFA28 ? 7FFF758EE2A0 ?
                                                   0D57A5F08 ? 7FFF758EE208 ?
opimai_real()+261    call     sou2o()              7FFF758EFA00 ? 000000032 ?
                                                   000000004 ? 7FFF758EFA28 ?
                                                   0D57A5F08 ? 7FFF758EE208 ?
ssthrdmain()+252     call     opimai_real()        000000000 ? 7FFF758EFBF0 ?
                                                   000000004 ? 7FFF758EFA28 ?
                                                   0D57A5F08 ? 7FFF758EE208 ?
main()+196           call     ssthrdmain()         000000003 ? 7FFF758EFBF0 ?
                                                   000000001 ? 000000000 ?
                                                   0D57A5F08 ? 7FFF758EE208 ?
__libc_start_main()  call     main()               000000003 ? 7FFF758EFD90 ?
+244                                               000000001 ? 000000000 ?
                                                   0D57A5F08 ? 7FFF758EE208 ?
_start()+36          call     __libc_start_main()  000A077C8 ? 000000001 ?
                                                   7FFF758EFD88 ? 000000000 ?
                                                   0D57A5F08 ? 000000003 ?
--------------------- Binary Stack Dump ---------------------

分析Stack可以发现，他们和MOS[]中的非常类此,可以断定是该Bug 9746210
skdstdst <- ksedst1 <- ksedst <- dbkedDefDump <- ksedmp<- ssexhd <- sighandler <- qsmmixComputeClusteringFactor <- qsmmixGenFakeIdxStats <- qsmmixPopulateIdxStats <- qsmmixSetKkotbixtqsmmixReturnCandToOpt <- qsmmixOptimizerGenIdxCand <- qsmmixOptimizerSetupIdxCandResponse
处理建议

The is fixed in following versions:
 12.1 (Future Release)
 11.2.0.3 (Server Patch Set)
 11.2.0.2.4 Patch Set Update
 11.2.0.2 Bundle Patch 12 for Exadata Database
 11.2.0.2 Patch 11 on Windows Platforms
To resolve the issue:
Either
Upgrade to the over versions
Or
Apply Patch 9746210

补充说明
对于SQL TUNING功能,在觉得多少人的环境中都不需要这个,可以考虑禁用该功能来屏蔽该错误

BEGIN
dbms_auto_task_admin.disable(
    client_name => 'sql tuning advisor',
    operation   => NULL,
    window_name => NULL);
END;

参考文档
SQL Tuning query fails with ORA-7445 [qsmmixComputeClusteringFactor] [ID 1483654.1]
Error ORA-07445 Qsmmixcomputeclusteringfactor From SQL Tuning [ID 1359148.1]
Bug 9746210 – ORA-7445 [qsmmixComputeClusteringFactor] from SQL tuning [ID 9746210.8]

重建控制文件引发ORA-00218故障

遇到一个案例在数据库启动的时候报ORA-00218错误,而这个故障的引起原因是因为重建控制文件的时候,有一个控制文件无法创建,而导致了原有的控制文件被破坏,提醒:创建控制文件之前,最好对原有控制文件进行备份
数据库启动报ORA-00218错误

SQL> startup
ORACLE instance started.
Total System Global Area  285212672 bytes
Fixed Size                  2020224 bytes
Variable Size              92277888 bytes
Database Buffers          188743680 bytes
Redo Buffers                2170880 bytes
ORA-00218: block size 0 of control file
'/u01/app/oracle/oradata/zxy/control01.ctl' does not match DB_BLOCK_SIZE (0)

分析ORA-00218错误

Oracle10g Release 1 Message
~~~~~~~~~~~~~~~~~~~~~~~~~~~
Error:	  ORA-00218  (ORA-218)
Text:	  block size %s of controlfile '%s' does not match DB_BLOCK_SIZE
	  (%s)
---------------------------------------------------------------------------
Cause:	The block size as stored in the controlfile header is different
	from the value of the initialization parameter DB_BLOCK_SIZE. This
	might be due to an incorrect setting of DB_BLOCK_SIZE, or else
	might indicate that the controlfile has either been corrupted or
	belongs to a different database.
Action:	Restore a good copy of the controlfile. If the controlfile is
	known to be clean set the DB_BLOCK_SIZE to match controlfile
	headers block size value.
Oracle 9.2 or Earlier Error Message
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Error:  ORA 218
Text:   control file <name> was created with block size <num> now is <num>
-------------------------------------------------------------------------------
Cause:  The physical block size, stored in the control file header, was
        different in physical block size returned by the O/S.
        This usually indicates that the control file was corrupted.
Action: Restore a good copy of the control file.
        For more information about control files and recovery, see the index
        entries on "control files," "control files, backing up," "control
        files, recovery and" in <Oracle7 Server Concepts>.

通过这里可以知道,很可能是控制文件header的db_block_size和参数文件中的db_block_size的大小不一致,从而导致了该问题,而从启动数据库的错误提示上看,是控制文件的block size 为0.

分析控制文件

--dbv检查控制文件
[oracle@zxy bdump]$ dbv file='/u01/app/oracle/oradata/zxy/control01.ctl' blocksize=16384
DBVERIFY: Release 10.2.0.1.0 - Production on Sun Jan 6 23:39:32 2013
Copyright (c) 1982, 2005, Oracle.  All rights reserved.
DBVERIFY - Verification starting : FILE = /u01/app/oracle/oradata/zxy/control01.ctl
DBVERIFY - Verification complete
Total Pages Examined         : 450
Total Pages Processed (Data) : 0
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 0
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 0
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 450
Total Pages Marked Corrupt   : 0
Total Pages Influx           : 0
Highest block SCN            : 0 (0.0)
--strings读控制文件
[oracle@zxy bdump]$ strings /u01/app/oracle/oradata/zxy/control01.ctl
}|{z
--正常库dbv检查控制文件
E:\oracle\oradata\xifenfei>dbv file=CONTROL01.CTL blocksize=16384
DBVERIFY: Release 11.2.0.3.0 - Production on 星期一 1月 7 10:26:46 2013
Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
DBVERIFY - 开始验证: FILE = E:\ORACLE\ORADATA\XIFENFEI\CONTROL01.CTL
DBVERIFY - 验证完成
检查的页总数: 600
处理的页总数 (数据): 0
失败的页总数 (数据): 0
处理的页总数 (索引): 0
失败的页总数 (索引): 0
处理的页总数 (其他): 65
处理的总页数 (段)  : 0
失败的总页数 (段)  : 0
空的页总数: 535
标记为损坏的总页数: 0
流入的页总数: 0
加密的总页数        : 0
最高块 SCN            : 39198 (65535.39198)

检查参数文件db_block_size

SQL> show parameter db_block_size
TYPE        VALUE
------------------------------------ ----------- ------------------------------
db_block_size                        integer     8192

通过分析我们知道spfile中的db_block_size是正确的,而控制文件通过dbv和strings检测均为空值,证明是控制文件异常导致该问题,对于该问题可以通过重建控制文件或者还原备份控制文件来解决问题.

分析问题原因

--参数文件配置
control_files            = /u01/app/oracle/oradata/zxy/control01.ctl, /tmp/oradata/control04.ctl
--alert日志
Sun Jan  6 21:42:50 2013
CREATE CONTROLFILE REUSE DATABASE "ZXY" RESETLOGS  NOARCHIVELOG
 …………
CHARACTER SET AL32UTF8
Sun Jan  6 21:42:50 2013
WARNING: Default Temporary Tablespace not specified in CREATE DATABASE command
Default Temporary Tablespace will be necessary for a locally managed database in future release
Sun Jan  6 21:42:53 2013
Errors in file /u01/app/oracle/admin/zxy/udump/zxy_ora_3898.trc:
ORA-00200: control file could not be created
ORA-00202: control file: '/tmp/oradata/control04.ctl'
ORA-27040: file create error, unable to create file
Linux-x86_64 Error: 2: No such file or directory
--再次启动
control_files            = /u01/app/oracle/oradata/zxy/control01.ctl
--日志
ALTER DATABASE   MOUNT
Sun Jan  6 21:56:31 2013
ORA-00218: block size 0 of control file
'/u01/app/oracle/oradata/zxy/control01.ctl' does not match DB_BLOCK_SIZE (0)
Sun Jan  6 21:56:31 2013
ORA-218 signalled during: ALTER DATABASE   MOUNT...

创建控制文件,因为/tmp/oradata/目录不存在或者没有权限导致创建控制文件失败,而导致原来有的控制文件也失败

故障重现

--正常启动
control_files='/u01/oracle/oradata/XFF/control01.ctl'
SQL> startup pfile=/tmp/pfile
ORACLE instance started.
Total System Global Area  306184192 bytes
Fixed Size                  1267164 bytes
Variable Size             109054500 bytes
Database Buffers          188743680 bytes
Redo Buffers                7118848 bytes
Database mounted.
Database opened.
SQL> alter database backup controlfile to trace as '/tmp/ctl';
Database altered.
--尝试重建控制文件
control_files='/u01/oracle/oradata/XFF/control01.ctl','/tmp/xifenfei/con.ctl'
SQL> CREATE CONTROLFILE REUSE DATABASE "XFF" NORESETLOGS  ARCHIVELOG
  2      MAXLOGFILES 16
  3      MAXLOGMEMBERS 3
  4      MAXDATAFILES 100
  5      MAXINSTANCES 8
  6      MAXLOGHISTORY 292
  7  LOGFILE
  8    GROUP 1 '/u01/oracle/oradata/XFF/redo01.log'  SIZE 50M,
  9    GROUP 2 '/u01/oracle/oradata/XFF/redo02.log'  SIZE 50M,
 10    GROUP 3 '/u01/oracle/oradata/XFF/redo03.log'  SIZE 50M
 11  DATAFILE
 12    '/u01/oracle/oradata/XFF/system01.dbf',
 13    '/u01/oracle/oradata/XFF/undotbs01.dbf',
 14    '/u01/oracle/oradata/XFF/sysaux01.dbf',
 15    '/u01/oracle/oradata/XFF/users01.dbf',
 16    '/u01/oracle/oradata/XFF/xifenfei01.dbf',
 17    '/u01/oracle/oradata/XFF/users03.dbf'
 18  CHARACTER SET ZHS16GBK
 19  ;
CREATE CONTROLFILE REUSE DATABASE "XFF" NORESETLOGS  ARCHIVELOG
*
ERROR at line 1:
ORA-01503: CREATE CONTROLFILE failed
ORA-00200: control file could not be created
ORA-00202: control file: '/tmp/xifenfei/con.ctl'
ORA-27040: file create error, unable to create file
Linux Error: 2: No such file or directory
--使用原控制文件启动库
control_files='/u01/oracle/oradata/XFF/control01.ctl'
SQL> shutdown abort
ORACLE instance shut down.
SQL> startup pfile='/tmp/pfile'
ORACLE instance started.
Total System Global Area  306184192 bytes
Fixed Size                  1267164 bytes
Variable Size             109054500 bytes
Database Buffers          188743680 bytes
Redo Buffers                7118848 bytes
ORA-00218: block size 0 of control file '/u01/oracle/oradata/XFF/control01.ctl'
does not match DB_BLOCK_SIZE (0)

补充参数文件中DB_BLOCK_SIZE不正确导致后果

DB_BLOCK_SIZE从8192修改为16384
SQL> startup pfile='/tmp/pfile'
ORACLE instance started.
Total System Global Area  306184192 bytes
Fixed Size                  1267164 bytes
Variable Size             109054500 bytes
Database Buffers          188743680 bytes
Redo Buffers                7118848 bytes
ORA-00058: DB_BLOCK_SIZE must be 8192 to mount this database (not 16384)

ORACLE 12C varchar2支持32k长度字符串

在Oracle的以前版本中如果要存储超过4000byte的字符串需要使用clob字段,而lob本身操作就麻烦,而且效率不高。从12C开始Oracle提供了 VARCHAR2, NVARCHAR2, and RAW支持32k长度在字符串,大大提高了Oracle程序在处理4000到32k的字符串的处理效率.
数据库版本

SQL> select * from v$version;
BANNER                                                                               CON_ID
-------------------------------------------------------------------------------- ----------
Oracle Database 12c Enterprise Edition Release 12.1.0.0.2 - 64bit                         0
PL/SQL Release 12.1.0.0.2                                                                 0
CORE    12.1.0.0.2                                                                        0
TNS for Linux: Version 12.1.0.0.2                                                         0
NLSRTL Version 12.1.0.0.2                                                                 0

max_sql_string_size参数

SQL> show parameter max_sql_string_size;
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
max_sql_string_size                  string      LEGACY
max_sql_string_size:controls maximum size of VARCHAR2, NVARCHAR2, and RAW types in SQL
max_sql_string_size, must be from among EXTENDED, LEGACY

创建测试表

SQL> create table t_xifenfei(id number,name varchar2(4001));
create table t_xifenfei(id number,name varchar2(4001))
                                                *
ERROR at line 1:
ORA-00910: specified length too long for its datatype
SQL> create table t_xifenfei(id number,name varchar2(4000));
Table created.
SQL> insert into t_xifenfei values(1,rpad('www.xifenfei.com',4000,0));
1 row created.
SQL> insert into t_xifenfei values(2,lpad('www.xifenfei.com',4009,0));
1 row created.
SQL> commit;
Commit complete.
SQL> select id,length(name) from t_xifenfei;
        ID LENGTH(NAME)
---------- ------------
         1         4000
         2         4000
SQL> select id,substr(name,-10,10) from t_xifenfei;
        ID SUBSTR(NAME,-10,10)
---------- ----------------------------------------
         1 0000000000
         2 fenfei.com
SQL>  select id,substr(name,3990) from t_xifenfei;
        ID SUBSTR(NAME,3990)
---------- --------------------------------------------
         1 00000000000
         2 ifenfei.com

测试说明几点：
1.默认情况下varchar2长度不能超过4000
2.插入varchar2超过4000的字段(列长度为4000),自动被截断

修改max_sql_string_size参数

SQL> alter system set max_sql_string_size='EXTENDED';
alter system set max_sql_string_size='EXTENDED'
*
ERROR at line 1:
ORA-02097: parameter cannot be modified because specified value is invalid
ORA-02096: specified initialization parameter is not modifiable with this option
SQL> alter system set max_sql_string_size='EXTENDED' scope=spfile;
System altered.

执行utl32k.sql脚本

SQL> shutdown immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup upgrade;
ORACLE instance started.
Total System Global Area  313159680 bytes
Fixed Size                  2259912 bytes
Variable Size             243270712 bytes
Database Buffers           62914560 bytes
Redo Buffers                4714496 bytes
Database mounted.
Database opened.
SQL> @?/rdbms/admin/utl32k.sql
--包含编译无效对象
QL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup
ORACLE instance started.
Total System Global Area  313159680 bytes
Fixed Size                  2259912 bytes
Variable Size             251659320 bytes
Database Buffers           54525952 bytes
Redo Buffers                4714496 bytes
Database mounted.
Database opened.
SQL> show parameter max_sql_string_size;
NAME                                 TYPE        VALUE
------------------------------------ ----------- ----------------
max_sql_string_size                  string      EXTENDED

测试varchar2(32767)

SQL> conn xff/xifenfei
Connected.
SQL> create table t_xifenfei_1(id number,name varchar2(32768));
create table t_xifenfei_1(id number,name varchar2(32768))
                                                  *
ERROR at line 1:
ORA-00910: specified length too long for its datatype
SQL> create table t_xifenfei_extend(id number,name varchar2(32767));
Table created.
SQL> insert into t_xifenfei_extend values(3,lpad('www.xifenfei.com',32767,0));
1 row created.
SQL> commit;
Commit complete.
SQL> select id,substr(name,-10,10) from t_xifenfei_extend;
        ID SUBSTR(NAME,-10,10)
---------- ----------------------------------------
         3 fenfei.com
SQL> select id,substr(name,32760) from t_xifenfei_extend;
        ID SUBSTR(NAME,32760)
---------- --------------------------------
         3 nfei.com
SQL> select id,length(name) from t_xifenfei_extend;
        ID LENGTH(NAME)
---------- ------------
         3        32767

Oracle 12C支持字符串32K处理过程
1.修改max_sql_string_size=’EXTENDED’
2.重启数据库至upgrade状态
3.执行@?/rdbms/admin/utl32k.sql
4.重启数据库至正常open状态

dcli完成多节点对等ssh配置

dcli是Python脚本,可以实现在多节点中的之间非交换命令的执行,因为dcli执行需要通过ssh来实现,所以dcli也提供了ssh配置功能.本文通过dcli来完成多节点间的对等ssh配置
dcli整体描述

The dcli utility runs commands on multiple cells in parallel threads.
However, it does not support an interactive session with a remote application on a cell.
To use the dcli utility, copy the utility from the bin directory on a cell to
a host computer from which central management can be performed.
You can issue a command to be run on multiple cells, or use files that can be copied to cells and then run.
The cells are referenced by their domain name or IP address.
The dcli utility requires Python version 2.3 or later.
You can determine the version of Python by running the python -V command.
In addition, use of this tool assumes prior setup of SSH user-equivalence to a cell.
You can use the dcli utility initially with the -k option to set up SSH user-equivalence to a cell.

dcli使用说明

[xifenfei@db1 ~]$ dcli
Error: No command specified.
usage: dcli [options] [command]
options:
  --version           show program's version number and exit
  -c CELLS            comma-separated list of cells
  -d DESTFILE         destination directory or file
  -f FILE             file to be copied
  -g GROUPFILE        file containing list of cells
  -h, --help          show help message and exit
  -k                  push ssh key to cell's authorized_keys file
  -l USERID           user to login as on remote cells (default: celladmin)
  -n                  abbreviate non-error output
  -r REGEXP           abbreviate output lines matching a regular expression
  -s SSHOPTIONS       string of options passed through to ssh
  --scp=SCPOPTIONS    string of options passed through to scp if different
                      from sshoptions
  -t                  list target cells
  -v                  print extra messages to stdout
  --vmstat=VMSTATOPS  vmstat command options
  -x EXECFILE         file to be copied and executed

服务器相关ip配置

[xifenfei@db1 ~]$ more xifenfei.txt
192.168.30.10
192.168.30.20
192.168.30.30
[xifenfei@db1 ~]$ more /etc/hosts
127.0.0.1 localhost.localdomain localhost
192.168.30.30   db1
192.168.30.10   cell1
192.168.30.20   cell2

操作系统用户
说明:dcli配置对等ssh不需要uid完全一样,不需要用户密码完全一样,因为是双向对等,需要用户名一致

[root@cell2 ~]# id xifenfei
uid=8001(xifenfei) gid=8001(xifenfei) groups=8001(xifenfei)
[root@cell1 ~]# id xifenfei
uid=8001(xifenfei) gid=8001(xifenfei) groups=8001(xifenfei)
[root@db1 ~]# id xifenfei
uid=8001(xifenfei) gid=8001(xifenfei) groups=8001(xifenfei)

db1节点配置ssh

[xifenfei@db1 ~]$ ssh-keygen -t dsa
Generating public/private dsa key pair.
Enter file in which to save the key (/home/xifenfei/.ssh/id_dsa):
Created directory '/home/xifenfei/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/xifenfei/.ssh/id_dsa.
Your public key has been saved in /home/xifenfei/.ssh/id_dsa.pub.
The key fingerprint is:
63:95:13:ba:4a:4c:13:93:67:7f:4f:e8:18:13:3c:4f xifenfei@db1
[xifenfei@db1 ~]$ dcli -k -g xifenfei.txt -l xifenfei
The authenticity of host '192.168.30.10 (192.168.30.10)' can't be established.
RSA key fingerprint is 1b:b6:91:11:58:89:b1:6a:c6:eb:72:df:68:d4:dd:5b.
Are you sure you want to continue connecting (yes/no)? yes
xifenfei@192.168.30.10's password:
The authenticity of host '192.168.30.20 (192.168.30.20)' can't be established.
RSA key fingerprint is 1b:b6:91:11:58:89:b1:6a:c6:eb:72:df:68:d4:dd:5b.
Are you sure you want to continue connecting (yes/no)? yes
xifenfei@192.168.30.20's password:
The authenticity of host '192.168.30.30 (192.168.30.30)' can't be established.
RSA key fingerprint is 54:ea:84:ae:38:24:07:31:9f:dd:8a:8b:4b:c2:a8:fe.
Are you sure you want to continue connecting (yes/no)? yes
xifenfei@192.168.30.30's password:
192.168.30.10: Warning: Permanently added '192.168.30.10' (RSA) to the list of known hosts.
192.168.30.10: ssh key added
192.168.30.20: Warning: Permanently added '192.168.30.20' (RSA) to the list of known hosts.
192.168.30.20: ssh key added
192.168.30.30: Warning: Permanently added '192.168.30.30' (RSA) to the list of known hosts.
192.168.30.30: ssh key added
[xifenfei@db1 ~]$ for host in `cat xifenfei.txt`
> do
> scp /home/xifenfei/.ssh/id_dsa.pub ${host}:/home/xifenfei/.ssh/authorized_keys
> done
id_dsa.pub                                                                      100%  602     0.6KB/s   00:00
id_dsa.pub                                                                      100%  602     0.6KB/s   00:00
id_dsa.pub                                                                      100%  602     0.6KB/s   00:00
[xifenfei@db1 ~]$ dcli -g xifenfei.txt -l xifenfei "chmod -R 700 /home/xifenfei/.ssh"
[xifenfei@db1 ~]$ dcli -g xifenfei.txt -l xifenfei "chown -R xifenfei /home/xifenfei/.ssh"
[xifenfei@db1 ~]$ ssh 192.168.30.30
Last login: Tue Dec 25 07:45:17 2012 from 192.168.30.30
[xifenfei@db1 ~]$ ssh 192.168.30.20
Last login: Tue Dec 25 19:17:30 2012 from 192.168.30.10
[xifenfei@db1 ~]$ ssh 192.168.30.10
Last login: Tue Dec 25 20:17:20 2012 from 192.168.30.20
--ssh为单向，正向可以ssh成功，逆向需要输入密码
[xifenfei@db1 ~]$ ssh 192.168.30.10
xifenfei@192.168.30.10's password:

拷贝ip文件到其他节点

[xifenfei@db1 ~]$ for host in `cat xifenfei.txt`
> do
> scp /home/xifenfei/xifenfei.txt ${host}:/home/xifenfei/xifenfei.txt
> done
xifenfei.txt                                                                    100%   42     0.0KB/s   00:00
xifenfei.txt                                                                    100%   42     0.0KB/s   00:00
scp: /home/xifenfei/xifenfei.txt: Permission denied   --自身节点不能拷贝

cell1节点配置

[xifenfei@cell1 ~]$ ssh-keygen -t dsa
Generating public/private dsa key pair.
Enter file in which to save the key (/home/xifenfei/.ssh/id_dsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/xifenfei/.ssh/id_dsa.
Your public key has been saved in /home/xifenfei/.ssh/id_dsa.pub.
The key fingerprint is:
f2:b6:88:5c:c6:97:5e:38:c2:df:f1:58:49:8a:8d:90 xifenfei@cell1
[xifenfei@cell1 ~]$  dcli -k -g xifenfei.txt -l xifenfei
The authenticity of host '192.168.30.10 (192.168.30.10)' can't be established.
RSA key fingerprint is 1b:b6:91:11:58:89:b1:6a:c6:eb:72:df:68:d4:dd:5b.
Are you sure you want to continue connecting (yes/no)? yes
xifenfei@192.168.30.10's password:
xifenfei@192.168.30.20's password:
xifenfei@192.168.30.30's password:
192.168.30.10: Warning: Permanently added '192.168.30.10' (RSA) to the list of known hosts.
192.168.30.10: ssh key added
192.168.30.20: ssh key added
192.168.30.30: ssh key added
[xifenfei@cell1 ~]$ ssh 192.168.30.30
Last login: Tue Dec 25 07:48:24 2012 from 192.168.30.30
--cell1 to cell2 正向成功,逆向失败
[xifenfei@cell1 ~]$ ssh 192.168.30.20
Last login: Tue Dec 25 19:23:42 2012 from 192.168.30.30
[xifenfei@cell2 ~]$ ssh 192.168.30.10
xifenfei@192.168.30.10's password:
--cell1和db1正逆向均可以ssh
[xifenfei@cell1 ~]$ ssh 192.168.30.30
Last login: Tue Dec 25 20:24:15 2012 from 192.168.30.30
[xifenfei@db1 ~]$ ssh 192.168.30.10
Last login: Tue Dec 25 20:27:27 2012 from 192.168.30.10

cell2节点配置

[xifenfei@cell2 ~]$ ssh-keygen -t dsa
Generating public/private dsa key pair.
Enter file in which to save the key (/home/xifenfei/.ssh/id_dsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/xifenfei/.ssh/id_dsa.
Your public key has been saved in /home/xifenfei/.ssh/id_dsa.pub.
The key fingerprint is:
87:80:02:e1:27:b8:d0:af:c0:5f:e0:f3:5e:95:29:cb xifenfei@cell2
[xifenfei@cell2 ~]$ dcli -k -g xifenfei.txt -l xifenfei
The authenticity of host '192.168.30.10 (192.168.30.10)' can't be established.
RSA key fingerprint is 1b:b6:91:11:58:89:b1:6a:c6:eb:72:df:68:d4:dd:5b.
Are you sure you want to continue connecting (yes/no)? yes
xifenfei@192.168.30.10's password:
The authenticity of host '192.168.30.20 (192.168.30.20)' can't be established.
RSA key fingerprint is 1b:b6:91:11:58:89:b1:6a:c6:eb:72:df:68:d4:dd:5b.
Are you sure you want to continue connecting (yes/no)? yes
xifenfei@192.168.30.20's password:
The authenticity of host '192.168.30.30 (192.168.30.30)' can't be established.
RSA key fingerprint is 54:ea:84:ae:38:24:07:31:9f:dd:8a:8b:4b:c2:a8:fe.
Are you sure you want to continue connecting (yes/no)? yes
xifenfei@192.168.30.30's password:
192.168.30.10: Warning: Permanently added '192.168.30.10' (RSA) to the list of known hosts.
192.168.30.10: ssh key added
192.168.30.20: Warning: Permanently added '192.168.30.20' (RSA) to the list of known hosts.
192.168.30.20: ssh key added
192.168.30.30: Warning: Permanently added '192.168.30.30' (RSA) to the list of known hosts.
192.168.30.30: ssh key added
--ssh测试
[xifenfei@cell2 ~]$ ssh 192.168.30.10
Last login: Tue Dec 25 20:11:02 2012 from 192.168.30.30
[xifenfei@cell2 ~]$ ssh 192.168.30.30
Last login: Tue Dec 25 07:53:27 2012 from cell1
[xifenfei@cell2 ~]$ ssh 192.168.30.20
Last login: Tue Dec 25 19:30:16 2012 from 192.168.30.10

ssh等效性测试汇总

--db1节点
[xifenfei@db1 ~]$ ssh 192.168.30.10
Last login: Tue Dec 25 20:30:24 2012 from 192.168.30.20
[xifenfei@db1 ~]$ ssh 192.168.30.20
Last login: Tue Dec 25 19:33:07 2012 from 192.168.30.20
[xifenfei@db1 ~]$ ssh 192.168.30.30
Last login: Tue Dec 25 07:57:56 2012 from cell2
--cell1节点
[xifenfei@cell1 ~]$ ssh 192.168.30.20
Last login: Tue Dec 25 19:34:05 2012 from 192.168.30.30
[xifenfei@cell1 ~]$ ssh 192.168.30.30
Last login: Tue Dec 25 07:59:29 2012 from 192.168.30.30
[xifenfei@cell1 ~]$ ssh 192.168.30.10
Last login: Tue Dec 25 20:33:59 2012 from 192.168.30.30
--cell2节点
[xifenfei@cell2 ~]$ ssh 192.168.30.10
Last login: Tue Dec 25 20:35:42 2012 from 192.168.30.10
[xifenfei@cell2 ~]$ ssh 192.168.30.30
Last login: Tue Dec 25 08:00:56 2012 from cell1
[xifenfei@cell2 ~]$ ssh 192.168.30.20
Last login: Tue Dec 25 19:35:31 2012 from 192.168.30.10

到此证明三个节点之间的xifenfei用户的ssh等效配置完成,实现使用dcli完成多节点ssh等效配置

整体处理思路总结

1.确定需要配置ssh用户
--第一节点
2.编辑需要配置ssh等效连接ip列表
3.ssh-keygen -t dsa
4.dcli -k -g xifenfei.txt -l xifenfei
5.
for host in `cat xifenfei.txt`
do
scp /home/xifenfei/.ssh/id_dsa.pub ${host}:/home/xifenfei/.ssh/authorized_keys
done
6.dcli -g xifenfei.txt -l xifenfei "chmod -R 700 /home/xifenfei/.ssh"
7.dcli -g xifenfei.txt -l xifenfei "chown -R xifenfei /home/xifenfei/.ssh"
8.
for host in `cat xifenfei.txt`
 do
 scp /home/xifenfei/xifenfei.txt ${host}:/home/xifenfei/xifenfei.txt
 done
--其他节点
9. ssh-keygen -t dsa
10. dcli -k -g xifenfei.txt -l xifenfei

cellcli命令简介

cellcli用途描述

The CellCLI utility is the command-line administration tool for Exadata Cell.
CellCLI runs on each cell to enable you to manage an individual cell.
You use CellCLI to start and stop the cell, to manage cell configuration information,
to enable or disable cells, and to manage objects in the cell environment.
The command-line utility is already installed when Exadata Cell is shipped.

cellcli语法

cellcli [port_number] [-n] [-m] [-xml] [-v | -vv | -vvv] [-x] [-e command]

cellcli登录

[root@cell2 ~]# cellcli
CellCLI: Release 11.2.1.2.0 - Production on Tue Dec 25 16:07:01 PST 2012
Copyright (c) 2007, 2009, Oracle.  All rights reserved.
Cell Efficiency Ratio: 24M
CellCLI>

cellcli help

CellCLI> help
 HELP [topic]
   Available Topics:
        ALTER
        ALTER ALERTHISTORY
        ALTER CELL
        ALTER CELLDISK
        ALTER GRIDDISK
        ALTER IORMPLAN
        ALTER LUN
        ALTER THRESHOLD
        ASSIGN KEY
        CALIBRATE
        CREATE
        CREATE CELL
        CREATE CELLDISK
        CREATE FLASHCACHE
        CREATE GRIDDISK
        CREATE KEY
        CREATE THRESHOLD
        DESCRIBE
        DROP
        DROP ALERTHISTORY
        DROP CELL
        DROP CELLDISK
        DROP FLASHCACHE
        DROP GRIDDISK
        DROP THRESHOLD
        EXPORT CELLDISK
        IMPORT CELLDISK
        LIST
        LIST ACTIVEREQUEST
        LIST ALERTDEFINITION
        LIST ALERTHISTORY
        LIST CELL
        LIST CELLDISK
        LIST FLASHCACHE
        LIST FLASHCACHECONTENT
        LIST GRIDDISK
        LIST IORMPLAN
        LIST KEY
        LIST LUN
        LIST METRICCURRENT
        LIST METRICDEFINITION
        LIST METRICHISTORY
        LIST PHYSICALDISK
        LIST THRESHOLD
        SET
        SPOOL
        START
CellCLI> help list
  Enter HELP LIST <object_type> for specific help syntax.
    <object_type>:  {ACTIVEREQUEST | ALERTHISTORY | ALERTDEFINITION | CELL
                     | CELLDISK | FASHCACHE | FLASHCACHECONTENT | GRIDDISK
                     | IORMPLAN | KEY | LUN
                     | METRICCURRENT | METRICDEFINITION | METRICHISTORY
                     | PHYSICALDISK | THRESHOLD }
CellCLI> help list PHYSICALDISK
  Usage: LIST PHYSICALDISK [<name> | <filters>] [<attribute_list>] [DETAIL]
  Purpose: Displays specified attributes for physical disks.
  Arguments:
    <name>:  The name of the physical disk to be displayed.
    <filters>:  an expression which determines which physical disks should
                be displayed.
    <attribute_list>: The attributes that are to be displayed.
                      ATTRIBUTES {ALL | attr1 [, attr2]... }
  Options:
    [DETAIL]: Formats the display as an attribute on each line, with
    an attribute descriptor preceding each value.
  Examples:
    LIST PHYSICALDISK DETAIL

cellcli describe

CellCLI> DESCRIBE PHYSICALDISK
        name
        ctrlFirmware
        ctrlHwVersion
        deviceId
        diskType
        enclosureDeviceId
        errCmdTimeoutCount
        errHardReadCount
        errHardWriteCount
        errMediaCount
        errOtherCount
        errSeekCount
        errorCount
        foreignState
        hotPlugCount
        id
        lastFailureReason
        luns
        makeModel
        notPresentSince
        physicalFirmware
        physicalInsertTime
        physicalInterface
        physicalPort
        physicalRPM
        physicalSerial
        physicalSize
        physicalUseType
        sectorRemapCount
        slotNumber
        status

cellcli list

CellCLI> LIST PHYSICALDISK attributes name ,physicalSize
         /opt/oracle/cell11.2.1.2.0_LINUX_091102/disks/raw/FLASH01       450M
         /opt/oracle/cell11.2.1.2.0_LINUX_091102/disks/raw/FLASH02       450M
         /opt/oracle/cell11.2.1.2.0_LINUX_091102/disks/raw/FLASH03       450M
         /opt/oracle/cell11.2.1.2.0_LINUX_091102/disks/raw/FLASH04       450M
         /opt/oracle/cell11.2.1.2.0_LINUX_091102/disks/raw/disk01        450M
         /opt/oracle/cell11.2.1.2.0_LINUX_091102/disks/raw/disk02        450M
         /opt/oracle/cell11.2.1.2.0_LINUX_091102/disks/raw/disk03        450M
         /opt/oracle/cell11.2.1.2.0_LINUX_091102/disks/raw/disk04        450M
         /opt/oracle/cell11.2.1.2.0_LINUX_091102/disks/raw/disk05        450M
         /opt/oracle/cell11.2.1.2.0_LINUX_091102/disks/raw/disk06        450M
         /opt/oracle/cell11.2.1.2.0_LINUX_091102/disks/raw/disk07        450M
         /opt/oracle/cell11.2.1.2.0_LINUX_091102/disks/raw/disk08        450M
         /opt/oracle/cell11.2.1.2.0_LINUX_091102/disks/raw/disk09        450M
         /opt/oracle/cell11.2.1.2.0_LINUX_091102/disks/raw/disk10        450M
         /opt/oracle/cell11.2.1.2.0_LINUX_091102/disks/raw/disk11        450M
         /opt/oracle/cell11.2.1.2.0_LINUX_091102/disks/raw/disk12        450M
CellCLI> LIST PHYSICALDISK attributes name ,physicalSize where name like '.*FLASH.*'
         /opt/oracle/cell11.2.1.2.0_LINUX_091102/disks/raw/FLASH01       450M
         /opt/oracle/cell11.2.1.2.0_LINUX_091102/disks/raw/FLASH02       450M
         /opt/oracle/cell11.2.1.2.0_LINUX_091102/disks/raw/FLASH03       450M
         /opt/oracle/cell11.2.1.2.0_LINUX_091102/disks/raw/FLASH04       450M

attributes后面可以加上对应的属性,”.*”表示通配符
这里通过help和describe来实现cellcli的基本操作,这里只是提供了一种处理问题的思路,通过help操作来完成cellcli的基本操作.