EXADATA EHCC初试

今天有幸见识了下EXADATA的强大功能之一EHCC(Exadata Hybrid Columnar Compression),发现压缩效果确实很让人心动,压缩效率大大超过我的预计,压缩97%左右(1-628.1875/20573)
创建模拟表T_FF_SOURCE

14:32:52 SQL> create table t_FF_source as select * from dba_objects;
Table created.
Elapsed: 00:00:00.16
14:35:54 SQL> begin
14:37:05   2  for i in 1..100000 loop
14:37:05   3     insert into t_FF_source select * from dba_jects;
14:37:05   4    commit;
14:37:05   5   end loop;
14:37:05   6  end;
14:37:05   7   /
Elapsed: 00:13:07.76
14:51:05 SQL> select count(*) from t_FF_source;
  COUNT(*)
----------
 197015655
Elapsed: 00:00:33.18
14:51:56 SQL>  col segment_name format a45 heading "Segment Name"
14:52:55 SQL> select segment_name Segment_Name
14:52:55   2  ,      segment_type             "Segment Type"
14:52:55   3  ,      round(sum(bytes)/1024/1024/1024,2)     "Size In GB"
14:52:55   4  from dba_segments
14:52:55   5  where
14:52:56   6  segment_name ='T_FF_SOURCE'
14:52:56   7  group by segment_name,segment_type
14:52:56  8  order by 1;
Segment Name                                  Segment Type       Size In GB
--------------------------------------------- ------------------ ----------
T_FF_SOURCE                                   TABLE                   20.09

创建各种情况下压缩表

--BASE
create table t_FF_c  compress  NOLOGGING as select /*+ PARALLE 24*/ * from t_FF_source;
--OLTP
create table t_FF_c_o compress for oltp NOLOGGING as select /*+ PARALLE 24 */ * from t_FF_source;
--QUERY LOW
create table t_FF_q_l  compress for query low  NOLOGGING as select /*+ PARALLE 24 */ * from t_FF_source;
--QUERY HIGH
create table t_FF_q_h compress for query high parallel 24 nologging as select /*+ PARALLE 12 */ * from t_FF_source;
--ARCHIVE LOW
create table t_FF_a_l compress for archive low parallel 24 nologging as select /*+ PARALLE 12 */ * from t_FF_source;
--ARCHIVE HIGH
create table t_FF_a_h compress for archive high  parallel 24 nologging as select  /*+ PARALLE 12 */ * from t_FF_source;

其实BASE和OLTP是数据库基本的压缩功能,该功能不仅限于EXADATA,但是后面的四种压缩就是我们所说的EHCC,也只有EXADATA用户才能够体验到.

数据压缩结果

16:19:13 SQL> select s.owner,segment_name,s.bytes/1024/1024 t_size,compress_for
16:19:20   2  from dba_segments s,dba_tables t
16:19:20   3  where s.owner=t.owner and t.table_name=s.segment_name
16:19:20   4  and s.owner='FF' and t.table_name like 'T_FF%';
OWNER                          SEGMENT_NAME                            T_SIZE COMPRESS_FOR
------------------------------ ----------------------------------- ---------- ------------
FF                             T_FF_A_L                              1244.625 ARCHIVE LOW
FF                             T_FF_SOURCE                              20573
FF                             T_FF_Q_H                              1244.875 QUERY HIGH
FF                             T_FF_A_H                              628.1875 ARCHIVE HIGH
FF                             T_FF_C                                6961.625 BASIC
FF                             T_FF_Q_L                              2799.875 QUERY LOW
FF                             T_FF_C_O                             7759.1875 OLTP

试验结果证明
1.BASE也OLTP的压缩效率差不多(可能是因为BASIC的PCTFREE为0,OLTP的PCTFREE为10)
2.在EHCC的四种压缩中:QUERY LOW相对压缩率不高,采用LZO压缩算法,但是也比ORACLE自带的压缩效果高很多
3.QUERY HIGH和ARCHIVE LOW压缩率差不多,都是使用ZLIB压缩算法
4.ARCHIVE HIGH是压缩率极高,采用Bzip2压缩算法实现.

EXADATA EM性能监控

ora程序简单测试

oracle的监控很多,朋友推荐的ora工具,测试了下,发现确实很好用也很强大
ora使用帮助

[oracle@xifenfei ~]$ ./ora
Syntax Error
  Usage: ora [-u user] [-i instance#] <command> [<arguments>]
    General
      -u user/pass                  use USER/PASS to log in
      -i instance#                  append # to ORACLE_SID
      -sid <sid>                    set ORACLE_SID to sid
      -top #                        limit some large queries to on # rows
      - repeat <interval> <count|forever> <ora command>
                                    Repeat an coomand <count> time
                                    Sleep <interval> between two calls
    Command are:
      - execute:                    cursors currently being executed
      - longops:                    run progression monitor
      - sessions:                   currently open sessions
      - stack <os_pid>              get process stack using oradebug
      - cursors [all] <match_str>:  [all] parsed cursors
      - sharing <sql_id>:           print why cursors are not shared
      - events [px]:                events that someone is waiting for
      - ash <minutes_from_now>
            [duration]
            [-f <file_name>]        active session history for specified period
                                    e.g. 'ash 30' to display from [now - 30min]
                                          to [now]
                                    e.g. 'ash 30 10 -f foo.txt' to display a 10
                                          minutes period from [now - 30min]
                                          and store the result in file foo.txt
      - ash_wait_graph <minutes_from_now>
            [duration]
            [-f <file_name>]        PQ event wait graph using ASH data
                                    Arguments are the same as for ash
                                    except that the output must be shown
                                    with the mxgraph tool
      - ash_sql <sql_id>            Show all ash rows group by sampli_time and
                                    event for the specified sql_id
      - [-u <user/passwd>] degree   degree of objects for a given user
      - [-u <user/passwd>] colstats stats for each table, column
      - [-u <user/passwd>] tabstats stats for each table
      - params [<pattern>]:         view all parameters, even hidden ones
      - snap:                       view all snapshots status
      - bc:                         view contents of buffer cache
      - temp:                       view used space in temp tbs
      - asm:                        Show asm space/free space
      - space [<tbs>]:              view used/free space in a given tbs
      - binds    <sql_id> :         display bind capture information for
                                    specified cursor
      - fulltext  <sql_id> :        display the entire SQL text of the
                                    specified statement
      - last_sql_hash [<sid>]:      hash value of the last styatement executed
                                    by the specified sid. If no sid speficied,
                                    return the last hash_value of user sessions
      - openv    <sql_id>
                [<pattern>]:        display optimizer env parameters for
                                    specified cursor
      - plan     <sql_id>  [<fmt>]: get explain plan of a particular cursor
      - pxplan   <sql_id> :         get explain plan of a particular cursor and
                                    all connected cursor slave SQL
      - wplan  <sql_id>  [<fmt>]:   get explain plan with work area information
      - pxwplan   <sql_id> :        get explain plan with work area information
                                    of a particular cursor and all connected
                                    cursor slave SQL
      - eplan   <sql_id>  [<fmt>]:  get explain plan with execution statistics
      - pxeplan   <sql_id> :        get explain plan with execution statistics
                                    of a particular cursor and all connected
                                    cursor slave SQL
      - gplan    <sql_id> :         get graphical explain plan of a particular
                                    cursor using dot specification
      - webplan  <sql_id>           get graphical explain plan of a particular
                 [/<child_number>]  cursor using gdl specification
                 [<decorate>]:      optional: child_number, default is zero.
                                    optional: decorate to print further node
                                    information. default is 0,
                                    1 => print further node information such as
                                     cost, filter_predicates etc.
                                    2 => in addition to the above, print
                                     row vector information
                                    sample usage:
                                    # ora webplan 4019453623
                                    print more information (decorate 1)
                                    # ora webplan 4019453623/1 1
                                    more information, overload! (decorate 2)
                                    # ora webplan 4019453623/1 2
                                    using sql_id along with child number
                                    instead of hash value
                                    # ora webplan aca4xvmz0rzup/3 1
      - hash_to_sqlid  <sql_id> :   get the sql_id of the cursor given its hash
                                    value
      - sqlid_to_hash <sql_id>:     get the hash value of the cursor given its
                                    (unquoted) sql_id
      - exptbs:                     generate export tablespace script
      - imptbs:                     generate import tablespace script
      - smm [limited]:              SQL memory manager stats for active
                                    workareas
      - onepass:                    Run an ora wplan on all one-pass cursors
      - mpass:                      Run an ora wplan on all multi-pass cursors
      - pga:                        tell how much pga memory is used
      - pga_detail <os_pid>|
                   -mem <size_mb>:  Gives details on how PGA memory is consumed
                                    by a process (given its os PID) or by the
                                    set of precesses consuming more than
                                    <size_mb> MB of PGA memory (-mem option)
      - pgasnap [<snaptab>]         Snapshot the pga advice stats
      - pgaadv  [-s [<snaptab>]]
                [-o graphfile]
                [-m min_size]:      generate a graph from v
                                    and display it or store it in a
                                    file if the -o option is used.
                                      -s [<snaptab>] to diff with a
                                       previous snapshot (see pgasnap cmd)
                                      -o [graphfile] to store the result
                                       in a file instead of displaying it
                                      -m [min_size] only consider
                                       workareas with a minimum size
      - pgaadvhist [-f <f_min>
                       [<f_max>]]   display the advice history for all
                                    factors or for factor between
                                    f_min and f_max
      - sga:                        tell how much sga memory is used
      - sga_stats:                  tell how sga is dynamically used
      - sort_usage:                 tell how temp tablespace is used in detail
      - sgasnap [<snaptab>]         Snapshot the sga advice stats
      - sgaadv  [-s [<snaptab>]]
                [-o graphfile]      generate a graph from v
                                    and v and store it in a
                                    file if the -o option is used.
                                      -s [<snaptab>] to diff with a
                                       previous snapshot (see sgasnap cmd)
                                      -o [graphfile] to store the result
                                       in a file instead of displaying it
      - process [<min_mb>]:         display process info with pga memory
      - version:                    display Oracle version number
      - cur_mem [ <sql_id> ]        display the memory used for a
                                    given or all cursors
      - shared_mem [ <sql_id> ]     detailed dump of cursor shared mem
                                    allocations
      - runtime_mem [ <sql_id> ]    detailed dump of cursor runtime
                                    memory allocations
      - all_mem [ <sql_id> ]          do all of the memory dumps
      - pstack <pid>|all
               [<directory>]        run pstack on specified process
                                    (or all if 'all' specified) and store
                                    files in specified dir ( when
                                    not specified)
      - idxdesc [username]
                <tabName>           list all indexes for a given user or
                                    for a given user and table
      - segsize [username]
                <objName>           list size of all objects(segments) for
                                    given user for a given user and object
      - tempu <username>            list temporary ts usage of all users or
                                    for a given user
      - sqlstats [ <sql_id> ]       list sql execution stats (like
                                    buffer_gets, phy. reads etc)
                                    for a given sql_id/hash_value of statement
      - optstats [username]         list optimizer stats for all tables stored
                 <tabname>          in dictionary for a given user or for a
                                    given user and table
      - userVs                      list all user Views (user_tables,
                                    user_indexes etc)
      - fixedVs                     list all V$ Views
      - fixedXs                     list all X$ Views
      - px_processes                list all px processes (QC and slaves)
      - cursor_summary              summarize stats about (un)pinned cursors
      - rowcache                    summarizes row cache statistics
      - monitor_list                lists all the statements that have been
                                    monitored
      - monitor :              wraps dbms_sqltune.report_sql_monitor().
                                    Directly passe the arguments to the PL/SQL
                                    procedure. Args are:
                                      sql_id, session_id, session_serial,
                                      sql_exec_start, sql_exec_id, inst_id,
                                      instance_id_filter, parallel_filter,
                                      report_level, type.
                                    Examples:
                                      - monitor xml   shows XML report
                                      - monitor   show last monitored stmt
                                      - monitor   sql_id=>'8vz99cy9bydv8',
                                                  session_id=>105 will
                                        show monitor info for sql_id
                                        8vz99cy9bydv8 and session_id 105
                                    Use simply ora monitor 8vz99cy9bydv8
                                    to display monitoring information for
                                    sql_id 8vz99cy9bydv8.
                                    Syntax for parallel filters is:
                                    [qc][servers(<svr_grp>[,] <svr_set>[,]
                                                 <srv_num>)]
                                    Use /*+ monitor */ to force monitoring.
      - monitor_old [ash_all] [<sqlid>]
                [qc|<slave_grp#> [<slave_set#> [<slave#>]]]
                                    Old version of SQL monitoring, use a
                                    SQL query versus the report_sql_monitor()
                                    package. Display monitoring info for the
                                    LAST execution of the specified cursor.
                                    Cursor response time needs to be at
                                    least 5s for monitoring to start (use the
                                    monitor hint to force monitoring). Without
                                    any parameter, will display monitoring info
                                    for the last cursor that was monitored
                                    - ash_all will aggregate ash data over all
                                     executions of the cursor (useful for short
                                      queries that are executed many times).
                                    If parallel:
                                    - qc to see only data for qc
                                    - slave_grp# to see only data for one
                                      parallelizer
                                    - slave_grp# + slave_set# to see only data
                                      for one slave set of one parallelizer,
                                    - slave_grp# + slave_set# + slave# to see
                                      data only for the specified slave
      - sql_task [progress | interrupt <task_name> | history <#execs> |
                  report <task_name> <section> <exec_name> ]
                                progress: progress monitoring for executing
                                          sql tasks
                                interrupt: interrupt an executing sql task
                                history:   print a history of last n executions
                                report:    get a sql tune report
      - sh                         Run a shell command. E.g.
                                   ora repeat 5 10 sh 'ps -edf | grep DESC'
  Memory: The detailed memory dumps need to have events set to work.
          The events bellow can be added to the init.ora file
  event="10277 trace name context forever, level 10" # mutable mem
  event="10235 trace name context forever, level 4"  # shared mem
  NOTE
  ====
    - Set environment variable ORA_USE_HASH to 1 to get SQL hash values
      instead of SQL ids
    - Set environment variable DBUSER to change default connect string which
      is "/ as sysdba"
    - Set environment variable ORA_TMP to the default temp directory (default
      if /tmp when not set)

数据库版本

[oracle@xifenfei ~]$ ./ora version
Oracle version: 11.2.0.3.0

正在执行sql

[oracle@xifenfei ~]$ ./ora execute
SQL_ID              EXEC SQL_TEXT
------------------ ----- ---------------------------------------------------------------------------
g6gu1n3x0h1h4          1 select streams_pool_size_for_estimate s,
                         streams_pool_size_factor * 100 f,           estd_spill_time +
                         estd_unspill_time, 0  from v$streams_pool_advice
5yv7yvjgjxugg          1 select TIME_WAITED_MICRO from V$SYSTEM_EVENT  where event = 'Shared IO
                         Pool Memory'
0rc4km05kgzb9          1 select 1 from obj$ where name='DBA_QUEUE_SCHEDULES'

当前的会话

[oracle@xifenfei ~]$ ./ora sessions
       SID    SERIAL# CL_PID     PID        USERNAME             TYPE       SERVER    PROGRAM                                          SQL_ID
---------- ---------- ---------- ---------- -------------------- ---------- --------- ------------------------------------------------ --------
ACTION
----------------------------------------------------------------
         1          1 706        706                             BACKGROUND DEDICATED oracle@xifenfei (PMON)
       126          1 710        710                             BACKGROUND DEDICATED oracle@xifenfei (PSP0)
         2          1 714        714                             BACKGROUND DEDICATED oracle@xifenfei (VKTM)
       127          1 720        720                             BACKGROUND DEDICATED oracle@xifenfei (GEN0)
         3          1 724        724                             BACKGROUND DEDICATED oracle@xifenfei (DIAG)
       128          1 728        728                             BACKGROUND DEDICATED oracle@xifenfei (DBRM)
         4          1 732        732                             BACKGROUND DEDICATED oracle@xifenfei (DIA0)
       129          1 736        736                             BACKGROUND DEDICATED oracle@xifenfei (MMAN)
         5          1 740        740                             BACKGROUND DEDICATED oracle@xifenfei (DBW0)
       130          1 744        744                             BACKGROUND DEDICATED oracle@xifenfei (LGWR)
         6          1 748        748                             BACKGROUND DEDICATED oracle@xifenfei (CKPT)
       131          1 752        752                             BACKGROUND DEDICATED oracle@xifenfei (SMON)
         7          1 756        756                             BACKGROUND DEDICATED oracle@xifenfei (RECO)
       132          1 760        760                             BACKGROUND DEDICATED oracle@xifenfei (MMON)
         8          1 764        764                             BACKGROUND DEDICATED oracle@xifenfei (MMNL)
       125         87 12732      12732                           BACKGROUND DEDICATED oracle@xifenfei (W000)
KTSJ Slave
        15        141 13132      13136      SYS                  USER       DEDICATED sqlplus@xifenfei (TNS V1-V3)                     d5zj45xcq95d3
         9          5 819        819                             BACKGROUND DEDICATED oracle@xifenfei (ARC0)
       135          5 823        823                             BACKGROUND DEDICATED oracle@xifenfei (ARC1)
        10          1 827        827                             BACKGROUND DEDICATED oracle@xifenfei (ARC2)
       136          3 831        831                             BACKGROUND DEDICATED oracle@xifenfei (ARC3)
        12          1 835        835                             BACKGROUND DEDICATED oracle@xifenfei (QMNC)
QMON Coordinator
       133         11 855        855                             BACKGROUND DEDICATED oracle@xifenfei (Q000)
QMON Slave
        14          7 896        896                             BACKGROUND DEDICATED oracle@xifenfei (SMCO)
KTSJ Coordinator
        17          3 859        859                             BACKGROUND DEDICATED oracle@xifenfei (Q001)
QMON Slave

sql执行情况

[oracle@xifenfei ~]$ ./ora ash_sql d5zj45xcq95d3
SAMPLE_TIME                              ACTIVITY                       P1                             P2                   NAME         NB
---------------------------------------- ------------------------------ ------------------------------ -------------------- ---------- ----
13-JAN-12 03.07.05.299 AM                CPU                            -                              -                                  1

完整sql语句

[oracle@xifenfei ~]$ ./ora fulltext d5zj45xcq95d3
SQL_ID             SQL_FULLTEXT
------------------ ---------------------------------------------------------------------------
d5zj45xcq95d3          select s.sid, s.serial#, s.PROCESS cl_pid, p.spid pid, s.username,
                              s.type, s.server, s.PROGRAM, sql_id sql_id, s.action
                       from v$session s, v$process p
                       where s.PADDR = p.addr

sql执行计划

[oracle@xifenfei ~]$ ./ora plan d5zj45xcq95d3
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------
SQL_ID  d5zj45xcq95d3, child number 0
-------------------------------------
    select s.sid, s.serial#, s.PROCESS cl_pid, p.spid pid, s.username,
          s.type, s.server, s.PROGRAM, sql_id sql_id, s.action     from
v$session s, v$process p     where s.PADDR = p.addr
Plan hash value: 1456042965
----------------------------------------------------------------------------------
| Id  | Operation                 | Name            | Rows  | Bytes | Cost (%CPU)|
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT          |                 |       |       |     1 (100)|
|   1 |  NESTED LOOPS             |                 |     1 |   264 |     0   (0)|
|   2 |   NESTED LOOPS            |                 |     1 |   251 |     0   (0)|
|   3 |    MERGE JOIN CARTESIAN   |                 |     5 |   350 |     0   (0)|
|*  4 |     FIXED TABLE FULL      | X$KSUPR         |     1 |    44 |     0   (0)|
|   5 |     BUFFER SORT           |                 |   100 |  2600 |     0   (0)|
|   6 |      FIXED TABLE FULL     | X$KSLWT         |   100 |  2600 |     0   (0)|
|*  7 |    FIXED TABLE FIXED INDEX| X$KSUSE (ind:1) |     1 |   181 |     0   (0)|
|*  8 |   FIXED TABLE FIXED INDEX | X$KSLED (ind:2) |     1 |    13 |     0   (0)|
----------------------------------------------------------------------------------
Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
   1 - SEL$573A9BEE
   4 - SEL$573A9BEE / X$KSUPR@SEL$5
   6 - SEL$573A9BEE / W@SEL$3
   7 - SEL$573A9BEE / S@SEL$3
   8 - SEL$573A9BEE / E@SEL$3
Predicate Information (identified by operation id):
---------------------------------------------------
   4 - filter(("INST_ID"=USERENV('INSTANCE') AND BITAND("KSSPAFLG",1)<>0))
   7 - filter(("S"."INST_ID"=USERENV('INSTANCE') AND
              BITAND("S"."KSSPAFLG",1)<>0 AND BITAND("S"."KSUSEFLG",1)<>0 AND
              "S"."KSUSEPRO"="ADDR" AND "S"."INDX"="W"."KSLWTSID"))
   8 - filter("W"."KSLWTEVT"="E"."INDX")
Column Projection Information (identified by operation id):
-----------------------------------------------------------
   1 - "KSUPRPID"[VARCHAR2,24], "S"."INDX"[NUMBER,22],
       "S"."KSSPATYP"[NUMBER,22], "S"."KSUUDLNA"[VARCHAR2,30],
       "S"."KSUSESER"[NUMBER,22], "S"."KSUSEFLG"[NUMBER,22],
       "S"."KSUSEPID"[VARCHAR2,24], "S"."KSUSEPNM"[VARCHAR2,48],
       "S"."KSUSESQI"[VARCHAR2,13], "S"."KSUSEACT"[VARCHAR2,64]
   2 - "KSUPRPID"[VARCHAR2,24], "W"."KSLWTEVT"[NUMBER,22],
       "S"."INDX"[NUMBER,22], "S"."KSSPATYP"[NUMBER,22],
       "S"."KSUUDLNA"[VARCHAR2,30], "S"."KSUSESER"[NUMBER,22],
       "S"."KSUSEFLG"[NUMBER,22], "S"."KSUSEPID"[VARCHAR2,24],
       "S"."KSUSEPNM"[VARCHAR2,48], "S"."KSUSESQI"[VARCHAR2,13],
       "S"."KSUSEACT"[VARCHAR2,64]
   3 - "ADDR"[RAW,4], "KSUPRPID"[VARCHAR2,24], "W"."KSLWTSID"[NUMBER,22],
       "W"."KSLWTEVT"[NUMBER,22]
   4 - "ADDR"[RAW,4], "INST_ID"[NUMBER,22], "KSSPAFLG"[NUMBER,22],
       "KSUPRPID"[VARCHAR2,24]
   5 - (#keys=0) "W"."KSLWTSID"[NUMBER,22], "W"."KSLWTEVT"[NUMBER,22]
   6 - "W"."KSLWTSID"[NUMBER,22], "W"."KSLWTEVT"[NUMBER,22]
   7 - "S"."INDX"[NUMBER,22], "S"."INST_ID"[NUMBER,22],
       "S"."KSSPAFLG"[NUMBER,22], "S"."KSSPATYP"[NUMBER,22],
       "S"."KSUUDLNA"[VARCHAR2,30], "S"."KSUSEPRO"[RAW,4],
       "S"."KSUSESER"[NUMBER,22], "S"."KSUSEFLG"[NUMBER,22],
       "S"."KSUSEPID"[VARCHAR2,24], "S"."KSUSEPNM"[VARCHAR2,48],
       "S"."KSUSESQI"[VARCHAR2,13], "S"."KSUSEACT"[VARCHAR2,64]
   8 - "E"."INDX"[NUMBER,22]

sga和pga信息

[oracle@xifenfei ~]$ ./ora pga
NAME                        PID   USED(KB)  ALLOC(KB)   MAX_ALLOC(KB)    MAP_SIZE(KB)
-------------------------------------------------------------------------------------
Total                                 102M       119M           26173              0M
[oracle@xifenfei ~]$ ./ora sga
POOL         NAME                          SIZE_KB
------------ -------------------------- ----------
large pool   PX msg pool                       480
large pool   free memory                      3616
java pool    free memory                      4096
large pool   (total)                          4096
java pool    (total)                          4096
shared pool  free memory                     14149
shared pool  (total)                         14149
Total                                        22341
8 rows selected.
[oracle@xifenfei ~]$ ./ora sga_stats
SGA DYNAMIC COMPNENTS
COMPONENT                                          CURRENT_SIZE_MB MIN_SIZE_MB MAX_SIZE_MB LAST_OPER_TYP LAST_OPER
-------------------------------------------------- --------------- ----------- ----------- ------------- ---------
shared pool                                                     84          84          84 STATIC
large pool                                                       4           4           4 STATIC
java pool                                                        4           4           4 STATIC
streams pool                                                     0           0           0 STATIC
DEFAULT buffer cache                                            72          72          72 INITIALIZING
KEEP buffer cache                                                0           0           0 STATIC
RECYCLE buffer cache                                             0           0           0 STATIC
DEFAULT 2K buffer cache                                          0           0           0 STATIC
DEFAULT 4K buffer cache                                          0           0           0 STATIC
DEFAULT 8K buffer cache                                          0           0           0 STATIC
DEFAULT 16K buffer cache                                         0           0           0 STATIC
DEFAULT 32K buffer cache                                         0           0           0 STATIC
Shared IO Pool                                                   0           0           0 STATIC
ASM Buffer Cache                                                 0           0           0 STATIC

ora的功能非常强大,其实本质都是通过sql语句查询实现,可以通过数据库做10046来捕获相关sql.学习oracle,工具只是提高大家工作的效率,而不是刻意去追求工具本身

bbed_wrap脚本获取数据块内容

bbed的功能很强大,可以通过bbed_wrap来获得数据块记录,相当用途:抢救坏块中的数据
环境准备

[oracle@xifenfei ~]$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.3.0 Production on Thu Jan 12 18:29:50 2012
Copyright (c) 1982, 2011, Oracle.  All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
SQL> conn chf/xifenfei
Connected.
SQL> create table t_xifenfei
  2  as
  3  select object_id,object_name from dba_objects where rownum<20;
Table created.
SQL>  select file_id,block_id,block_id+blocks-1
  2          from dba_extents
  3   where segment_name ='T_XIFENFEI' AND owner='CHF';
   FILE_ID   BLOCK_ID BLOCK_ID+BLOCKS-1
---------- ---------- -----------------
         4        680               687
SQL> alter system checkpoint;
System altered.
--查询记录
SQL> col object_name for a20
SQL> select   object_id,object_name,
  2  dbms_rowid.rowid_relative_fno(rowid)rel_fno,
  3  dbms_rowid.rowid_block_number(rowid)blockno,
  4  dbms_rowid.rowid_row_number(rowid) rowno
  5  from chf.t_xifenfei;
 OBJECT_ID OBJECT_NAME             REL_FNO    BLOCKNO      ROWNO
---------- -------------------- ---------- ---------- ----------
        20 ICOL$                         4        683          0
        46 I_USER1                       4        683          1
        28 CON$                          4        683          2
        15 UNDO$                         4        683          3
        29 C_COBJ#                       4        683          4
         3 I_OBJ#                        4        683          5
        25 PROXY_ROLE_DATA$              4        683          6
        41 I_IND1                        4        683          7
        54 I_CDEF2                       4        683          8
        40 I_OBJ5                        4        683          9
        26 I_PROXY_ROLE_DATA$_1          4        683         10
        17 FILE$                         4        683         11
        13 UET$                          4        683         12
         9 I_FILE#_BLOCK#                4        683         13
        43 I_FILE1                       4        683         14
        51 I_CON1                        4        683         15
        38 I_OBJ3                        4        683         16
         7 I_TS#                         4        683         17
        56 I_CDEF4                       4        683         18
19 rows selected.

bbed参数配置

[oracle@xifenfei ~]$ more bbed_file
         1 /u01/oracle/oradata/ora11g/system01.dbf
         2 /u01/oracle/oradata/ora11g/sysaux01.dbf
         3 /u01/oracle/oradata/ora11g/undotbs01.dbf
         4 /u01/oracle/oradata/ora11g/users01.dbf
         5 /u01/oracle/oradata/ora11g/dbfs01.dbf
[oracle@xifenfei ~]$ more bbed.par
blocksize=8192
listfile=/home/oracle/bbed_file
mode=browse
SILENT=yes
PASSWORD=blockedit

bbed_wrap脚本执行

[oracle@xifenfei ~]$ ./bbed_wrap.sh 4 683 "/rn2cntn"
There are 19 rows in block 683 on file 4
" 20 "," ICOL$"
" 46 "," I_USER1"
" 28 "," CON$"
" 15 "," UNDO$"
" 29 "," C_COBJ#"
" 3 "," I_OBJ#"
" 25 "," PROXY_ROLE_DATA$"
" 41 "," I_IND1"
" 54 "," I_CDEF2"
" 40 "," I_OBJ5"
" 26 "," I_PROXY_ROLE_DATA$_1"
" 17 "," FILE$"
" 13 "," UET$"
" 9 "," I_FILE#_BLOCK#"
" 43 "," I_FILE1"
" 51 "," I_CON1"
" 38 "," I_OBJ3"
" 7 "," I_TS#"
" 56 "," I_CDEF4"
--和我们查询的结果完全一致

创建DBFS

DBFS(Oracle Database File System)就是Oracle数据库11gR2中提供的能够在Linux和Solaris操作系统中将Oracle数据库当成文件系统来使用的功能.在DBFS内部,文件是以SecureFiles LOBs(对比与以前的BasicFiles LOBs)的形式存储在数据表中.这个功能第一个是存储图片或者文档,第二个功能就是在RAC或者XD中部署OGG是一个不错的选择.
安装fuse相关包

[root@xifenfei ~]# mount /dev/cdrom /media
mount: block device /dev/cdrom is write-protected, mounting read-only
[root@xifenfei ~]# cd /media/Server
[root@xifenfei Server]# ls fuse*
fuse-2.7.4-8.0.1.el5.x86_64.rpm      fuse-devel-2.7.4-8.0.1.el5.x86_64.rpm  fuse-libs-2.7.4-8.0.1.el5.x86_64.rpm
fuse-devel-2.7.4-8.0.1.el5.i386.rpm  fuse-libs-2.7.4-8.0.1.el5.i386.rpm
[root@xifenfei Server]# rpm -ivh fuse-libs-2.7.4-8.0.1.el5.x86_64.rpm
warning: fuse-libs-2.7.4-8.0.1.el5.x86_64.rpm: Header V3 DSA signature: NOKEY, key ID 1e5e0159
Preparing...                ########################################### [100%]
   1:fuse-libs              ########################################### [100%]
[root@xifenfei Server]# rpm -ivh  fuse-devel-2.7.4-8.0.1.el5.x86_64.rpm
warning: fuse-devel-2.7.4-8.0.1.el5.x86_64.rpm: Header V3 DSA signature: NOKEY, key ID 1e5e0159
Preparing...                ########################################### [100%]
   1:fuse-devel             ########################################### [100%]
[root@xifenfei Server]# rpm -ivh  fuse-2.7.4-8.0.1.el5.x86_64.rpm
warning: fuse-devel-2.7.4-8.0.1.el5.x86_64.rpm: Header V3 DSA signature: NOKEY, key ID 1e5e0159
Preparing...                ########################################### [100%]
   1:fuse                   ########################################### [100%]

系统配置

[root@xifenfei Server]# cd /
[root@xifenfei /]# mkdir dbfs
[root@xifenfei /]# chown ora11g:oinstall dbfs
[root@xifenfei /]# ls -l|grep dbfs
drwxr-xr-x   2 ora11g oinstall  4096 Sep  1 16:41 dbfs
[root@xifenfei /]# echo "/usr/local/lib" >> /etc/ld.so.conf.d/usr_local_lib.conf
[root@xifenfei /]# export ORACLE_HOME=/u01/oracle11
[root@xifenfei /]# ln -s $ORACLE_HOME/lib/libclntsh.so.11.1 /usr/local/lib/libclntsh.so.11.1
[root@xifenfei /]# ln -s $ORACLE_HOME/lib/libnnz11.so /usr/local/lib/libnnz11.so
[root@xifenfei /]# ln -s /lib64/libfuse.so.2 /usr/local/lib/libfuse.so.2
[root@xifenfei /]# cd /usr/local/lib
[root@xifenfei lib]# ls -l
total 0
lrwxrwxrwx 1 root root 35 Sep  1 16:45 libclntsh.so.11.1 -> /u01/oracle11/lib/libclntsh.so.11.1
lrwxrwxrwx 1 root root 19 Sep  1 16:46 libfuse.so.2 -> /lib64/libfuse.so.2
lrwxrwxrwx 1 root root 29 Sep  1 16:46 libnnz11.so -> /u01/oracle11/lib/libnnz11.so
[root@xifenfei lib]# ldconfig
[root@xifenfei lib]# chmod +x /usr/bin/fusermount
[root@xifenfei lib]# ls -l /usr/bin/fusermount
lrwxrwxrwx 1 root root 15 Sep  1 16:37 /usr/bin/fusermount -> /bin/fusermount
[root@xifenfei lib]#  ls -l /bin/fusermount
-rwsr-x--x 1 root fuse 23544 Oct 18  2011 /bin/fusermount

相关表空间/用户配置

SQL> create tablespace dbfs_ts
  2  datafile '/u01/oradata/ora11g/xifenfei01.dbf'
  3  size 20m autoextend on next 10m maxsize 30g;
Tablespace created.
SQL> create user dbfs  identified by dbfs
  2   default tablespace dbfs_ts
  3   quota unlimited on dbfs_ts;
User created.
SQL> grant create session, resource, create view, dbfs_role to dbfs ;
Grant succeeded.

创建filesystem

[ora11g@xifenfei admin]$ cd $ORACLE_HOME/rdbms/admin
[ora11g@xifenfei admin]$ sqlplus dbfs/dbfs@ora11g
SQL*Plus: Release 11.2.0.3.0 Production on Sat Sep 1 16:43:04 2012
Copyright (c) 1982, 2011, Oracle.  All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
SQL> @dbfs_create_filesystem.sql dbfs_ts my_dbfs
No errors.
--------
CREATE STORE:
begin dbms_dbfs_sfs.createFilesystem(store_name => 'FS_MY_DBFS', tbl_name =>
'T_MY_DBFS', tbl_tbs => 'dbfs_ts', lob_tbs => 'dbfs_ts', do_partition => false,
partition_key => 1, do_compress => false, compression => '', do_dedup => false,
do_encrypt => false); end;
--------
REGISTER STORE:
begin dbms_dbfs_content.registerStore(store_name=> 'FS_MY_DBFS', provider_name
=> 'sample1', provider_package => 'dbms_dbfs_sfs'); end;
--------
MOUNT STORE:
begin dbms_dbfs_content.mountStore(store_name=>'FS_MY_DBFS',
store_mount=>'my_dbfs'); end;
--------
CHMOD STORE:
declare m integer; begin m := dbms_fuse.fs_chmod('/my_dbfs', 16895); end;
No errors.

挂载dbfs

[ora11g@xifenfei ~]$ more /home/ora11g/xifenfei_pwd
dbfs
[ora11g@xifenfei ~]$ nohup dbfs_client dbfs@ora11g /dbfs <xifenfei_pwd &
[1] 3694
[ora11g@xifenfei ~]$ df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/VolGroup00-LogVol00
                      3.9G  3.2G  462M  88% /
/dev/sda1              99M   24M   71M  25% /boot
tmpfs                1002M  184M  818M  19% /dev/shm
/dev/sdb1              20G  8.9G  9.9G  48% /u01
df: `/dbfs': Resource temporarily unavailable

查询mos发现 OS 2.6.32-100.26.2.el5 to: 2.6.32-300.10.1.el5uek (Linux UEK Kernel)会出现该问题,解决方法是升级Kernel或者不使用UEK

[ora11g@xifenfei ~]$ uname -a
Linux xifenfei 2.6.32-300.10.1.el5uek #1 SMP Wed Feb 22 17:37:40 EST 2012 x86_64 x86_64 x86_64 GNU/Linux

不幸刚好中招,现在升级是不太可能的事情,只能使用其他kernel来启动系统(下图选择第二个)

重新挂载dbfs并且测试

[ora11g@xifenfei ~]$ uname -a
Linux xifenfei 2.6.18-308.el5 #1 SMP Sat Feb 25 12:40:07 EST 2012 x86_64 x86_64 x86_64 GNU/Linux
[ora11g@xifenfei ~]$ nohup dbfs_client dbfs@ora11g /dbfs <xifenfei_pwd &
[1] 3694
[ora11g@xifenfei ~]$ nohup: appending output to `nohup.out'
[ora11g@xifenfei ~]$ df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/VolGroup00-LogVol00
                      3.9G  3.2G  462M  88% /
/dev/sda1              99M   24M   71M  25% /boot
tmpfs                1006M  184M  822M  19% /dev/shm
/dev/sdb1              20G  8.9G  9.9G  48% /u01
dbfs-dbfs@ora11g:/     19M  120K   19M   1% /dbfs
[ora11g@xifenfei ~]$ cd /dbfs
[ora11g@xifenfei dbfs]$ ls
my_dbfs
[ora11g@xifenfei dbfs]$ cd my_dbfs/
[ora11g@xifenfei my_dbfs]$ ls
[ora11g@xifenfei my_dbfs]$ cat /etc/passwd>xifenfei.chf
[ora11g@xifenfei my_dbfs]$ ll
total 2
-rw-r--r-- 1 ora11g oinstall 1736 Sep  1 21:05 xifenfei.chf

卸载dbfs

[ora11g@xifenfei ~]$ fusermount -u /dbfs
[ora11g@xifenfei ~]$ df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/VolGroup00-LogVol00
                      3.9G  3.2G  462M  88% /
/dev/sda1              99M   24M   71M  25% /boot
tmpfs                1006M  184M  822M  19% /dev/shm
/dev/sdb1              20G  8.9G  9.9G  48% /u01

删除filesystem

cd $ORACLE_HOME/rdbms/admin
sqlplus dbfs_user/dbfs_user
SQL> @dbfs_drop_filesystem.sql my_dbfs

解决因/etc/fstab错误导致系统不能启动故障

发现如果人品不好做试验都是问题很多,晚上又把fstab给写错了,导致系统不能起来,因为当时处理该故障未截图,后续在网上找了几张图片,大体说明处理思路
系统启动报 filesystems 失败,输入root密码进入repair filesystem模式

尝试修改 /etc/fstab 发现系统是read-only模式

重新mount -n -o remount,rw /重新mount文件系统

重新修改/etc/fstab,除掉错误记录,然后使用init 6/reboot命令重启系统

安装ORACLE db /tmp空间不足解决办法

因测试需要装一个ORACLE 11G,在安装检测阶段报下图错误

详细信息

Free Space: xifenfei:/tmp - This is a prerequisite condition to test
whether sufficient free space is available in the file system.
Error:
 - 
PRVF-7501 : Sufficient space is not available at location "/tmp"
on node "xifenfei" [Required space = 1GB ]  
- Cause:  Not enough free space at location specified.  - Action: 
Free up additional space or select another location.
Expected Value
 : 1GB
Actual Value
 : 238MB

错误提示很明显ORACLE安装过程需要1G的临时空间,但是现在/tmp只有238M,空间明显不足,是的oracle检测失败,为了安装过程不出意外,决定分析并解决该问题

磁盘空间使用情况

[ora11g@xifenfei ~]$ df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/VolGroup00-LogVol00
                      3.9G  3.3G  238M  93% /
/dev/sda1              99M   24M   71M  25% /boot
tmpfs                1002M     0 1002M   0% /dev/shm
/dev/sdb1              20G  7.8G   11G  42% /u01

这里可以看出来/tmp没有另外的分配分区,而是挂载在/下面,也就是说,/tmp最多使用的空间就是/dev/mapper/VolGroup00-LogVol00分区能够使用的最大空间,也就是238M,证明ORACLE的检查程序说的是事实。

解决该问题
1.建立新tmp目录

[root@xifenfei ora11g]# mkdir /u01/tmp
[root@xifenfei ora11g]# chown root:root /u01/tmp
[root@xifenfei ora11g]# chmod 1777 /u01/tmp

2.设置数据库用户变量

vi db_home/.bash_profile
export TEMP=/u01/tmp
export TMPDIR=/u01/tmp
[ora11g@xifenfei ~]$ env|grep TMP
TMPDIR=/u01/tmp
[ora11g@xifenfei ~]$ env|grep TEMP
TEMP=/u01/tmp

3.重新运行runInstaller
4.安装完成清理相关/u01/tmp 和相关环境变了,让数据库使用系统默认(根据实际情况处理)

ORA-600和ORA-7445错误文章汇总

ORA-600和ORA-7445错误都是很多dba忌讳的错误,这里对本blog截止于:2012年09月15日的相关类此错误进行一个简单汇总,方便大家查阅.同时感谢“广州-紫恒”朋友在百忙中帮我整理出来.
ORA-600错误集合
ORA-600 [12235]
ORA-00600 [2662]
ORA-00600[4454]
ORA-00600[KSSADP1]
重现ORA-600[4000]异常
ORA-00600[4194]故障解决
ORA-00600[ktspNextL1:4]
ORA-600[4194]/[4193]解决
TOAD导致ORA-00600[17281]
ORA-00600[729]分析和处理方法
使用bbed解决ORA-00600[2662]
记录一次ORA-600[13013]处理过程
因为高版本引起ORA-00600[17059]
记录另一起ORA-00600[13013]处理
异常断电导致current redo损坏处理
记录一次ORA-00600[kdsgrp1]分析
异常断电导致current redo损坏处理
记录一次ORA-00600[2252]故障解决
因使用OEM引起ORA-00600[12761]
10.2.0.5出现ORA-00600[kcblasm_1]
ORA-00600[kgscLogOff-notempty]
通过bbed解决ORA-00600[4000]案例
ORA-00600 [ktbdchk1: bad dscn] 解决
11G RAC库 ORA-00600[ktubko_1]错误
收集统计信息出现ORA-00600[ksxprqfre3]
ORA-607/ORA-600[4194]不一定是重大灾难
创建控制文件遭遇ORA-00600[3753]故障解决
控制文件异常导致ORA-00600[kccsbck_first]
数据库报ORA-00607/ORA-00600[4194]错误
ORA-00600[kcratr_nab_less_than_odr]故障解决
PL/SQL Developer编译过程引起ora-600[15419]
ORA-07445[kslgetl()+120]/ORA-00108错误解决
双机mount数据库出现ORA-00600[kccsbck_first]
使用bbed解决ORA-00607/ORA-00600[4194]故障
通过bbed模拟ORA-00607/ORA-00600[4194]故障
truncate table强制终止导致ORA-00600[ktspfundo-2]
客户端版本导致ORA-00600[kssadd_stage: null parent]
动态修改PGA_AGGREGATE_TARGET 导致ORA-600[723]
ORA-00600[kcratr1_lostwrt]/ORA-00600[3020]错误恢复
ORA-600 [LibraryCacheNotEmptyOnClose] on shutdown
ORA-600[6749] 发生在 SYSMAN.MGMT_METRICS_RAW表
老版本PL/SQL Developer操作数据库导致ORA-00600[17113]
ASMM表空间强制终止DML操作导致ORA-600 [ktspfupdst-1]
表空间online出现ORA-00600[kcbz_check_objd_typ]处理过程
spfile被覆盖导致ORA-600[kmgs_parameter_update_timeout_1]

ORA-7445错误集合
ORA-7445[__milli_memcpy]分析
8i升级到9i出现ORA-07445[pevm_MOVC_i()+18]
ORA-07445[kslgetl()+120]/ORA-00108错误解决
遭遇ORA-07445[kkdliac()+346]使用odu抢救数据
ORA-07445[dbgrmqmqpk_query_pick_key()+0f88]
ORA-07445 [ACCESS_VIOLATION] [UNABLE_TO_READ] []

ORA-30013导致RAC 节点down掉

今天一朋友让我帮忙分析他们的9.2.0.2 rac 节点2异常down掉原因,相关信息如下:
前提信息

OS:HP-UX B.11.31
DB:9.2.0.2.0 RAC

节点2alert日志信息

Fri Sep  7 13:13:49 2012
ARC0: Completed archiving  log 11 thread 2 sequence 11651
Fri Sep  7 13:31:56 2012
Errors in file /oracle/admin/agent/udump/agent2_ora_797.trc:
ORA-00600: internal error code, arguments: [kgavsd_3], [0], [], [], [], [], [], []
ORA-00028: your session has been killed
Fri Sep  7 13:31:58 2012
Errors in file /oracle/admin/agent/bdump/agent2_pmon_5938.trc:
ORA-30013: undo tablespace 'UNDOTBS1' is currently in use
Fri Sep  7 13:31:58 2012
PMON: terminating instance due to error 30013
Fri Sep  7 13:31:58 2012
Errors in file /oracle/admin/agent/bdump/agent2_lms7_6033.trc:
ORA-30013: undo tablespace '' is currently in use
Fri Sep  7 13:31:58 2012
…………
Errors in file /oracle/admin/agent/bdump/agent2_lms0_6027.trc:
ORA-30013: undo tablespace '' is currently in use
Fri Sep  7 13:31:58 2012
System state dump is made for local instance
Fri Sep  7 13:32:03 2012
Instance terminated by PMON, pid = 5938
Fri Sep  7 14:34:35 2012

这里可以看到因为ORA-30013的错误使得pmon进程异常,从而使得该rac的节点2 down掉.同时这里还发现了ORA-00600[kgavsd_3]错误,是否是因为该ORA-600导致了数据库异常down还是一个偶然机会,我们继续分析

查看ORA-600[kgavsd_3]相关trace文件

*********START PLSQL RUNTIME DUMP************
***Got ORA-28 while running PLSQL***
***********END PLSQL RUNTIME DUMP************
*** 2012-09-07 13:31:56.740
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [kgavsd_3], [0], [], [], [], [], [], []
ORA-00028: your session has been killed
Current SQL statement for this session:
--用户补档
DECLARE
  OUT_ERR_CODE NUMBER;
  OUT_ERR_MSG  VARCHAR2(1000);
  V_COUNT NUMBER;
BEGIN
    WHILE TRUE LOOP
      SELECT COUNT(*) INTO V_COUNT FROM amc_stat_log  where proc_name in('pRunOdsChannelWareData')
       and run_param=201208 AND STATE='A';
      IF V_COUNT>0 THEN
         dbms_output.put_line('exit loop '|| sysdate);
         EXIT;
      END IF;
      sys.Dbms_Lock.sleep(600);
      dbms_output.put_line('wake up '|| sysdate);
    END LOOP;
   PKG_AME_ODS_DATA.P_Add_TO_AgentServ(201208,OUT_ERR_CODE,OUT_ERR_MSG);
   PKG_AME_ODS_DATA.P_Update_Serv_Ware_ID(201208, OUT_ERR_CODE, OUT_ERR_MSG);
   PKG_AMS_SETTLE.P_COMMISION_51PRE_FLAG(201208,OUT_ERR_CODE, OUT_ERR_MSG);
END;
----- PL/SQL Call Stack -----
  object      line  object
  handle    number  name
c000000e84ec59d0        14  anonymous block
----- Call Stack Trace -----
calling              call     entry                argument values in hex
location             type     point                (? means dubious value)
-------------------- -------- -------------------- ----------------------------
Cannot find symbol in .
Cannot find symbol in .
Cannot find symbol in .
ksedmp()+512         call     9fffffffffff3940     000000000 ?
                                                   C000000000000A17 ?
                                                   40000000025B6540 ?
ksfdmp()+64          call     9fffffffffff3940     000000003 ?
kgerinv()+352        call     9fffffffffff3940     60000000000466B0 ?
                                                   000000003 ?
                                                   C000000000000714 ?
                                                   4000000004EBA6A0 ?
                                                   00001821B ?
                                                   6000000000468EA8 ?
kgesinv()+48         call     9fffffffffff3940     60000000000466B0 ?
                                                   600000000059BD98 ?
                                                   600000000046B070 ?
                                                   60000000000179C0 ?
                                                   6000000000017950 ?
kgesin()+112         call     9fffffffffff3940     60000000000466B0 ?
                                                   600000000059BD98 ?
                                                   4000000000B44C10 ?
                                                   000000001 ?
                                                   9FFFFFFFFFFF4310 ?
$cold_kgavsd_stackl  call     9fffffffffff3940     60000000000466B0 ?
et_done()+1184                                     600000000059BD98 ?
                                                   4000000000B44C10 ?
                                                   000000001 ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   00001FE19 ?
pbesd_stacklet_done  call     9fffffffffff3940     60000000000466B0 ?
()+96                                              000000002 ? 000000000 ?
                                                   9FFFFFFFBEC6AE68 ?
pfrrun()+3328        call     9fffffffffff3940     9FFFFFFFBEC6AE68 ?
                                                   C000000000001D42 ?
                                                   4000000001ABBDA0 ?
                                                   9FFFFFFFBEC6B06C ?
                                                   9FFFFFFFFFFF64F0 ?
                                                   6000000000468EA8 ?
                                                   000000000 ? 000000000 ?
peicnt()+480         call     0000000000000000     9FFFFFFFBEC6AEE8 ?
                                                   C000000000000996 ?
                                                   40000000019F3680 ?
kkxexe()+816         call     0000000000000000     9FFFFFFFFFFF64F0 ?
                                                   9FFFFFFFBEC6AE68 ?
                                                   C00000000000099B ?
                                                   4000000001DD90F0 ?
                                                   00000FE4F ?
                                                   9FFFFFFFFFFF5F00 ?
                                                   60000000000467F0 ?
                                                   4000000000B603F0 ?
opiexe()+11168       call     0000000000000000     000000000 ?
                                                   C000000000002C60 ?
                                                   4000000001BEF980 ?
                                                   00000800F ?
                                                   9FFFFFFFFFFF6470 ?
                                                   9FFFFFFFBEC6AEB2 ?
                                                   6000000000040900 ?
                                                   9FFFFFFFBEC6B534 ?
opiall0()+3184       call     _etext_f()+23058430  000000004 ? 0000000C4 ?
                              09110686928          9FFFFFFFFFFF7B40 ?
                                                   C000000000002BDF ?
                                                   4000000001B26CD0 ?
                                                   000000000 ? 00000C893 ?
                                                   9FFFFFFFFFFF6690 ?
Cannot find symbol in .
kpoal8()+2064        call     9fffffffffff7ad0     000000001 ?
                                                   9FFFFFFFFFFF8304 ?
                                                   FFFFFFFFBFFFFFFF ?
                                                   9FFFFFFFFFFF83E4 ?
                                                   FFFFFFFFFFE7FBDF ?
                                                   9FFFFFFFFFFF7B88 ?
                                                   000000000 ?
                                                   6000000000474528 ?
opiodr()+3584        call     9fffffffffff81fc     6000000000040950 ?
                                                   000000000 ? 000000000 ?
                                                   C000000000002C60 ?
                                                   4000000001C09FE0 ?
                                                   00000C50B ?
                                                   9FFFFFFFFFFF81F0 ?
                                                   9FFFFFFFFFFF81D8 ?
ttcpip()+3776        call     _etext_f()+23058430  00000005E ? 000000014 ?
                              09114957288          9FFFFFFFFFFFA5F0 ?
                                                   6000000000040918 ?
                                                   C000000000001ABD ?
                                                   4000000001AB3BA0 ?
                                                   000000000 ? 00000C59B ?
opitsk()+1872        call     9fffffffffffa200     6000000000049FC0 ?
                                                   000000001 ?
                                                   9FFFFFFFFFFFA5F0 ?
                                                   000000001 ?
                                                   9FFFFFFFFFFFA740 ?
                                                   9FFFFFFFFFFFA564 ?
                                                   9FFFFFFFBF780058 ?
                                                   000000000 ?
opiino()+3184        call     000000000000057b     000000000 ? 000000000 ?
                                                   C00000000000132B ?
                                                   4000000001F78730 ?
                                                   000008001 ?
opiodr()+3584        call     0000000000000000     6000000000548A38 ?
                                                   4000000000B606F0 ?
                                                   6000000000548A38 ?
                                                   C000000000002C60 ?
                                                   4000000001C09FE0 ?
                                                   00000A201 ?
                                                   9FFFFFFFFFFFBC90 ?
                                                   4000000000B606F0 ?
opidrv()+976         call     _etext_f()+23058430  00000003C ? 000000004 ?
                              09114957288          9FFFFFFFFFFFEFB0 ?
                                                   6000000000040918 ?
sou2o()+80           call     _etext_f()+23058430  000000004 ? 000000004 ?
                              09114957288          9FFFFFFFFFFFEFB0 ?
main()+352           call     _etext_f()+23058430  9FFFFFFFFFFFEFD0 ?
                              09114957288          9FFFFFFFFFFFEFD4 ?
                                                   60000000004744F0 ?
                                                   9FFFFFFFFFFFEFB0 ?
main_opd_entry()+80  call     _etext_f()+23058430  000000000 ?
                              09114957288          9FFFFFFFFFFFF498 ?
                                                   C000000000000004 ?
                                                   C00000000002BE30 ?
--------------------- Binary Stack Dump ---------------------
Process global information:
     process: c000000d6428c0c0, call: c000000e46e772a8, xact: 0000000000000000,
     curses: c000000d6437d020, usrses: c000000d6437d020
  ----------------------------------------
  SO: c000000d6428c0c0, type: 2, owner: 0000000000000000, flag: INIT/-/-/0x00
  (process) Oracle pid=282, calls cur/top: c000000e46e772a8/c000000e46e772a8, flag: (0) -
            int error: 28, call error: 0, sess error: 0, txn error 0
  (post info) last post received: 0 0 0
              last post received-location: No post
              last process to post me: c000000d64234f18 1 6
              last post sent: 0 0 104
              last post sent-location: kglpsl: in loop
              last process posted by me: c000000d6428e900 23 0
    (latch info) wait_event=0 bits=0
    Process Group: DEFAULT, pseudo proc: c000000d62234ee0
    O/S info: user: oracle, term: UNKNOWN, ospid: 797
    OSD pid info: Unix process pid: 797, image: oracle@gzagent2 (TNS V1-V3)
    ----------------------------------------
    SO: c000000d6437d020, type: 4, owner: c000000d6428c0c0, flag: INIT/-/-/0x00
    (session) trans: 0000000000000000, creator: c000000d6428c0c0, flag: (41) USR/- BSY/-/-/-/KIL/-
              DID: 0000-0000-00000000, short-term DID: 0000-0000-00000000
              txn branch: 0000000000000000
              oct: 0, prv: 0, sql: c000000e76b03f10, psql: 0000000000000000, user: 31/CUSTOM
    O/S info: user: huangqianhai_lc, term: SVCTAG-D1MLV2X, ospid: 11124:11796, machine: WORKGROUP\SVCTAG-D1MLV2X
              program: plsqldev.exe
    application name: PL/SQL Developer, hash value=1190136663
    action name: 测试窗口 - 新建, hash value=3604520210
    last wait for 'null event' blocking sess=0x0 seq=142 wait_time=567341620
                =ea60, =0, =0
    temporary object counter: 0
      ----------------------------------------

通过这里可以看出来是因为pl/sql dev进行一个plsql的操作导致该错误发生,查询MOS[ID 403575.1]发现

Applies to:
Oracle Server - Enterprise Edition - Version: 9.2.0.7 and later   [Release: 9.2 and later ]
Information in this document applies to any platform.
***Checked for relevance on 27-Oct-2010***
Symptoms
The following errors appears in the alert log file :
Probe:read_pipe: receive failed, status 3
Probe:S:debug_loop: timeout. Action 1
*********START PLSQL RUNTIME DUMP************
***Got ORA-604 while running PLSQL***
***********END PLSQL RUNTIME DUMP************
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [kgavsd_3], [0], [], [], [], [], [], []
ORA-00604: error occurred at recursive SQL level 2
Current SQL statement for this session:
begin :id := sys.dbms_transaction.local_transaction_id; end;
.
Cause
The ora-600 kgavsd_3 appears when calling kgavsd_stacklet_done function which is related to PLSQL DEBUG.
From traces, dbms_debug package is being used during trace generation.
The return code of 3 further indicates that the dbms_pipe message was interrupted.
Probably user cancelled a plsql program, so the appeared while trying to dump the stack
Solution
There is no data corruption over here.
The error appears to be due to abnormal termination of aPL/SQL Developer application while executing a PL/SQL block.
Changing the PL/SQL and/or the procedure code could help in avoiding this error message.
Hence, this error can be safely ignored.

查找trace文件确实发现有name=SYS.DBMS_DEBUG,进一步表明该错误是由于plsql dev工具使用debug模式运行上面的plsql而引起该错误的发生,但是因为mos中记录和错误不是完全的一致,所以不能十分确定是该错误导致数据库down掉

继续分析ORA-30013

Error:      ORA-30013  (ORA-30013)
Text:      undo tablespace '%s' is currently in use
---------------------------------------------------------------------------
Cause:    the specified undo tablespace is currently used by another instance.
Action:    Wait for the undo tablespace to become available or change to another name and reissue the statement.

这个说明是没有疑问的:因为2节点配置的当前undo是UNDOTBS2,而UNDOTBS1是1节点使用的,证明这里的undo确实发生了错误,继续查询mos发现Bug 3368552

Hdr: 3368552 9.2.0.3 RDBMS 9.2.0.3 RAC PRODID-5 PORTID-23
Abstract: RAC:  ORA-30013 WHEN INSTANCE 2 ATTEMPTS TO ACCESS UNDO TABLESPACE OF INSTANCE 1
*** 01/12/04 06:21 am ***
TAR:
----
3554549.995
PROBLEM:
--------
The RAC database has been stable, but experienced an instance termination due
to ORA-30031 error in the alert log (instance 2):
...
Tue Dec 23 03:01:46 2003
ARC1: Evaluating archive  log 4 thread 2 sequence 1116
ARC1: Beginning to archive log 4 thread 2 sequence 1116
Creating archive destination LOG_ARCHIVE_DEST_1:
'/oracle/oradata/VLDB/logs/archives/VLDBN2/VLDB_0000001116_0002.arc'
ARC1: Completed archiving  log 4 thread 2 sequence 1116
Tue Dec 23 08:14:09 2003
Errors in file /oracle/admin/VLDB/bdump/vldbn2_pmon_22860.trc:
ORA-30013: undo tablespace 'UNDOTBS1' is currently in use
Tue Dec 23 08:14:09 2003
PMON: terminating instance due to error 30013
Tue Dec 23 08:14:10 2003
System state dump is made for local instance
Tue Dec 23 08:14:12 2003
Trace dumping is performing id=[cdmp_20031223081410]
Tue Dec 23 08:14:14 2003
Instance terminated by PMON, pid = 22860
<eof>
Instance 1 alert log shows only the reconfiguration and the cdump info:
..
Tue Dec 23 03:54:13 2003
Errors in file /oracle/admin/VLDB/udump/vldbn1_ora_13564.trc:
ORA-27037: unable to obtain file status
SVR4 Error: 2: No such file or directory
Additional information: 4
Tue Dec 23 08:14:10 2003
Trace dumping is performing id=[cdmp_20031223081410]
Tue Dec 23 08:14:12 2003
Reconfiguration started
List of nodes: 0,
Global Resource Directory frozen
one node partition
Communication channels reestablished
...

因为在9.2.0.3的RAC中有着该bug,那么我们可以大胆猜测在9.2.0.2中应该存在该bug,那么结合上面的ORA-00600[kgavsd_3]错误,我们大概还原该事故的全部:
1.节点1 dml操作了程序中报错的plsql中要范围的部分表对象,但是未提交(或者正在执行)
2.节点2 有用户使用pl/sql dev去执行程序中的plsql,因为是debug模式执行,需要UNDOTBS1的块来构建cr,从而使得节点2去访问UNDOTBS1,引发了Bug 3368552 从而使得数据库直接kill掉该plsql dev会话,进而出现ORA-00600[kgavsd_3]错误和pmon进程异常使得节点2 down掉

TXChecker初试

当我们对数据库进行异常恢复,很多时候要选择屏蔽回滚段,但是我们有没有办法来评估屏蔽回滚段到底会对我们的数据库的数据产生多大影响呢?其实我们可以通过TXChecker工具来评估数据库undo异常时候受到影响的数据对象有哪些,从而进一步确定是否真的需要对其undo下手处理.
模拟数据异常事务为提交情况

[oracle@xifenfei ~]$ sqlplus / as sysdba
SQL*Plus: Release 10.2.0.4.0 - Production on Thu Aug 30 20:32:27 2012
Copyright (c) 1982, 2007, Oracle.  All Rights Reserved.
Connected to an idle instance.
SQL> startup;
ORACLE instance started.
Total System Global Area  318767104 bytes
Fixed Size                  1267236 bytes
Variable Size             100665820 bytes
Database Buffers          209715200 bytes
Redo Buffers                7118848 bytes
Database mounted.
Database opened.
SQL> create table t_xifenfei tablespace system
  2  as
  3  select * from dba_objects;
Table created.
SQL> create table chf.t_xifenfei tablespace users
  2  as
  3  select * from dba_objects;
Table created.
SQL> delete from t_xifenfei where rownum<10;
9 rows deleted.
SQL> delete from chf.t_xifenfei where rownum<100;
99 rows deleted.
SQL> alter system checkpoint;
System altered.
SQL> shutdow abort;
ORACLE instance shut down.

在system和users表空间分别模拟了sys和chf的t_xifenfei表含事务未提交情况

TXChecker用法说明

[oracle@xifenfei txchecker]$ ./TXChecker
TXChecker - v1.4 by Center Of Expertise (COE), Oracle Corporation (build 10/16/07)
Usage is: TXChecker [options]
Options:
  -a                     When scanning datafiles (with -d/-f/-l/-t options) report objects using any of the undo
                         segments (not just those with errors) (OPTIONAL)
  -b                     For objects found, print the datablock addresses. See readme for further details (OPTIONAL)
  -c<controlfile_name>   Fully qualified controlfile name to read (MANDATORY)
  -d                     Scan database for active TXs (use when undo not available) (OPTIONAL)
  -f<filename>           Scan the named datafile for active TXs (OPTIONAL)
  -g                     Indicates you want to find all blocks taking part in transactions with
                         a USN > than the USN supplied in -x<XID> parameter (same constraints as -w) (OPTIONAL)
  -l<listfile>           Scan all the datafiles listed in the listfile for active TXs (OPTIONAL)
  -m<minutes>            Number of minutes used to consider a TX as active (1-120) (DEFAULTS TO 60 MINUTES)
  -p                     Show the names and last known status of the UNDO segments (OPTIONAL)
  -s                     Skip read-only or offline normal datafiles (OPTIONAL)
  -t<tablespace>         Scan all the datafiles for this tablespace (OPTIONAL)
  -u                     Report ITL entries active if marked with an upper bound ('U' flag) fast commit SCN
                         instead if active transactions (OPTIONAL)
  -w<wrap#>              Wrap# for XID in ITL entry to report blocks where wrap# > this one (OPTIONAL)
                         Must use -x with this option. See the readme for details
  -x<XID>                XID for transaction wanting to search for (OPTIONAL)
                         Use format rrrr.ssss.wwwwwwww using Hexadecimal numbers
                         See the readme for full instructions on using -x, -w and -g options
NOTE: Options -d/-f/-l/-t are exclusive, and only one should be specified.
[/sql]
<strong>TXChecker初始</strong>

[oracle@xifenfei txchecker]$ ./TXChecker -c/u01/oracle/oradata/XFF/control01.ctl -d -a
TXChecker - v1.4 by Center Of Expertise (COE), Oracle Corporation (build 10/16/07)
Database Name: XFF   Version: 10.2.0
*** Database last checkpointed at 08/30/2012 20:34:43 (SCN: 0xa.0x1e142)
*** Using 60 minutes to find most active transactions (-m60)
*** Undo Segments:
USN:   0  Name: SYSTEM          TBS#:  0 File:   1  Block:      9  Instance:  0  SMU: N  Status:  3 - Online
                                SCN: 0000.00000000   XactSQN: 0x00000000   UndoSQN: 0x00000000
USN:   1  Name: _SYSSMU1$       TBS#:  1 File:   2  Block:      9  Instance:  0  SMU: Y  Status:  3 - Online
                                SCN: 000a.000191c3   XactSQN: 0x000000d2   UndoSQN: 0x0000013c
…………省略
USN:   9  Name: _SYSSMU9$       TBS#:  1 File:   2  Block:    137  Instance:  0  SMU: Y  Status:  3 - Online
                                SCN: 000a.000191d8   XactSQN: 0x0000011a   UndoSQN: 0x000000dc
USN:  10  Name: _SYSSMU10$      TBS#:  1 File:   2  Block:    153  Instance:  0  SMU: Y  Status:  3 - Online
                                SCN: 000a.000191d7   XactSQN: 0x000000c1   UndoSQN: 0x000000d9
…………省略
USN:  20  Name: _SYSSMU20$      TBS#:  5 File:   5  Block:    153  Instance:  0  SMU: Y  Status:  1 - Invalid / Dropped
                                SCN: 0000.00070ec6   XactSQN: 0x00000002   UndoSQN: 0x00000001
*** Active Transactions:
USN: 6  Name: _SYSSMU6$  File: 2  Block: 89  Instance: 0  Status: 3 - Online
* Active TX at slot 6  #undo blocks: 3  Last bk: 2.90
Obj#:  51938  Name: CHF.T_XIFENFEI                                    Type: TABLE               Undo recs: 99
              Used undo segment IDs: 6
Obj#:  51937  Name: SYS.T_XIFENFEI                                    Type: TABLE               Undo recs: 9
              Used undo segment IDs: 6
         File (validate_objects.sql) being created for object validattion already exists
         and will be overwritten!!!
         Do you want to continue overwriting [Y/N]?y
*** Undo segments (headers) that encountered errors preventing Active TX scan:
USN:  11  Name: _SYSSMU11$      File:   5  Block:       9  Instance:  0  Error: 27 - Undo segment was dropped
…………省略
USN:  20  Name: _SYSSMU20$      File:   5  Block:     153  Instance:  0  Error: 27 - Undo segment was dropped
WARNING: Analyzing the full database for active transactions will take some time!
         Are you sure you want to analyze the full database [Y/N]?
         Are you sure you want to analyze the full database [Y/N]?y
*** Scanning database for datablocks that may require undo (PLEASE WAIT...):
*** Asterisk ('*') denotes blocks being updated since 08/08/2012 03:30:32 (SCN: 0x0.0x79607)
Scanning datafile:  3 - /u01/oracle/oradata/XFF/sysaux01.dbf (SYSAUX)                           - Active TX blocks: 147 *
Scanning datafile:  1 - /u01/oracle/oradata/XFF/system01.dbf (SYSTEM)                           - Active TX blocks: 11597 *
--undo表空间跳过
Undo datafile (/u01/oracle/oradata/XFF/undotbs01.dbf) - SKIPPING
Scanning datafile:  4 - /u01/oracle/oradata/XFF/users01.dbf (USERS)                             - Active TX blocks: 2 *
--因为数据文件丢失控制文件中offline的跳过(其实只要数据文件丢失就会跳过)
Cannot access datafile (/u01/oracle/oradata/XFF/xifenfei01.dbf) (error 2 - No such file or directory) - SKIPPING
Cannot access datafile (/u01/oracle/oradata/XFF/xifenfei02.dbf) (error 2 - No such file or directory) - SKIPPING
Scanning datafile:  7 - /u01/oracle/oradata/XFF/xifenfei03.dbf (XIFENFEI2)                      - Active TX blocks: 0
*** Objects that may require undo data:
*** Asterisk ('*') denotes blocks being updated since 08/08/2012 03:30:32 (SCN: 0x0.0x79607)
DataObj#:  51938  Name: CHF.T_XIFENFEI                                    Type: TABLE               Datablocks: 2 *
Used undo segment IDs: 6
DataObj#:  46434  Name: MDSYS.SYS_IL0000046432C00006$$                    Type: INDEX               Datablocks: 4
Used undo segment IDs: 2
DataObj#:    124  Name: SYS.I_ACCESS1                                     Type: INDEX               Datablocks: 27
Used undo segment IDs: 1 2 3 4 5 6 7 8 9 10
…………省略
DataObj#:   8824  Name: SYS.SYS_IL0000008822C00008$$                      Type: INDEX               Datablocks: 1 *
Used undo segment IDs: 4
DataObj#:  51937  Name: SYS.T_XIFENFEI                                    Type: TABLE               Datablocks: 1 *
Used undo segment IDs: 6
DataObj#:  51557  Name: SYS.UTL_RECOMP_COMPILED                           Type: TABLE               Datablocks: 1
Used undo segment IDs: 8
…………省略
DataObj#:  42131  Name: XDB.XDB$ELEMENT_PROPNUMBER                        Type: INDEX               Datablocks: 1
Used undo segment IDs: 2
*** Use validate_objects.sql script file to validate the structure of possibly corrupt objects
    if the undo required is not available.
Undo Segment Usage Summary
**************************
*** Undo segments identified in use by active transaction datablocks AFTER 08/08/2012 03:30:32 (SCN: 0x0.0x79607):
USN:   1  Name: _SYSSMU1$
USN:   2  Name: _SYSSMU2$
USN:   3  Name: _SYSSMU3$
USN:   4  Name: _SYSSMU4$
USN:   5  Name: _SYSSMU5$
USN:   6  Name: _SYSSMU6$
USN:   9  Name: _SYSSMU9$
*** Undo segments identified in use by active transacation datablocks BEFORE 08/08/2012 03:30:32 (SCN: 0x0.0x79607):
USN:   1  Name: _SYSSMU1$
USN:   2  Name: _SYSSMU2$
USN:   3  Name: _SYSSMU3$
USN:   4  Name: _SYSSMU4$
USN:   5  Name: _SYSSMU5$
USN:   7  Name: _SYSSMU7$
USN:   8  Name: _SYSSMU8$
USN:   9  Name: _SYSSMU9$
USN:  10  Name: _SYSSMU10$
NOTE!NOTE!NOTE!NOTE!NOTE!NOTE!NOTE!NOTE!NOTE!NOTE!NOTE!NOTE!NOTE!NOTE!NOTE!NOTE!NOTE!
*************************************************************************************
Upload the logfile (TXChecker_083012_2042_XFF.log) to Oracle Support Services for analysis.
Do NOT attempt to force the database open until the logfile has been analyzed.

验证对象脚本

[oracle@xifenfei txchecker]$ more validate_objects.sql
rem validate_objects.sql - checks strcuture of objects needing unavailable undo data
rem                        Created by findtxns program, Oracle Corporation
set echo on
ANALYZE TABLE CHF.T_XIFENFEI VALIDATE STRUCTURE CASCADE;
ANALYZE INDEX MDSYS.SYS_IL0000046432C00006$$ VALIDATE STRUCTURE;
ANALYZE INDEX SYS.I_ACCESS1 VALIDATE STRUCTURE;
…………省略
ANALYZE INDEX XDB.SYS_C003167 VALIDATE STRUCTURE;
ANALYZE INDEX XDB.XDB$ELEMENT_PROPNAME VALIDATE STRUCTURE;
ANALYZE INDEX XDB.XDB$ELEMENT_PROPNUMBER VALIDATE STRUCTURE;