手动提交分布式事务一例

一.alert文件中出现了很多类此记录

Fri Feb 10 05:25:01 2012
Errors in file /tmp/recover/ahcx216_reco_7956.trc:
ORA-12154: TNS:could not resolve service name
Fri Feb 10 05:25:01 2012
Errors in file /tmp/recover/ahcx216_reco_7956.trc:
ORA-12154: TNS:could not resolve service name

这里可以看出来两个信息:
1.出错的进程是rec0进程,而该进程的作用是解决分布式事务失败后遗留问题(事务提交或者回滚)
2.错误信息是tns不能被解析

二.查看trace文件

/tmp/recover/ahcx216_reco_7956.trc
Oracle9i Enterprise Edition Release 9.2.0.8.0 - Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.8.0 - Production
ORACLE_HOME = /opt/oracle/product/9.2.0/db_1
System name:    Linux
Node name:      localhost.localdomain
Release:        2.6.9-89.0.0.0.1.ELhugemem
Version:        #1 SMP Tue May 19 04:38:38 EDT 2009
Machine:        i686
Instance name: ahcx216
Redo thread mounted by this instance: 1
Oracle process number: 7
Unix process pid: 7956, image: oracle@localhost.localdomain (RECO)
*** SESSION ID:(6.1) 2012-02-10 04:58:24.886
*** 2012-02-10 04:58:24.886
ERROR, tran=6.1.712757, session#=1, ose=0:
ORA-12154: TNS:could not resolve service name
………………
*** 2012-02-10 05:25:01.580
ERROR, tran=6.1.712757, session#=1, ose=0:
ORA-12154: TNS:could not resolve service name
*** 2012-02-10 05:25:01.627
ERROR, tran=12.19.99059, session#=1, ose=0:
ORA-12154: TNS:could not resolve service name

通过这里我们可以看出事务id分别为12.19.99059和6.1.712757有问题

三.查看dba_2pc_pending视图

SQL> select local_tran_id,state,fail_time,retry_time from dba_2pc_pending;
LOCAL_TRAN_ID     STATE            FAIL_TIME           RETRY_TIME
-------------- ---------------- --------------       --------------
6.1.712757	collecting	2010/12/1 13:39:03   2012/2/10 5:38:52
12.19.99059	collecting	2010/12/1 15:56:26   2012/2/10 5:38:52

可以看出,果然有两个分布式事务在2010年12月1日出了问题(本库是一个问题库,在把库拉起来后发现该问题)
因为是异机恢复,而且间隔时间较长,很多tns的信息都已经不存在,所以需要手工提交分布式事务

四.手动提交事务

SQL> BEGIN
  2  DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY('6.1.712757');
  3  DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY('12.19.99059');
  4  END;
  5  /
BEGIN
*
第 1 行出现错误:
ORA-30019: Illegal rollback Segment operation in Automatic Undo mode
ORA-06512: at "SYS.DBMS_TRANSACTION", line 65
ORA-06512: at "SYS.DBMS_TRANSACTION", line 85
ORA-06512: at line 2
SQL> alter session set "_smu_debug_mode"=4;
Session altered.
-------------------------------------------------------------
--设置UNDO_SUPPRESS_ERRORS=true也可以解决此问题
alter system set UNDO_SUPPRESS_ERRORS = TRUE;
EXECUTE DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY('<事务ID>');
commit;
alter system set UNDO_SUPPRESS_ERRORS = false;
--------------------------------------------------------------
SQL> commit;
SQL> exec DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY('6.1.712757');
PL/SQL procedure successfully completed.
SQL> exec DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY('12.19.99059');
BEGIN DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY('12.19.99059'); END;
*
ERROR at line 1:
ORA-01453: SET TRANSACTION must be first statement of transaction
ORA-06512: at "SYS.DBMS_TRANSACTION", line 65
ORA-06512: at "SYS.DBMS_TRANSACTION", line 85
ORA-06512: at line 1
--第一个分布式事务处理后,未提交导致
SQL> commit;
Commit complete.
SQL> exec DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY('12.19.99059');
PL/SQL procedure successfully completed.
SQL> commit;
Commit complete.

五.补充说明
开始在另一个会话中,执行失败原因

SQL> alter session set "_smu_debug_mode"=4;
会话已更改。
SQL> commit;
提交完成。
SQL> BEGIN
  2  DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY('6.1.712757');
  3  DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY('12.19.99059');
  4  END;
  5  /
BEGIN
*
第 1 行出现错误:
ORA-01453: SET TRANSACTION must be first statement of transaction
ORA-06512: at "SYS.DBMS_TRANSACTION", line 65
ORA-06512: at "SYS.DBMS_TRANSACTION", line 85
ORA-06512: at line 3

因为这里的begin end中包含了两个事务的清理,在清理完第一个事务之后,需要提交才能够清理第二个,这里因为没有提交,导致ORA-01453错误。

通过sql语句获取awr/statspack逻辑读/物理读

在日常的性能监控中,我们经常需要需要通过一段时间内数据库的逻辑读/物理读来大致反映数据库的性能情况,这里列出通过awr和statspack来获取相关数据(查询最近30天数据,除掉第一条和最后一条数据)
awr逻辑读

WITH A AS
 (SELECT B.SNAP_ID, SUM(VALUE)/1024/1024/1024*8192 VALUE
    FROM DBA_HIST_SYSSTAT B
   WHERE B.DBID = (SELECT DBID FROM V$DATABASE)
          AND b.INSTANCE_NUMBER = &INST_NUM
     AND B.STAT_NAME IN ('session logical reads')
   GROUP BY B.SNAP_ID
   ORDER BY SNAP_ID)
SELECT A.SNAP_ID,
       LAG(VALUE, 1, '0') OVER(ORDER BY A.SNAP_ID) "START_VALUE(G)",
       VALUE "END_VALUE(G)",
       TO_CHAR(END_INTERVAL_TIME,'YYYY-MM-DD HH24:MI:SS')"END_TIME",
       VALUE - LAG(VALUE, 1, '0') OVER(ORDER BY A.SNAP_ID) "D-VALUE(G)"
  FROM A,(SELECT END_INTERVAL_TIME,SNAP_ID
    FROM DBA_HIST_SNAPSHOT B
   WHERE  B.DBID = (SELECT dbid FROM v$database)
     AND B.INSTANCE_NUMBER = &INST_NUM) B
     WHERE A.SNAP_ID=B.snap_id
     AND END_INTERVAL_TIME>=SYSDATE-30;

awr物理读

WITH A AS
 (SELECT B.SNAP_ID, SUM(VALUE)/1024/1024/1024*8192 VALUE
    FROM DBA_HIST_SYSSTAT B
   WHERE B.DBID = (SELECT DBID FROM V$DATABASE)
          AND b.INSTANCE_NUMBER = &INST_NUM
     AND B.STAT_NAME IN ('physical reads')
   GROUP BY B.SNAP_ID
   ORDER BY SNAP_ID)
SELECT A.SNAP_ID,
       LAG(VALUE, 1, '0') OVER(ORDER BY A.SNAP_ID) "START_VALUE(G)",
       VALUE "END_VALUE(G)",
       TO_CHAR(END_INTERVAL_TIME,'YYYY-MM-DD HH24:MI:SS')"END_TIME",
       VALUE - LAG(VALUE, 1, '0') OVER(ORDER BY A.SNAP_ID) "D-VALUE(G)"
  FROM A,(SELECT END_INTERVAL_TIME,SNAP_ID
    FROM DBA_HIST_SNAPSHOT B
   WHERE  B.DBID = (SELECT dbid FROM v$database)
     AND B.INSTANCE_NUMBER = &INST_NUM) B
     WHERE A.SNAP_ID=B.snap_id
     AND END_INTERVAL_TIME>=SYSDATE-30;

statspack逻辑读

WITH A AS
 (SELECT B.SNAP_ID, SUM(VALUE)/1024/1024/1024*8192 VALUE
    FROM STATS$SYSSTAT B
   WHERE B.DBID = (SELECT DBID FROM V$DATABASE)
     AND B.INSTANCE_NUMBER = &INST_NUM
     AND B.NAME IN ('session logical reads')
   GROUP BY B.SNAP_ID
   ORDER BY SNAP_ID)
SELECT A.SNAP_ID,
       LAG(VALUE, 1, '0') OVER(ORDER BY A.SNAP_ID) "START_VALUE(G)",
       VALUE "END_VALUE(G)",
       TO_CHAR(SNAP_TIME, 'YYYY-MM-DD HH24:MI:SS') "END_TIME",
       VALUE - LAG(VALUE, 1, '0') OVER(ORDER BY A.SNAP_ID) "D-VALUE(G)"
  FROM A,
       (SELECT SNAP_TIME, SNAP_ID
          FROM STATS$SNAPSHOT B
         WHERE B.DBID = (SELECT DBID FROM V$DATABASE)
           AND B.INSTANCE_NUMBER = &INST_NUM) B
 WHERE A.SNAP_ID = B.SNAP_ID
  AND snap_time>=SYSDATE-30;

statspack物理读

WITH A AS
 (SELECT B.SNAP_ID, SUM(VALUE)/1024/1024/1024*8192 VALUE
    FROM STATS$SYSSTAT B
   WHERE B.DBID = (SELECT DBID FROM V$DATABASE)
     AND B.INSTANCE_NUMBER = &INST_NUM
     AND B.NAME IN ('physical reads')
   GROUP BY B.SNAP_ID
   ORDER BY SNAP_ID)
SELECT A.SNAP_ID,
       LAG(VALUE, 1, '0') OVER(ORDER BY A.SNAP_ID) "START_VALUE(G)",
       VALUE "END_VALUE(G)",
       TO_CHAR(SNAP_TIME, 'YYYY-MM-DD HH24:MI:SS') "END_TIME",
       VALUE - LAG(VALUE, 1, '0') OVER(ORDER BY A.SNAP_ID) "D-VALUE(G)"
  FROM A,
       (SELECT SNAP_TIME, SNAP_ID
          FROM STATS$SNAPSHOT B
         WHERE B.DBID = (SELECT DBID FROM V$DATABASE)
           AND B.INSTANCE_NUMBER = &INST_NUM) B
 WHERE A.SNAP_ID = B.SNAP_ID
 AND snap_time>=SYSDATE-30;

ODU恢复PACKAGE/VIEW/DBLINK等数据脚本操作

当数据库不能拉起来时候,我们可以通过odu找回丢失的数据,今天客户出要求,需要我帮忙找回不能起来库中的包,过程,函数,视图,触发器等之类东西。这些东西都是存在system空间中,存在一些系统的基表中,我们如果能够拿到这些基表的数据然后进行处理,原则上就可以得到客户需要的东西
一、PROCEDURE/FUNCTION/PACKAGE/PACKAGE BODY/TRIGGER/TYPE/TYPE BODY之类恢复
1. 查看这些数据存储在什么基表中

select u.name, o.name,
decode(o.type#, 7, 'PROCEDURE', 8, 'FUNCTION', 9, 'PACKAGE',
               11, 'PACKAGE BODY', 12, 'TRIGGER', 13, 'TYPE', 14, 'TYPE BODY',
               'UNDEFINED'),
s.line, s.source
from sys.obj$ o, sys.source$ s, sys.user$ u
where o.obj# = s.obj#
  and o.owner# = u.user#
  and ( o.type# in (7, 8, 9, 11, 12, 14) OR
       ( o.type# = 13 AND o.subname is null))
union all
select u.name, o.name, 'JAVA SOURCE', s.joxftlno, s.joxftsrc
from sys.obj$ o, x$joxfs s, sys.user$ u
where o.obj# = s.joxftobn
  and o.owner# = u.user#
  and o.type# = 28;

通过这个sql知道主要存储在sys.obj$ o, sys.source$ s, sys.user$中(至于x$joxfs表,我暂时在数据库中未找到,不能使用odu提取数据,也就是说我这里不能恢复出来JAVA SOURCE)

2. 使用odu导出数据(注意output_format设置为dmp,防止出现意外)
使用类此unload table sys.obj$ 下面语句导出数据

3. 加载这些数据到新库中(切记不要使用sys schema)
使用类此imp chf/xifenfei file=d:/SYS_OBJ$.dmp fromuser=sys touser=chf导入数据

4. 在查询出需要对象

   SELECT u.name USER_NAME,
         o.name OBJECT_nAME,
         DECODE (o.type#,
                 7, 'PROCEDURE',
                 8, 'FUNCTION',
                 9, 'PACKAGE',
                 11, 'PACKAGE BODY',
                 12, 'TRIGGER',
                 13, 'TYPE',
                 14, 'TYPE BODY',
                 'UNDEFINED')
            OBJECT_tYPE,
         s.line,
         s.source
    FROM chf.obj$ o, chf.source$ s, chf.user$ u
   WHERE o.obj# = s.obj# AND o.owner# = u.user#
         AND (o.type# IN (7, 8, 9, 11, 12, 14)
              OR (o.type# = 13 AND o.subname IS NULL))
--过来条件选择出需要对象
-- AND U.name='BSSADMIN'
ORDER BY U.NAME,
         O.NAME,
         O.TYPE#,
         S.LINE;

5. 使用说明
在line=1的source前面加上create or replace ,使用合适的用户创建需要对象

二、VIEW恢复
1. 找出视图存储基表
sys.obj$,sys.view$,sys.user$
2. 使用odu导出需要表
3. 使用imp导入数据库
4. 查询语句

    SELECT u.name username, o.name viewname, v.text
  FROM chf.obj$ o, chf.view$ v, chf.user$ u
 WHERE o.obj# = v.obj# AND o.owner# = u.user#
 --过滤条件,挑选需要视图
 AND u.name = 'MAS_ADMIN';

5. 使用说明
create VIEW OR REPLACE username.viewname AS+TEXT中内容

三、dblink恢复
1. 找出关联基表
Sys.link$和sys.user$
2. odu导出相关表
3. imp导入数据
4. 查询语句

SELECT U.NAME   USERNAME,
       L.NAME   DBLINK_NAME,
       L.USERID LOGIN_USER,
       L.HOST   TNS,
       L.CTIME
  FROM CHF.LINK$ L, CHF.USER$ U
 WHERE L.OWNER# = U.USER#
  --过滤条件
   AND U.NAME = 'PUBLIC'

5. 使用说明
根据查询出来信息,自己创建DBLINK

FAST_START_PARALLEL_ROLLBACK与回滚恢复

1.模拟产生大事务需回滚

[oracle@node1 ~]$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.3.0 Production on Thu Feb 16 12:47:08 2012
Copyright (c) 1982, 2011, Oracle.  All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, Oracle Label Security, OLAP, Data Mining,
Oracle Database Vault and Real Application Testing options
SQL> show sga;
Total System Global Area  622149632 bytes
Fixed Size                  2230912 bytes
Variable Size             406848896 bytes
Database Buffers          205520896 bytes
Redo Buffers                7548928 bytes
SQL> create table chf.xifenfei tablespace users
  2  as
  3  select * from dba_objects;
--下面两句多次执行
SQL> insert into chf.xifenfei
  2  select * from chf.xifenfei;
73831 rows created.
SQL> commit;
Commit complete.
SQL> select count(*) from  chf.xifenfei;
  COUNT(*)
----------
  18900736
SQL> select bytes/1024/1024 from dba_segments where segment_name='XIFENFEI';
BYTES/1024/1024
---------------
           2103
--删除数据不提交
SQL> delete from chf.xifenfei;
18900736 rows deleted.
--直接kill掉ora_dbw进程

2.FAST_START_PARALLEL_ROLLBACK=LOW(默认值)

SQL> select  undoblockstotal "Total", undoblocksdone "Done", undoblockstotal-undoblocksdone
  2   "ToDo",decode(cputime,0,'unknown',to_char(sysdate+(((undoblockstotal-undoblocksdone)
  3   / (undoblocksdone / cputime)) / 86400),'yyyy-mm-dd hh24:mi:ss'))
  4  "Estimated time to complete",to_char(sysdate,'yyyy-mm-dd hh24:mi:ss')
  5  from v$fast_start_transactions;
     Total       Done       ToDo Estimated time to c TO_CHAR(SYSDATE,'YY
---------- ---------- ---------- ------------------- -------------------
    545624     103020     442604 2012-02-16 13:57:38 2012-02-16 13:47:02
SQL> /
     Total       Done       ToDo Estimated time to c TO_CHAR(SYSDATE,'YY
---------- ---------- ---------- ------------------- -------------------
    545624     122614     423010 2012-02-16 13:57:42 2012-02-16 13:47:31
--每秒钟回滚undo数据块数量
SQL> select (122614-103020)/29 from dual;
(122614-103020)/29
------------------
        675.655172
--数据库并发回滚进程数
[oracle@node1 ~]$ ps -ef|grep ora_p0
oracle   24901     1  4 13:44 ?        00:00:15 ora_p000_chf
oracle   24903     1  3 13:44 ?        00:00:12 ora_p001_chf
oracle   24905     1  3 13:44 ?        00:00:12 ora_p002_chf
oracle   24907     1  3 13:44 ?        00:00:12 ora_p003_chf
oracle   24909     1  3 13:44 ?        00:00:12 ora_p004_chf
oracle   24911     1  3 13:44 ?        00:00:12 ora_p005_chf
oracle   24913     1  3 13:44 ?        00:00:12 ora_p006_chf
oracle   24915     1  3 13:44 ?        00:00:12 ora_p007_chf
oracle   24917     1  3 13:44 ?        00:00:12 ora_p008_chf
oracle   24919     1  3 13:44 ?        00:00:12 ora_p009_chf
oracle   24921     1  3 13:44 ?        00:00:12 ora_p010_chf
oracle   24923     1  3 13:44 ?        00:00:12 ora_p011_chf
oracle   24925     1  3 13:44 ?        00:00:12 ora_p012_chf
oracle   24927     1  3 13:44 ?        00:00:12 ora_p013_chf
oracle   24929     1  3 13:44 ?        00:00:12 ora_p014_chf
oracle   24931     1  3 13:44 ?        00:00:12 ora_p015_chf
说明:该机器操作系统是8个CPU
并发数=CPU*2

3.FAST_START_PARALLEL_ROLLBACK =HIGH

SQL> alter system set FAST_START_PARALLEL_ROLLBACK =HIGH;
System altered.
SQL> select  undoblockstotal "Total", undoblocksdone "Done", undoblockstotal-undoblocksdone
  2   "ToDo",decode(cputime,0,'unknown',to_char(sysdate+(((undoblockstotal-undoblocksdone)
  3   / (undoblocksdone / cputime)) / 86400),'yyyy-mm-dd hh24:mi:ss'))
  4  "Estimated time to complete",to_char(sysdate,'yyyy-mm-dd hh24:mi:ss')
  5  from v$fast_start_transactions;
     Total       Done       ToDo Estimated time to c TO_CHAR(SYSDATE,'YY
---------- ---------- ---------- ------------------- -------------------
    380434      25351     355083 2012-02-16 14:03:53 2012-02-16 13:49:39
SQL> /
     Total       Done       ToDo Estimated time to c TO_CHAR(SYSDATE,'YY
---------- ---------- ---------- ------------------- -------------------
    380434      39592     340842 2012-02-16 14:03:33 2012-02-16 13:50:12
--每秒钟回滚undo数据块数量
SQL> select (39592-25351)/33 from dual;
(39592-25351)/33
----------------
      431.545455
--数据库并发回滚进程数
[oracle@node1 ~]$ ps -ef|grep ora_p0
oracle   24901     1  4 13:44 ?        00:00:15 ora_p000_chf
oracle   24903     1  3 13:44 ?        00:00:12 ora_p001_chf
oracle   24905     1  3 13:44 ?        00:00:12 ora_p002_chf
oracle   24907     1  3 13:44 ?        00:00:12 ora_p003_chf
oracle   24909     1  3 13:44 ?        00:00:12 ora_p004_chf
oracle   24911     1  3 13:44 ?        00:00:12 ora_p005_chf
oracle   24913     1  3 13:44 ?        00:00:12 ora_p006_chf
oracle   24915     1  3 13:44 ?        00:00:12 ora_p007_chf
oracle   24917     1  3 13:44 ?        00:00:12 ora_p008_chf
oracle   24919     1  3 13:44 ?        00:00:12 ora_p009_chf
oracle   24921     1  3 13:44 ?        00:00:12 ora_p010_chf
oracle   24923     1  3 13:44 ?        00:00:12 ora_p011_chf
oracle   24925     1  3 13:44 ?        00:00:12 ora_p012_chf
oracle   24927     1  3 13:44 ?        00:00:12 ora_p013_chf
oracle   24929     1  3 13:44 ?        00:00:12 ora_p014_chf
oracle   24931     1  3 13:44 ?        00:00:12 ora_p015_chf
oracle   25072     1  0 13:48 ?        00:00:01 ora_p016_chf
oracle   25074     1  0 13:48 ?        00:00:01 ora_p017_chf
oracle   25076     1  0 13:48 ?        00:00:01 ora_p018_chf
oracle   25078     1  0 13:48 ?        00:00:01 ora_p019_chf
oracle   25080     1  0 13:48 ?        00:00:01 ora_p020_chf
oracle   25082     1  0 13:48 ?        00:00:01 ora_p021_chf
oracle   25084     1  0 13:48 ?        00:00:01 ora_p022_chf
oracle   25086     1  0 13:48 ?        00:00:01 ora_p023_chf
oracle   25088     1  0 13:48 ?        00:00:01 ora_p024_chf
oracle   25090     1  0 13:48 ?        00:00:01 ora_p025_chf
oracle   25092     1  0 13:48 ?        00:00:01 ora_p026_chf
oracle   25094     1  0 13:48 ?        00:00:01 ora_p027_chf
oracle   25096     1  0 13:48 ?        00:00:01 ora_p028_chf
oracle   25098     1  0 13:48 ?        00:00:01 ora_p029_chf
oracle   25100     1  0 13:48 ?        00:00:01 ora_p030_chf
oracle   25102     1  0 13:48 ?        00:00:01 ora_p031_chf
1.说明问题:直接修改FAST_START_PARALLEL_ROLLBACK =HIGH后,
  数据库在原来并发进程基础上,又重新启动额外进程
2.修改FAST_START_PARALLEL_ROLLBACK后,以前回滚过的数据块是成功的
  v$fast_start_transactions视图重新开始计算
3.并发数=CPU*4
[/shell]
<strong>4.FAST_START_PARALLEL_ROLLBACK=FALSE</strong>

SQL> alter system set FAST_START_PARALLEL_ROLLBACK=FALSE;
System altered.
--直接修改为FALSE后,观察到数据库的并发等资源都没有释放,重启数据库释放资源继续试验
SQL> startup force;
ORACLE instance started.
Total System Global Area  622149632 bytes
Fixed Size                  2230912 bytes
Variable Size             406848896 bytes
Database Buffers          205520896 bytes
Redo Buffers                7548928 bytes
Database mounted.
Database opened.
SQL> select  undoblockstotal "Total", undoblocksdone "Done", undoblockstotal-undoblocksdone
  2   "ToDo",decode(cputime,0,'unknown',to_char(sysdate+(((undoblockstotal-undoblocksdone)
  3   / (undoblocksdone / cputime)) / 86400),'yyyy-mm-dd hh24:mi:ss'))
  4  "Estimated time to complete",to_char(sysdate,'yyyy-mm-dd hh24:mi:ss')
  5  from v$fast_start_transactions;
     Total       Done       ToDo Estimated time to c TO_CHAR(SYSDATE,'YY
---------- ---------- ---------- ------------------- -------------------
    306828      15735     291093 2012-02-16 14:04:34 2012-02-16 13:52:33
SQL> /
     Total       Done       ToDo Estimated time to c TO_CHAR(SYSDATE,'YY
---------- ---------- ---------- ------------------- -------------------
    306828      65861     240967 2012-02-16 14:05:15 2012-02-16 13:54:46
--每秒钟回滚undo数据块数量
SQL> select (65861-15735)/133 from dual;
(65861-15735)/133
-----------------
       376.887218
--数据库并发回滚进程数
[oracle@node1 ~]$ ps -ef|grep ora_p0
oracle   25252     1  1 13:51 ?        00:00:00 ora_p000_chf
oracle   25254     1  0 13:51 ?        00:00:00 ora_p001_chf
oracle   25256     1  1 13:51 ?        00:00:00 ora_p002_chf
oracle   25258     1  1 13:51 ?        00:00:00 ora_p003_chf
oracle   25260     1  1 13:51 ?        00:00:00 ora_p004_chf
oracle   25262     1  1 13:51 ?        00:00:00 ora_p005_chf
oracle   25264     1  1 13:51 ?        00:00:00 ora_p006_chf
1.数据库重启后,以前的回滚依然生效(v$fast_start_transactions.undoblockstotal变小)
2.FAST_START_PARALLEL_ROLLBACK=false还是有并发,而非官方文档描述(Parallel rollback is disabled)

5.总结
通过这三种情况下的每秒钟回滚undo数据块数量比较可以知道在LOW状态下最快,HIGH状态下次之,FALSE最慢。其实这个实验没有任何实际说明力,只是想说明几个问题:
1)Oracle大事物回滚,是没有办法取消,但是可以通过FAST_START_PARALLEL_ROLLBACK干预回滚速度
2)数据库的并发效率高于低,取决于系统的资源情况(如果你系统的cpu非常强大,那么可能设置HIGH速度最快)
3)回滚的数据类型,在回滚表中数据时可能设置并发比FALSE快,
但是如果是要回滚串行数据(如:index),那么可能串行方法方式速度更快
4)根据你的系统的使用状况,比如你想让系统的业务受到的影响最小,那么设置FALSE可能是个不错的选择。
6.补充官方说明

FAST_START_PARALLEL_ROLLBACK specifies the degree of parallelism used when recovering terminated transactions.
Terminated transactions are transactions that are active before a system failure.
If a system fails when there are uncommitted parallel DML or DDL transactions,
then you can speed up transaction recovery during startup by using this parameter.
Values:
FALSE
Parallel rollback is disabled
LOW
Limits the maximum degree of parallelism to 2 * CPU_COUNT
HIGH
Limits the maximum degree of parallelism to 4 * CPU_COUNT
Note:If you change the value of this parameter,
then transaction recovery will be stopped and restarted with the new implied degree of parallelism.

遭遇ORA-07445[kkdliac()+346]使用odu抢救数据

1.Oracle启动报错

-bash-2.05b$ sqlplus "/ as sysdba"
SQL*Plus: Release 9.2.0.8.0 - Production on 星期三 2月 15 10:31:53 2012
Copyright (c) 1982, 2002, Oracle Corporation.  All rights reserved.
Connected to an idle instance.
SQL> startup mount
ORACLE instance started.
Total System Global Area 1527846824 bytes
Fixed Size                   452520 bytes
Variable Size             385875968 bytes
Database Buffers         1140850688 bytes
Redo Buffers                 667648 bytes
Database mounted.
SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-03113: end-of-file on communication channel

2.alert文件

Wed Feb 15 10:32:06 2012
ALTER DATABASE   MOUNT
Wed Feb 15 10:32:10 2012
Successful mount of redo thread 1, with mount id 412256678
Wed Feb 15 10:32:10 2012
Database mounted in Exclusive Mode.
Completed: ALTER DATABASE   MOUNT
Wed Feb 15 10:33:44 2012
alter database open
Wed Feb 15 10:33:44 2012
Beginning crash recovery of 1 threads
Wed Feb 15 10:33:44 2012
Started redo scan
Wed Feb 15 10:33:44 2012
Completed redo scan
 0 redo blocks read, 0 data blocks need recovery
Wed Feb 15 10:33:44 2012
Started recovery at
 Thread 1: logseq 2, block 3, scn 2862.4075508322
Wed Feb 15 10:33:44 2012
Recovery of Online Redo Log: Thread 1 Group 1 Seq 2 Reading mem 0
  Mem# 0 errs 0: /data1z/oracle/oradata/ahcx216/redo01.log
Wed Feb 15 10:33:44 2012
Completed redo application
Wed Feb 15 10:33:44 2012
Ended recovery at
 Thread 1: logseq 2, block 3, scn 2862.4075528323
 0 data blocks read, 0 data blocks written, 0 redo blocks read
Crash recovery completed successfully
Wed Feb 15 10:33:44 2012
Thread 1 advanced to log sequence 3
Thread 1 opened at log sequence 3
  Current log# 3 seq# 3 mem# 0: /data1z/oracle/oradata/ahcx216/redo03.log
Successful open of redo thread 1
Wed Feb 15 10:33:45 2012
SMON: enabling cache recovery
Wed Feb 15 10:33:45 2012
Errors in file /data1z/oracle/admin/ahcx216/udump/ahcx216_ora_21325.trc:
ORA-07445: exception encountered: core dump [kkdliac()+346] [SIGSEGV] [Address not mapped to object] [0x43] [] []

3.trace文件

/data1z/oracle/admin/ahcx216/udump/ahcx216_ora_21325.trc
Oracle9i Enterprise Edition Release 9.2.0.8.0 - Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.8.0 - Production
ORACLE_HOME = /data1z/oracle/product/9.2
System name:	Linux
Node name:	aiserch1
Release:	2.4.21-27.ELsmp
Version:	#1 SMP Wed Dec 1 21:59:02 EST 2004
Machine:	i686
Instance name: ahcx216
Redo thread mounted by this instance: 1
Oracle process number: 12
Unix process pid: 21325, image: oracle@aiserch1 (TNS V1-V3)
*** SESSION ID:(11.13) 2012-02-15 10:33:44.739
Thread checkpoint rba:0x000002.00000002.0010 scn:0x0b2e.f2eb5261
Cache low rba is 0xffffffff.ffffffff.ffff
Use incremental checkpoint on-disk rba
Thread 1 recovery from rba:0x000002.00000003.0000 scn:0x0b2e.f2eb5262
----- Recovery Hash Table Statistics ---------
Hash table buckets = 32768
Longest hash chain = 0
Average hash chain = 0/0 = 0.0
Max compares per lookup = 0
Avg compares per lookup = 0/0 = 0.0
----------------------------------------------
*** 2012-02-15 10:33:44.766
KCRA: start recovery claims for 0 data blocks
*** 2012-02-15 10:33:44.766
KCRA: buffers claimed = 0/0, eliminated = 0
*** 2012-02-15 10:33:44.766
Recovery of Online Redo Log: Thread 1 Group 1 Seq 2 Reading mem 0
----- Recovery Hash Table Statistics ---------
Hash table buckets = 32768
Longest hash chain = 0
Average hash chain = 0/0 = 0.0
Max compares per lookup = 0
Avg compares per lookup = 0/0 = 0.0
----------------------------------------------
Exception signal: 11 (SIGSEGV), code: 1 (Address not mapped to object), addr: 0x43, PC: [0x8880c00, kkdliac()+346]
Registers:
%eax: 0xa0f6b170 %ebx: 0x00000003 %ecx: 0xa0f6b170
%edx: 0x00000001 %edi: 0x00000000 %esi: 0x00000000
%esp: 0xbfff1cb8 %ebp: 0xbfff1cd8 %eip: 0x08880c00
%efl: 0x00010286
  kkdliac()+329 (0x8880bef) jnz 0x8880f61
  kkdliac()+335 (0x8880bf5) mov %edx,%eax
  kkdliac()+337 (0x8880bf7) mov 0xc0(%eax),%edx
  kkdliac()+343 (0x8880bfd) mov %edx,0xffffffe0(%ebp)
> kkdliac()+346 (0x8880c00) movzw 0x40(%ebx),%esi
  kkdliac()+350 (0x8880c04)  mov %si,0xffffffd4(%ebp)
  kkdliac()+354 (0x8880c08) movb 0x42(%ebx),%al
  kkdliac()+357 (0x8880c0b) movb %al,0xffffffd6(%ebp)
  kkdliac()+360 (0x8880c0e) movzw 0x8(%ebx),%edx
*** 2012-02-15 10:33:45.029
ksedmp: internal or fatal error
ORA-07445: exception encountered: core dump [kkdliac()+346] [SIGSEGV] [Address not mapped to object] [0x43] [] []
Current SQL statement for this session:
create table bootstrap$ ( line#         number not null,   obj#           number not null,
sql_text   varchar2(4000) not null)   storage (initial 50K objno 56 extents (file 1 block 377))
----- Call Stack Trace -----
calling              call     entry                argument values in hex
location             type     point                (? means dubious value)
-------------------- -------- -------------------- ----------------------------
ksedmp()+274         call     ksedst()             1 ? 0 ? 0 ? 1 ? 65252838 ?
                                                   2C297862 ?
ssexhd()+1113        call     ksedmp()             3 ? 0 ? 0 ? 0 ? 0 ? 0 ?
__pthread_sighandle  call     00000000             B ? B75CEC90 ? B75CED10 ? 0 ?
r_rt()+122                                         0 ? 0 ?
kkdliac()+346        signal   00000000             B ? B75CEC90 ? B75CED10 ?
ctcdrv()+1688        call     kkdlcob()            BFFF1DC8 ? 2 ? 0 ? 38 ? 0 ?
                                                   0 ? 0 ?
opiexe()+9647        call     ctcdrv()             B749CD78 ? BFFF3258 ?
                                                   BFFF33B0 ?
opiosq0()+1170       call     opiexe()             4 ? 0 ? BFFF379C ?
opiosq()+19          call     opiosq0()            3 ? F ? BFFF4418 ? 0 ?
opiodr()+1133        call     00000000             4A ? F ? BFFF4418 ?
__PGOSF163_rpidrus(  call     opiodr()             4A ? F ? BFFF4418 ? 2 ?
)+145
skgmstack()+137      call     00000000             BFFF3ED0 ? 2 ? BFFF428C ?
                                                   BFFF3EE8 ? BFFF3ED0 ?
                                                   BFFF3EE8 ?
rpidru()+98          call     skgmstack()          BFFF3EE8 ? AF61BC0 ? F618 ?
                                                   821F376 ? BFFF3ED0 ?
rpiswu2()+315        call     00000000             BFFF428C ? 0 ? 0 ? 0 ?
                                                   BFFF3F64 ? F618 ?
rpidrv()+1087        call     rpiswu2()            9D7416FC ? 0 ? BFFF4394 ? 2 ?
                                                   BFFF43B4 ? 0 ?
rpisplu()+298        call     rpidrv()             2 ? 4A ? BFFF4418 ? 8 ?
                                                   9EA3F16C ? B749DE88 ?
rpispl()+28          call     rpisplu()            2 ? 0 ? 0 ? B749DE88 ? BC ?
                                                   0 ? 0 ?
kqlbebs()+781        call     rpispl()             2 ? 0 ? B749DE88 ? BC ? 0 ?
                                                   0 ?
kqlblfc()+106        call     kqlbebs()            0 ? BFFF5590 ?
adbdrv()+2220        call     kqlblfc()            0 ? BFFF5590 ? BFFF5590 ?
opiexe()+10319       call     adbdrv()             B74A022C ? AF5A904 ?
                                                   B74AD040 ? 1 ? 0 ? 1 ?
opiosq0()+1170       call     opiexe()             4 ? 0 ? BFFF6928 ?
kpooprx()+206        call     opiosq0()            3 ? E ? BFFF6A68 ? 24 ?
kpoal8()+599         call     kpooprx()            BFFF7604 ? BFFF7510 ? 13 ?
                                                   1 ? 0 ? 24 ?
opiodr()+1133        call     00000000             5E ? 14 ? BFFF7600 ?
ttcpip()+4250        call     00000000             5E ? 14 ? BFFF7600 ? 0 ?
opitsk()+1077        call     ttcpip()             AF5A900 ? 5E ? BFFF7600 ? 0 ?
                                                   BFFF87D8 ? BFFF87D4 ?
opiino()+1287        call     opitsk()             0 ? 0 ? AF5A900 ? CC349D8 ?
                                                   F1 ? 0 ?
opiodr()+1133        call     00000000             3C ? 4 ? BFFF9B9C ?
opidrv()+418         call     opiodr()             3C ? 4 ? BFFF9B9C ? 0 ?
sou2o()+30           call     opidrv()             3C ? 4 ? BFFF9B9C ?
main()+187           call     sou2o()              BFFF9B80 ? 3C ? 4 ?
                                                   BFFF9B9C ? 890710 ? 0 ?
__libc_start_main()  call     00000000             2 ? BFFF9C04 ? BFFF9C10 ?
+188                                               890518 ? 2 ? 82174E0 ?
--------------------- Binary Stack Dump ---------------------
从这里看出数据库是在open过程中,创建bootstrap$错误,这个错误对于现在的我来说,暂时没有办法去解决。
为了能够抢救出客户需要的其中一个用户下面的数据,我不得不采用odu来解决问题。

4.填写ontrol.txt文件

SQL> set pagesize 1000
SQL> set linesize 200
SQL> col name for a50
SQL> select file#,ts#,rfile#,name from v$datafile;
     FILE#        TS#     RFILE# NAME
---------- ---------- ---------- ----------------------------------------------
         1          0          1 /data1z/oracle/oradata/ahcx216/system01
         2          1          2 /data1z/oracle/oradata/ahcx216/undotbs01
         3          3          3 /data1z/oracle/oradata/ahcx216/CITY
         4          4          4 /data1z/oracle/oradata/ahcx216/DATATS
         5          5          5 /data1z/oracle/oradata/ahcx216/indx01
         6          6          6 /data1z/oracle/oradata/ahcx216/tools01
         7          7          7 /data1z/oracle/oradata/ahcx216/users
         8          4          8 /data1z/oracle/oradata/ahcx216/datats02
         9          4          9 /data1z/oracle/oradata/ahcx216/datats03
      ………………

5.登录odu

bash-2.05b$ ./odu
Oracle Data Unloader:Release 3.0.8
Copyright (c) 2008,2009 XiongJun. All rights reserved.
Web: http://www.laoxiong.net
Email: magic007cn@gmail.com
loading default config.......
byte_order little
block_size  8192
db_timezone -7
client_timezone 8
data_path   data
charset_name ZHS16GBK
ncharset_name AL16UTF16
output_format text
lob_storage infile
clob_byte_order little
load control file 'config.txt' successful
loading default control file ......
 ts#   fn  rfn bsize   blocks bf offset filename
---- ---- ---- ----- -------- -- ------ --------------------------------------------
   0    1    1  8192    32000 N       0 /data1z/oracle/oradata/ahcx216/system01
   1    2    2  8192   524288 N       0 /data1z/oracle/oradata/ahcx216/undotbs01
   3    3    3  8192   524288 N       0 /data1z/oracle/oradata/ahcx216/CITY
   4    4    4  8192   524288 N       0 /data1z/oracle/oradata/ahcx216/DATATS
   5    5    5  8192   524288 N       0 /data1z/oracle/oradata/ahcx216/indx01
………………
load control file 'control.txt' successful
loading dictionary data......

6.加载数据字典

ODU> unload dict
CLUSTER C_USER# file_no: 1 block_no: 89
TABLE OBJ$ file_no: 1 block_no: 121
CLUSTER C_OBJ# file_no: 1 block_no: 25
CLUSTER C_OBJ# file_no: 1 block_no: 25
found IND$'s obj# 19
found IND$'s dataobj#:2,ts#:0,file#:1,block#:25,tab#:3
found TABPART$'s obj# 230
found TABPART$'s dataobj#:230,ts#:0,file#:1,block#:1657,tab#:0
found INDPART$'s obj# 234
found INDPART$'s dataobj#:234,ts#:0,file#:1,block#:1689,tab#:0
found TABSUBPART$'s obj# 240
found TABSUBPART$'s dataobj#:240,ts#:0,file#:1,block#:1737,tab#:0
found INDSUBPART$'s obj# 245
found INDSUBPART$'s dataobj#:245,ts#:0,file#:1,block#:1777,tab#:0
found IND$'s obj# 19
found IND$'s dataobj#:2,ts#:0,file#:1,block#:25,tab#:3
found LOB$'s obj# 156
found LOB$'s dataobj#:2,ts#:0,file#:1,block#:25,tab#:6
found LOBFRAG$'s obj# 258
found LOBFRAG$'s dataobj#:258,ts#:0,file#:1,block#:1881,tab#:0

7.列出用户

ODU> list user
     USER#   USERNAME
----------   ------------------------------
       …………
        20   HS_ADMIN_ROLE
        25   MAS_ADMIN
        24   HF_CX
        21   WMSYS
        22   WM_ADMIN_ROLE
        23   CITY
        26   BB_ADMIN
        27   AQ_ADMIN
        28   WH_ADMIN
        29   LA_ADMIN
       …………

8.列出用户表

ODU> list table TL_ADMIN
      OBJ#   OBJECT_NAME
----------   ------------------------------
   2286779   WT_MX_201005
     32309   BD_DEALER_308
    126172   CDMA_FAVOUR_BAOYUE
     79675   DAISHENG_TMP
     52634   RYSJ_T_308
     25577   FEE_ID_T
    931167   A136_20080304
   1514084   WANGTONG_MOKUAI_2009
………………

10.导出用户下表

ODU> unload user TL_ADMIN
Unloading user TL_ADMIN's tables.
Unloading table: WT_MX_201005,object ID: 2286779
Unloading segment,storage(Obj#=2286779 DataObj#=2286779 TS#=3 File#=31 Block#=8571 Cluster=0)
25205 rows unloaded
Unloading table: BD_DEALER_308,object ID: 32309
Unloading segment,storage(Obj#=32309 DataObj#=32309 TS#=3 File#=31 Block#=17587 Cluster=0)
60 rows unloaded
Unloading table: CDMA_FAVOUR_BAOYUE,object ID: 126172
Unloading segment,storage(Obj#=126172 DataObj#=126172 TS#=3 File#=3 Block#=30899 Cluster=0)
31 rows unloaded
Unloading table: DAISHENG_TMP,object ID: 79675
Unloading segment,storage(Obj#=79675 DataObj#=342004 TS#=3 File#=31 Block#=19451 Cluster=0)
7504 rows unloaded
…………………………

11.查看导出来数据

-bash-2.05b$ ls -l
-rw-r--r--    1 oracle   dba           658  2月 15 08:43 TL_ADMIN_AMORTIZE.ctl
-rw-r--r--    1 oracle   dba           763  2月 15 08:43 TL_ADMIN_AMORTIZE_HM_COPY.ctl
-rw-r--r--    1 oracle   dba           683  2月 15 08:43 TL_ADMIN_AMORTIZE_HM_COPY.sql
-rw-r--r--    1 oracle   dba         45813  2月 15 08:43 TL_ADMIN_AMORTIZE_HM_COPY.txt
-rw-r--r--    1 oracle   dba           748  2月 15 08:43 TL_ADMIN_AMORTIZE_HM.ctl
-rw-r--r--    1 oracle   dba           686  2月 15 08:43 TL_ADMIN_AMORTIZE_HM.sql
-rw-r--r--    1 oracle   dba        263046  2月 15 08:43 TL_ADMIN_AMORTIZE_HM.txt
-rw-r--r--    1 oracle   dba           777  2月 15 08:43 TL_ADMIN_AMORTIZE_ID.ctl
-rw-r--r--    1 oracle   dba           696  2月 15 08:43 TL_ADMIN_AMORTIZE_ID.sql
……………………………………

12.创建表脚本

-bash-2.05b$ ls -l *.sql
-rw-r--r--    1 oracle   dba           312  2月 15 08:43 TL_ADMIN_A136_20080304.sql
-rw-r--r--    1 oracle   dba           683  2月 15 08:43 TL_ADMIN_AMORTIZE_HM_COPY.sql
-rw-r--r--    1 oracle   dba           686  2月 15 08:43 TL_ADMIN_AMORTIZE_HM.sql
-rw-r--r--    1 oracle   dba           696  2月 15 08:43 TL_ADMIN_AMORTIZE_ID.sql
-rw-r--r--    1 oracle   dba           147  2月 15 08:44 TL_ADMIN_AMORTIZE_OWE.sql
………………
复制这些文件列表,使用UltraEdit编辑文件(recover_create_tab.sql),生成如下脚本
@@TL_ADMIN_A136_20080304.sql
@@TL_ADMIN_AMORTIZE_HM_COPY.sql
@@TL_ADMIN_AMORTIZE_HM.sql
@@TL_ADMIN_AMORTIZE_ID.sql
@@TL_ADMIN_AMORTIZE_OWE.sql
@@TL_ADMIN_AMORTIZE_PAY.sql
然后使用sqlplus登录数据库,直接执行recover_create_tab.sql,创建了所有需要的表

13.加载数据脚本

-bash-2.05b$ ls -l *.ctl
-rw-r--r--    1 oracle   dba           507  2月 15 08:43 TL_ADMIN_A136_20080304.ctl
-rw-r--r--    1 oracle   dba           658  2月 15 08:43 TL_ADMIN_AMORTIZE.ctl
-rw-r--r--    1 oracle   dba           763  2月 15 08:43 TL_ADMIN_AMORTIZE_HM_COPY.ctl
-rw-r--r--    1 oracle   dba           748  2月 15 08:43 TL_ADMIN_AMORTIZE_HM.ctl
-rw-r--r--    1 oracle   dba           777  2月 15 08:43 TL_ADMIN_AMORTIZE_ID.ctl
-rw-r--r--    1 oracle   dba           344  2月 15 08:44 TL_ADMIN_AMORTIZE_OWE.ctl
-rw-r--r--    1 oracle   dba           342  2月 15 08:43 TL_ADMIN_AMORTIZE_PAY.ctl
也同样使用UltraEdit出来,生成文件(recover_load_data.sh)
sqlldr TL_ADMIN/password  control= TL_ADMIN_A136_20080304.ctl
sqlldr TL_ADMIN/password  control= TL_ADMIN_AMORTIZE.ctl
sqlldr TL_ADMIN/password  control= TL_ADMIN_AMORTIZE_HM_COPY.ctl
sqlldr TL_ADMIN/password  control= TL_ADMIN_AMORTIZE_HM.ctl
sqlldr TL_ADMIN/password  control= TL_ADMIN_AMORTIZE_ID.ctl
sqlldr TL_ADMIN/password  control= TL_ADMIN_AMORTIZE_OWE.ctl
sqlldr TL_ADMIN/password  control= TL_ADMIN_AMORTIZE_PAY.ctl
sqlldr TL_ADMIN/password  control= TL_ADMIN_BD_DEALER_308.ctl
sqlldr TL_ADMIN/password  control= TL_ADMIN_BRXZ.ctl
然后在shell中执行recover_load_data.sh脚本导入数据,至此数据恢复完成

注意:recover_create_tab.sql和recover_load_data.sh需要在当前抽取出来数据的目录中。

sqlplus版本不正确,导致ORA-12560

sqlplus登录数据库报ORA-12560

C:\Users\XIFENFEI>sqlplus / as sysdba
SQL*Plus: Release 11.2.0.2.0 Production on Tue Feb 14 23:33:31 2012
Copyright (c) 1982, 2010, Oracle.  All rights reserved.
ERROR:
ORA-12560: TNS:protocol adapter error
Enter user-name:
ERROR:
ORA-12560: TNS:protocol adapter error
Enter user-name:
ERROR:
ORA-12560: TNS:protocol adapter error
SP2-0157: unable to CONNECT to ORACLE after 3 attempts, exiting SQL*Plus
C:\Users\XIFENFEI>sqlplus -v
SQL*Plus: Release 11.2.0.2.0 Production

在这里发现sqlplus的版本信息不正确,我的数据库安装的11.2.0.1,这么怎么显示sqlplus是11.2.0.2,然后进入%ORACLE_HOME%/bin目录下面执行sqlplus,登录成功

C:\Users\XIFENFEI>cd e:\oracle\11_2_0\BIN
e:\oracle\11_2_0\BIN>sqlplus / as sysdba
SQL*Plus: Release 11.2.0.1.0 Production on 星期二 2月 14 23:44:38 2012
Copyright (c) 1982, 2010, Oracle.  All rights reserved.
连接到:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
With the Partitioning, Oracle Label Security, OLAP, Data Mining,
Oracle Database Vault and Real Application Testing options
SQL>

出现了这样的问题,考虑应该是环境变量设置不当导致,查看环境变量发现下面一部分

E:\oracle\timesten\bin;E:\oracle\timesten\ttoracle_home\instantclient_11_2;e:\oracle\11_2_0\bin;

问题的答案就是在这里,因为我的电脑是先安装Oracle,前几天安装了TimesTen,导致环境变量发生了改变,使得我在默认情况下使用sqlplus的时候,自动调用了TimesTen中的sqlplus导致

C:\Users\XIFENFEI>cd E:\oracle\timesten\ttoracle_home\instantclient_11_2
E:\oracle\timesten\ttoracle_home\instantclient_11_2>dir sqlplus*
 驱动器 E 中的卷没有标签。
 卷的序列号是 38D0-2A35
 E:\oracle\timesten\ttoracle_home\instantclient_11_2 的目录
2010/11/19  00:50           655,360 sqlplus.exe
2011/11/08  04:12            84,524 sqlplus.sym
2011/11/08  04:12               554 SQLPLUS_README
               3 个文件        740,438 字节
               0 个目录  9,825,832,960 可用字节

清除掉shared pool中某条sql语句方法

在论坛中看到一个帖子,如何清除掉shared pool中某条sql语句,如果是在10g以前的版本,那估计会比较麻烦,为了删除一条sql语句记录,需要清空整个shared pool,在10g中提供了新的dbms_shared_pool包可以实现该功能(如果该包没有安装,可以通过?/rdbms/admin/dbmspool.sql安装),下面我通过在11g中试验证明该问题
1.数据库版本信息

SQL> select * from v$version;
BANNER
----------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
PL/SQL Release 11.2.0.1.0 - Production
CORE    11.2.0.1.0      Production
TNS for 32-bit Windows: Version 11.2.0.1.0 - Production
NLSRTL Version 11.2.0.1.0 - Production

2.模拟一条sql语句

SQL> create table xff as
  2  select * from dba_tables
  3  where rownum<10;
表已创建。
SQL> select count(*) from xff;
  COUNT(*)
----------
         9
SQL> col sql_text for a30
SQL> SELECT ADDRESS,HASH_VALUE,SQL_TEXT FROM V$SQLAREA
  2  WHERE SQL_TEXT LIKE 'select % xff';
ADDRESS  HASH_VALUE SQL_TEXT
-------- ---------- ------------------------------
1EFB91B8 3642190903 select count(*) from xff

3.打破神话一:权限操作
对表进行权限操作,可以清除该表在shared pool中关于该表记录

SQL> grant select on xff to chf;
授权成功。
SQL> col sql_text for a30
SQL> SELECT ADDRESS,HASH_VALUE,SQL_TEXT FROM V$SQLAREA
  2  WHERE SQL_TEXT LIKE 'select % xff';
ADDRESS  HASH_VALUE SQL_TEXT
-------- ---------- ------------------------------
1EFB91B8 3642190903 select count(*) from xff
SQL> revoke select on xff from chf;
撤销成功。
SQL> col sql_text for a30
SQL> SELECT ADDRESS,HASH_VALUE,SQL_TEXT FROM V$SQLAREA
  2  WHERE SQL_TEXT LIKE 'select % xff';
ADDRESS  HASH_VALUE SQL_TEXT
-------- ---------- ------------------------------
1EFB91B8 3642190903 select count(*) from xff

证明grant和revoke操作没有清除shared pool中关于该表的sql语句

4.打破神话二:ddl操作
对表进行ddl操作,可以清除该表在shared pool中关于该表记录

SQL> alter table xff add  owner1 varchar2(100);
表已更改。
SQL> SELECT ADDRESS,HASH_VALUE,SQL_TEXT FROM V$SQLAREA
  2  WHERE SQL_TEXT LIKE 'select % xff';
ADDRESS  HASH_VALUE SQL_TEXT
-------- ---------- ------------------------------
1EFB91B8 3642190903 select count(*) from xff

事实证明ddl操作不能达到预期效果,没有清除掉这条sql语句

5.刷新shared pool

SQL> alter system flush shared_pool
  2  ;
系统已更改。
SQL> SELECT ADDRESS,HASH_VALUE,SQL_TEXT FROM V$SQLAREA
  2  WHERE SQL_TEXT LIKE 'select % xff';
未选定行

把整个shared pool都刷新了,自然其中的一条sql语句也没有了,在10g前只能通过这种方法实现

6.使用dbms_shared_pool.purge

SQL> select count(*) from xff;
  COUNT(*)
----------
         9
SQL> SELECT ADDRESS,HASH_VALUE,SQL_TEXT FROM V$SQLAREA
  2  WHERE SQL_TEXT LIKE 'select % xff';
ADDRESS  HASH_VALUE SQL_TEXT
-------- ---------- ------------------------------
1EFB91B8 3642190903 select count(*) from xff
SQL> exec dbms_shared_pool.purge('1EFB91B8, 3642190903','C');
PL/SQL 过程已成功完成。
SQL> SELECT ADDRESS,HASH_VALUE,SQL_TEXT FROM V$SQLAREA
  2  WHERE SQL_TEXT LIKE 'select % xff';
未选定行

试验证明在shared pool中清除了一条sql记录

7.关于dbms_shared_pool.purge参数说明

purge(name varchar2, flag char DEFAULT 'P', heaps number DEFAULT 1);
--    name
--      The name of the object to keep.  There are two kinds of objects:
--      PL/SQL objects, triggers, sequences, types and Java objects,
--      which are specified by name, and
--      SQL cursor objects which are specified by a two-part number
--      (indicating a location in the shared pool).  For example:
--        dbms_shared_pool.keep('scott.hispackage')
--      will keep package HISPACKAGE, owned by SCOTT.  The names for
--      PL/SQL objects follows SQL rules for naming objects (i.e.,
--      delimited identifiers, multi-byte names, etc. are allowed).
--      A cursor can be keeped by
--        dbms_shared_pool.keep('0034CDFF, 20348871', 'C')
--      The complete hexadecimal address must be in the first 8 characters.
--      The value for this identifier is the concatenation of the
--      'address' and 'hash_value' columns from the v$sqlarea view.  This
--      is displayed by the 'sizes' call above.
--      Currently 'TABLE' and 'VIEW' objects may not be keeped.
--
--    flag
--      This is an optional parameter.  If the parameter is not specified,
--        the package assumes that the first parameter is the name of a
--        package/procedure/function and will resolve the name.  Otherwise,
--        the parameter is a character string indicating what kind of object
--        to keep the name identifies.  The string is case insensitive.
--        The possible values and the kinds of objects they indicate are
--        given in the following table:
--          Value        Kind of Object to keep
--          -----      ----------------------
--	      P          package/procedure/function
--	      Q          sequence
--	      R          trigger
--	      T          type
--            JS         java source
--            JC         java class
--	      JR         java resource
--	      JD         java shared data
--	      C          cursor
--      If and only if the first argument is a cursor address and hash-value,
--        the flag parameter should be set to 'C' (or 'c').
--
--    heaps
--      heaps to purge. e.g if heap 0 and heap 6 are to be purged.
--      1<<0 | 1<<6 => hex 0x41 => decimal 65. so specify heaps=>65.
--      Default is 1 i.e heap 0 which means the whole object will be purged.

ORA-600[4194]/[4193]解决

朋友的库启动的时候出现ORA-600[4194]/[4193]错误

Tue Feb 14 09:34:11 2012
Errors in file d:\oracle\product\10.2.0\admin\interlib\bdump\interlib_smon_2784.trc:
ORA-01595: error freeing extent (2) of rollback segment (3))
ORA-00607: Internal error occurred while making a change to a data block
ORA-00600: internal error code, arguments: [4194], [6], [30], [], [], [], [], []
Tue Feb 14 09:35:34 2012
Errors in file d:\oracle\product\10.2.0\admin\interlib\udump\interlib_ora_2824.trc:
ORA-00603: ORACLE server session terminated by fatal error
ORA-00600: internal error code, arguments: [4193], [2005], [2008], [], [], [], [], []
ORA-00600: internal error code, arguments: [4193], [2005], [2008], [], [], [], [], []
Tue Feb 14 09:36:30 2012
DEBUG: Replaying xcb 0x1fa24174, pmd 0x1fba06d4 for failed op 8
Doing block recovery for file 2 block 177
No block recovery was needed
Tue Feb 14 09:37:30 2012
Errors in file d:\oracle\product\10.2.0\admin\interlib\bdump\interlib_pmon_2732.trc:
ORA-00600: internal error code, arguments: [4193], [2005], [2008], [], [], [], [], []
Tue Feb 14 09:37:31 2012
Errors in file d:\oracle\product\10.2.0\admin\interlib\bdump\interlib_pmon_2732.trc:
ORA-00600: internal error code, arguments: [4193], [2005], [2008], [], [], [], [], []

从这里可以看到出现了ORA-600[4194]/[4193],第一感觉就是undo出现问题。
4193:表示undo和redo不一致(Arg [a] Undo record seq number,Arg [b] Redo record seq number );
4194:表示也是undo和redo不一致(Arg [a] Maximum Undo record number in Undo block,Arg [b] Undo record number from Redo block)
至于为什么有时候会只出现其中一个,我不太清楚,求答案

直接设置了下面参数,数据库就意外的open成功,这位朋友比较幸运

undo_tablespace=SYSTEM
undo_management=MANUAL

既然库已经open,然后新建undo空间,删除出问题的undo,做如下修改,数据库恢复完成

undo_tablespace=新undo
undo_management=AUTO

如果出现极端的情况可能需要做如下处理:
1.使用_offline_rollback_segments和_corrupted_rollback_segments屏蔽掉有问题的undo segment
2.继续可能出现ora-600[2662],需要推进scn

Query Duration=0与ORA-01555

1.ALERT日志错误
奇怪之处:Query Duration=0 sec,竟然出现了ORA-01555

Tue Feb  7 02:41:34 2012
ORA-01555 caused by SQL statement below (Query Duration=0 sec, SCN: 0x0b2e.efcd78a9):
Tue Feb  7 02:41:34 2012
SELECT "ID_NO","CUST_ID" FROM "DBACCADM"."DCUSTMSG" "C" WHERE "ID_NO"=:1

2.ORA-01555解释
超过了undo_retention时间,undo被覆盖导致ORA-01555

[zwq_acc1:/home/oraeye/check]oerr ora 1555
01555, 00000, "snapshot too old: rollback segment number %s with name \"%s\" too small"
// *Cause: rollback records needed by a reader for consistent read are
//         overwritten by other writers
// *Action: If in Automatic Undo Management mode, increase undo_retention
//          setting. Otherwise, use larger rollback segments

3.数据库版本

SQL> select * from v$version;
BANNER
----------------------------------------------------------------
Oracle9i Enterprise Edition Release 9.2.0.8.0 - 64bit Production
PL/SQL Release 9.2.0.8.0 - Production
CORE    9.2.0.8.0       Production
TNS for IBM/AIX RISC System/6000: Version 9.2.0.8.0 - Production
NLSRTL Version 9.2.0.8.0 - Production

4.undo基本信息
从这里可以发现,两个节点的undo表空间还有很多剩余,缺发生了undo被覆盖从而出现了ORA-01555

SQL> col name for a20
SQL> col value for a15
SQL> SELECT INST_ID, NAME, VALUE
  2    FROM GV$PARAMETER
  3   WHERE UPPER (Name) LIKE '%UNDO%';
   INST_ID NAME                 VALUE
---------- -------------------- ---------------
         1 undo_management      AUTO
         1 undo_tablespace      UNDOTBS1
         1 undo_suppress_errors FALSE
         1 undo_retention       1800
         2 undo_management      AUTO
         2 undo_tablespace      UNDOTBS2
         2 undo_suppress_errors FALSE
         2 undo_retention       1800
8 rows selected.
TABLESPACE_NAME                CURRENT_TOTAL(MB)   USED(MB)   FREE(MB)      FREE% AUT MAX_TOTAL(MB)
------------------------------ ----------------- ---------- ---------- ---------- --- -------------
UNDOTBS1                                   40950    1587.94 39362.0625      96.12 NO          40950
UNDOTBS2                                   57330    1926.31 55403.6875      96.64 NO          57330
SQL> SELECT DISTINCT STATUS ,
  2                  COUNT(*) "EXTENT_NUM",
  3                  SUM(BYTES) / 1024 / 1024 / 1024 "UNDO(G)"
  4    FROM DBA_UNDO_EXTENTS
  5   GROUP BY STATUS;
STATUS    EXTENT_NUM    UNDO(G)
--------- ---------- ----------
ACTIVE           208 .273658752
EXPIRED         7651 2.42865753
UNEXPIRED        941 .752548218

查询MOS[ID 761128.1],发现可能是Oracle bug导致(BUG:6799685 – ORA-1555 ERROR WITH QUERY DURATION=0 AND UNDO_RETENTION=1800和BUG:5475085 – V$UNDOSTAT.EXPBLKREUCNT IS NEVER INCREMENTED)

5.解决方法
Increase the size of the UNDO tablespace and increase the UNDO_RETENTION parameter value to try to prevent required undo expiring too quickly.
基于本库,因为undo空间还有很大剩余,直接设置UNDO_RETENTION=3600即可(可以从一定程度上缓解整个问题,但是要从根本上解决整个问题,需要升级到10.2.0.4及其以上版本)

StatSpack报告中redo size为负数

在一份statspack报告中发现redo size 为负数

DB Name         DB Id    Instance     Inst Num Release     Cluster Host
------------ ----------- ------------ -------- ----------- ------- ------------
CRM           3413823439 crm2                2 9.2.0.8.0   YES     zwq_crm2
              Snap Id     Snap Time      Sessions Curs/Sess Comment
            --------- ------------------ -------- --------- -------------------
Begin Snap:     47654 05-Feb-12 11:00:04    2,301      20.0
  End Snap:     47655 05-Feb-12 12:00:02    2,298      20.3
   Elapsed:               59.97 (mins)
Cache Sizes (end)
~~~~~~~~~~~~~~~~~
               Buffer Cache:    36,448M      Std Block Size:          8K
           Shared Pool Size:    10,240M          Log Buffer:     20,480K
Load Profile
~~~~~~~~~~~~                            Per Second       Per Transaction
                                   ---------------       ---------------
  负数=====>      Redo size:           -359,121.01             -7,828.69
              Logical reads:            349,787.58              7,625.22
              Block changes:              5,656.13                123.30
             Physical reads:             12,521.51                272.96
            Physical writes:                427.44                  9.32
                 User calls:             22,223.63                484.47
                     Parses:              4,673.27                101.88
                Hard parses:                 46.78                  1.02
                      Sorts:              4,027.70                 87.80
                     Logons:                 10.57                  0.23
                   Executes:             10,480.35                228.47
               Transactions:                 45.87

一时之间感觉很奇怪,在运行的数据库redo size不可能为负数(只要数据库在运行redo size都应该大于0).查询MOS[ID 308414.1]发现原来是一个bug引起(Bug:1713403 NEGATIVE VALUE IN V$SESSTAT FOR STATISTIC REDO SIZE),sp中的redo size其本质还是来源于V$SESSTAT.

Applies to:
Oracle Server - Enterprise Edition - Version: 8.1.7.4 to 9.2.0.8
This problem can occur on any platform.
Symptoms
Redo Size is displayed as a Negative number in a Statspack report.
For example:
Load Profile
~~~~~~~~~~~~ Per Second Per Transaction
             --------------- ---------------
Redo size:        -17,931.33      -55,277.73
Logical reads:     31,095.80       95,860.43
...
Cause
Bug:1713403 NEGATIVE VALUE IN V$SESSTAT FOR STATISTIC REDO SIZE
Overflow of 'redo size' statistic. This is fixed in Oracle10g and above.
Patches do not exist for earlier releases.
Solution
Use 'redo blocks written' instead to measure the amount of redo.
Ignore negative 'redo size'.