1.关于heartbeat和checkpoint
在这次的删除过程中我没有执行commit,而是直接abort数据库。整个删除过程执行了近6分钟,控制文件的心跳每三秒进行一次,心跳是把low cache rba记录到了控制文件中,而没有真正的把全部的脏数据写入到磁盘( 只有发生了checkpoint时候,才会把相关的脏数据写入到磁盘,而这里的控制文件的heartbeat和checkpoint是两回事,checkpoint一般是在切换日志,数据文件正常离线,执行begin backup命令时发生,昨晚晚上后面的一个困惑就是上面的英文描述,让我把这两者搞混淆了)
2.三种rba解释
low rba :在buffer cache中的数据块第一次数据改变所对应的RAB。
high rba :在buffer cache中的数据块最近一次数据改变时所对应的RAB。
on-disk rba:是 lgwr 写日志文件的最末位置的地址。
3.实例恢复过程解释
实例恢复的时候,是从控制文件heartbeat记录的low rba开始读redo log数据(会多读取一点,因为heartbeat是每三秒执行一次,假设在2.9秒的时候,数据库异常down了,控制文件中记录的还是2.9秒前的low rba,这个时候,从该点开始读取redo),恢复到on-disk rba,而不是high rba(high rba一般情况下会大于on-disk rba,但是因为high rba比on-disk rba多的部分记录在redo log buffer中,在实例恢复的时候,因为其未被记录到redo log file中,所以不能被恢复,其实也没有必要恢复,因为该数据肯定是没有commit或者rollback)
具体内容和实验请见:RBA和实例恢复关系
ORA-600 kcratr_nab_less_than_odr故障解决
朋友的数据库服务器出现ORA-00600[kcratr_nab_less_than_odr],不能open数据库
1.open数据库报ORA-00600[kcratr_nab_less_than_odr]
SQL> ALTER DATABASE OPEN; ALTER DATABASE OPEN * 第 1 行出现错误: ORA-00600: 内部错误代码, 参数: [kcratr_nab_less_than_odr], [1], [99189], [43531], [43569], [], [], [], [], [], [], []
2.查看alert日志
Wed Jan 11 13:56:16 2012 ALTER DATABASE OPEN Beginning crash recovery of 1 threads parallel recovery started with 2 processes Started redo scan Completed redo scan read 54591 KB redo, 0 data blocks need recovery Errors in file d:\dbdms\diag\rdbms\dbdms\dbdms\trace\dbdms_ora_3108.trc (incident=818557): ORA-00600: 内部错误代码, 参数: [kcratr_nab_less_than_odr], [1], [99189], [43531], [43569], [], [], [], [], [], [], [] Incident details in: d:\dbdms\diag\rdbms\dbdms\dbdms\incident\incdir_818557\dbdms_ora_3936_i818557.trc Aborting crash recovery due to error 600 Errors in file d:\dbdms\diag\rdbms\dbdms\dbdms\trace\dbdms_ora_3108.trc: ORA-00600: 内部错误代码, 参数: [kcratr_nab_less_than_odr], [1], [99189], [43531], [43569], [], [], [], [], [], [], [] Errors in file d:\dbdms\diag\rdbms\dbdms\dbdms\trace\dbdms_ora_3108.trc: ORA-00600: 内部错误代码, 参数: [kcratr_nab_less_than_odr], [1], [99189], [43531], [43569], [], [], [], [], [], [], [] ORA-600 signalled during: ALTER DATABASE OPEN... Trace dumping is performing id=[cdmp_20120110214555]
3.查看trace文件
Trace file d:\dbdms\diag\rdbms\dbdms\dbdms\trace\dbdms_ora_3108.trc Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production With the Partitioning, OLAP, Data Mining and Real Application Testing options Windows NT Version V6.1 Service Pack 1 CPU : 2 - type 8664, 2 Physical Cores Process Affinity : 0x0x0000000000000000 Memory (Avail/Total): Ph:2250M/4060M, Ph+PgF:5868M/8119M Instance name: dbdms Redo thread mounted by this instance: 1 Oracle process number: 17 Windows thread id: 3108, image: ORACLE.EXE (SHAD) ………………………… WARNING! Crash recovery of thread 1 seq 99189 is ending at redo block 43531 but should not have ended before redo block 43569 Incident 826550 created, dump file: d:\dbdms\diag\rdbms\dbdms\dbdms\incident\incdir_826550\dbdms_ora_3108_i826550.trc ORA-00600: ??????, ??: [kcratr_nab_less_than_odr], [1], [99189], [43531], [43569], [], [], [], [], [], [], [] ORA-00600: ??????, ??: [kcratr_nab_less_than_odr], [1], [99189], [43531], [43569], [], [], [], [], [], [], [] ORA-00600: ??????, ??: [kcratr_nab_less_than_odr], [1], [99189], [43531], [43569], [], [], [], [], [], [], []
通过alert和trace中的内容可以知道,数据库需要恢复到rba到43569,但是因为某种原因实例恢复的时候,只能利用1 thread 99189 seq#,恢复rba到43531。从而导致数据库无法正常open
This Problem is caused by Storage Problem of the Database Files. The Subsystem (eg. SAN) crashed while the Database was open. The Database then crashed since the Database Files were not accessible anymore. This caused a lost Write into the Online RedoLogs and so Instance Recovery is not possible and raising the ORA-600.
4.解决方法
SQL> SELECT STATUS FROM V$INSTANCE; STATUS ------------ MOUNTED --尝试直接recover database SQL> RECOVER DATABASE ; ORA-00283: 恢复会话因错误而取消 ORA-00264: 不要求恢复 --提示不用恢复 --再打开数据库,还是kcratr_nab_less_than_odr错误警告 SQL> ALTER DATABASE OPEN; ALTER DATABASE OPEN * 第 1 行出现错误: ORA-00600: 内部错误代码, 参数: [kcratr_nab_less_than_odr], [1], [99189], [43531], [43569], [], [], [], [], [], [], [] --尝试不完全恢复 SQL> RECOVER DATABASE UNTIL CANCEL; ORA-10879: error signaled in parallel recovery slave ORA-01547: 警告: RECOVER 成功但 OPEN RESETLOGS 将出现如下错误 ORA-01152: 文件 1 没有从过旧的备份中还原 ORA-01110: 数据文件 1: 'D:\DBDMS\DATA\SYSTEM01.DBF' --重建控制文件 SQL> ALTER DATABASE BACKUP CONTROLFILE TO TRACE AS 'D:/1.TXT'; 数据库已更改。 SQL> SHUTDOWN IMMEDIATE; ORA-01109: 数据库未打开 已经卸载数据库。 ORACLE 例程已经关闭。 SQL> STARTUP NOMOUNT; ORACLE 例程已经启动。 Total System Global Area 417546240 bytes Fixed Size 2176328 bytes Variable Size 268438200 bytes Database Buffers 138412032 bytes Redo Buffers 8519680 bytes SQL> CREATE CONTROLFILE REUSE DATABASE "DBDMS" NORESETLOGS NOARCHIVELOG 2 MAXLOGFILES 16 3 MAXLOGMEMBERS 3 4 MAXDATAFILES 100 5 MAXINSTANCES 8 6 MAXLOGHISTORY 18688 7 LOGFILE 8 GROUP 1 'D:\DBDMS\LOG\REDO01.LOG' SIZE 50M BLOCKSIZE 512, 9 GROUP 2 'D:\DBDMS\LOG\REDO02.LOG' SIZE 50M BLOCKSIZE 512, 10 GROUP 3 'D:\DBDMS\LOG\REDO03.LOG' SIZE 50M BLOCKSIZE 512 11 DATAFILE 12 'D:\DBDMS\DATA\SYSTEM01.DBF', 13 'D:\DBDMS\DATA\SYSAUX01.DBF', 14 'D:\DBDMS\DATA\RBSG01.DBF', 15 'D:\DBDMS\DATA\DATA01.DBF', 16 'D:\DBDMS\DATA\INDX01.DBF', 17 'D:\DBDMS\DATA\DATA02.DBF', 18 'D:\DBDMS\DATA\DATA03.DBF', 19 'D:\DBDMS\DATA\DATA04.DBF', 20 'D:\DBDMS\DATA\INDX02.DBF', 21 'D:\DBDMS\DATA\SYSTEM02.DBF' 22 CHARACTER SET ZHS16GBK 23 ; 控制文件已创建。 --继续尝试恢复 SQL> RECOVER DATABASE ; 完成介质恢复。 SQL> ALTER DATABASE OPEN; 数据库已更改。 --open成功
在这次恢复中,主要就是重建控制文件,然后直接恢复成功,如果redo有损坏,那么可能需要使用不完全恢复,然后使用resetlogs打开数据库
CAST本质探讨
1.试验前提
SQL> select * from v$version;
BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
PL/SQL Release 11.2.0.3.0 - Production
CORE 11.2.0.3.0 Production
TNS for Linux: Version 11.2.0.3.0 - Production
NLSRTL Version 11.2.0.3.0 - Production
SQL> select FORCE_LOGGING,SUPPLEMENTAL_LOG_DATA_PK,SUPPLEMENTAL_LOG_DATA_UI,
2 SUPPLEMENTAL_LOG_DATA_MIN from v$database;
FOR SUP SUP SUPPLEME
--- --- --- --------
NO NO NO NO
SQL> conn / as sysdba
Connected.
SQL> archive log list;
Database log mode Archive Mode
Automatic archival Enabled
Archive destination /opt/oracle/oradata/archivelog/chf
Oldest online log sequence 1879
Next log sequence to archive 1881
Current log sequence 1881
SQL> select count(*) from t_xifenfei_move;
COUNT(*)
----------
7432085
SQL> select bytes/1024/1024 from user_segments where segment_name='T_XIFENFEI_MOVE';
BYTES/1024/1024
---------------
832
从上面信息可以看到数据库处于归档模式,强制日志和辅助日志为开启,试验测试表t_xifenfei_move有7432085条记录,占用硬盘空间832M
2.常规CAST
SQL> alter system flush buffer_cache;
System altered.
SQL> alter system flush shared_pool;
System altered.
SQL> SET TIMING ON
SQL> select a.name, b.value
2 from v$statname a, v$mystat b
3 where a.statistic# = b.statistic#
4 and lower(a.name) in
5 ('redo size','undo change vector size');
NAME VALUE
---------------------------------------------------------------- ----------
redo size 100028
undo change vector size 16172
Elapsed: 00:00:00.06
SQL> create table chf.t_xifenfei_move_CAST tablespace users
2 as
3 select * from chf.t_xifenfei_move;
Table created.
Elapsed: 00:01:58.10
SQL> select a.name, b.value
2 from v$statname a, v$mystat b
3 where a.statistic# = b.statistic#
4 and lower(a.name) in
5 ('redo size','undo change vector size');
NAME VALUE
---------------------------------------------------------------- ----------
redo size 873150548
undo change vector size 131384
Elapsed: 00:00:00.01
SQL> select 873150548-100028 "redo size" from dual;
redo size
----------
873050520
SQL> select 131384-16172 "undo size" from dual;
undo size
----------
115212
通过这个可以得出结论,产生redo为873050520,undo为115212
3.普通INSERT
SQL> alter system flush buffer_cache;
System altered.
Elapsed: 00:00:00.06
SQL> alter system flush shared_pool;
System altered.
Elapsed: 00:00:00.01
SQL> select a.name, b.value
2 from v$statname a, v$mystat b
3 where a.statistic# = b.statistic#
4 and lower(a.name) in
5 ('redo size','undo change vector size');
NAME VALUE
---------------------------------------------------------------- ----------
redo size 732
undo change vector size 136
Elapsed: 00:00:00.18
SQL> CREATE TABLE t_xifenfei_move_INSERT
2 AS
3 SELECT * FROM T_XIFENFEI_MOVE WHERE 1=0;
Table created.
Elapsed: 00:00:00.32
SQL> select a.name, b.value
2 from v$statname a, v$mystat b
3 where a.statistic# = b.statistic#
4 and lower(a.name) in
5 ('redo size','undo change vector size');
NAME VALUE
---------------------------------------------------------------- ----------
redo size 22712
undo change vector size 6512
Elapsed: 00:00:00.02
SQL> INSERT INTO t_xifenfei_move_INSERT
2 SELECT * FROM T_XIFENFEI_MOVE;
7432085 rows created.
Elapsed: 00:01:59.47
SQL> select a.name, b.value
2 from v$statname a, v$mystat b
3 where a.statistic# = b.statistic#
4 and lower(a.name) in
5 ('redo size','undo change vector size');
NAME VALUE
---------------------------------------------------------------- ----------
redo size 862261580
undo change vector size 27980508
--redo
SQL> select 22712-732 "create redo size" from dual;
create redo size
----------------
21980
SQL> select 862261580-22712 "insert redo size" from dual;
insert redo size
----------------
862238868
--undo
SQL> select 6512-136 "create undo size" from dual;
create undo size
----------------
6376
SQL> select 27980508-6512 "insert undo size" from dual;
insert undo size
----------------
27973996
通过这个可以得出CREATE TABLE 过程中产生redo:21980,undo:6376,而INSERT 过程中产生的redo:862238868,undo:27973996,整个过程总的产生redo:862260848(862238868+21980),undo:27980372(27973996+6376)
4.INSERT+APPEND
SQL> alter system flush buffer_cache;
System altered.
Elapsed: 00:00:25.19
SQL> alter system flush shared_pool;
System altered.
Elapsed: 00:00:00.04
SQL> select a.name, b.value
2 from v$statname a, v$mystat b
3 where a.statistic# = b.statistic#
4 and lower(a.name) in
5 ('redo size','undo change vector size');
NAME VALUE
---------------------------------------------------------------- ----------
redo size 732
undo change vector size 136
Elapsed: 00:00:00.05
SQL> create table chf.t_xifenfei_move_INSERT_A tablespace users
2 as
3 select * from chf.t_xifenfei_move where 1=0;
Table created.
Elapsed: 00:00:00.18
SQL> select a.name, b.value
2 from v$statname a, v$mystat b
3 where a.statistic# = b.statistic#
4 and lower(a.name) in
5 ('redo size','undo change vector size');
NAME VALUE
---------------------------------------------------------------- ----------
redo size 21892
undo change vector size 6308
Elapsed: 00:00:00.00
SQL> INSERT /*+ append */INTO t_xifenfei_move_INSERT_A
2 SELECT * FROM T_XIFENFEI_MOVE;
7432085 rows created.
Elapsed: 00:02:26.37
SQL> select a.name, b.value
2 from v$statname a, v$mystat b
3 where a.statistic# = b.statistic#
4 and lower(a.name) in
5 ('redo size','undo change vector size');
NAME VALUE
---------------------------------------------------------------- ----------
redo size 872792032
undo change vector size 87764
--redo
SQL> select 21892-732 "create redo size" from dual;
create redo size
----------------
21160
SQL> select 872792032-21892 "insert redo size" from dual;
insert redo size
----------------
872770140
--undo
SQL> select 6308-136 "create undo size" from dual;
create undo size
----------------
6172
SQL> select 87764-6308 "insert undo size" from dual;
insert undo size
----------------
81456
这个过程可以得到结论,create 表的过程产生的redo:21160,undo:6172;insert 表的过程redo:872770140,undo:81456;整个过程产生的redo:872791300(21160+872770140),undo:87628(6172+81456)
5.INSERT+NOLOGGING
SQL> alter system flush buffer_cache;
System altered.
Elapsed: 00:00:02.21
SQL> alter system flush shared_pool;
System altered.
Elapsed: 00:00:00.06
SQL> select a.name, b.value
2 from v$statname a, v$mystat b
3 where a.statistic# = b.statistic#
4 and lower(a.name) in
5 ('redo size','undo change vector size');
NAME VALUE
---------------------------------------------------------------- ----------
redo size 780
undo change vector size 136
Elapsed: 00:00:00.06
SQL> create table chf.t_xifenfei_move_INSERT_N tablespace users
2 as
3 select * from chf.t_xifenfei_move where 1=0;
Table created.
Elapsed: 00:00:00.22
SQL> select a.name, b.value
2 from v$statname a, v$mystat b
3 where a.statistic# = b.statistic#
4 and lower(a.name) in
5 ('redo size','undo change vector size');
NAME VALUE
---------------------------------------------------------------- ----------
redo size 22056
undo change vector size 6308
Elapsed: 00:00:00.00
SQL> INSERT /*+ NOLOGGING */INTO t_xifenfei_move_INSERT_N
2 SELECT * FROM T_XIFENFEI_MOVE;
7432085 rows created.
Elapsed: 00:02:30.33
SQL> select a.name, b.value
2 from v$statname a, v$mystat b
3 where a.statistic# = b.statistic#
4 and lower(a.name) in
5 ('redo size','undo change vector size');
NAME VALUE
---------------------------------------------------------------- ----------
redo size 862123984
undo change vector size 27982876
--redo
SQL> select 22056-780 "create redo size" from dual;
create redo size
----------------
21276
SQL> select 862123984-22056 "insert redo size" from dual;
insert redo size
----------------
862101928
--undo
SQL> select 6308-136 "create undo size" from dual;
create undo size
----------------
6172
SQL> select 27982876-6308 "insert undo size" from dual;
insert undo size
----------------
27976568
这个试验过程得出,create table得到redo:21276,undo:6172,insert table 得到redo:862101928,undo:27976568,整个过程redo:862123204,undo:27982740(27976568+6172)
6.INSERT+NOLOGGING(TABLE)
SQL> alter system flush buffer_cache;
System altered.
Elapsed: 00:00:23.68
SQL> alter system flush shared_pool;
System altered.
Elapsed: 00:00:00.04
SQL> select a.name, b.value
2 from v$statname a, v$mystat b
3 where a.statistic# = b.statistic#
4 and lower(a.name) in
5 ('redo size','undo change vector size');
NAME VALUE
---------------------------------------------------------------- ----------
redo size 800
undo change vector size 136
Elapsed: 00:00:00.18
SQL> create table chf.t_xifenfei_move_INSERT_N_new nologging tablespace users
2 as
3 select * from chf.t_xifenfei_move where 1=0;
Table created.
Elapsed: 00:00:00.71
SQL> select a.name, b.value
2 from v$statname a, v$mystat b
3 where a.statistic# = b.statistic#
4 and lower(a.name) in
5 ('redo size','undo change vector size');
NAME VALUE
---------------------------------------------------------------- ----------
redo size 23296
undo change vector size 6756
Elapsed: 00:00:00.00
SQL> INSERT INTO t_xifenfei_move_INSERT_N_new
2 SELECT * FROM T_XIFENFEI_MOVE;
7432085 rows created.
Elapsed: 00:02:37.51
SQL> select a.name, b.value
2 from v$statname a, v$mystat b
3 where a.statistic# = b.statistic#
4 and lower(a.name) in
5 ('redo size','undo change vector size');
NAME VALUE
---------------------------------------------------------------- ----------
redo size 862031304
undo change vector size 27982776
SQL> select 23296-800 "create redo size" from dual;
create redo size
----------------
22496
SQL> select 862031304-23296 "insert redo size" from dual;
insert redo size
----------------
862008008
SQL> select 6756-136 "create undo size" from dual;
create undo size
----------------
6620
SQL> select 27982776-6756 "insert undo size" from dual;
insert undo size
----------------
27976020
通过该试验得出,create table 产生redo:22496,undo:6620;insert into 产生redo:862008008,undo:27976020;整个过程产生redo:862030504(22496+862008008),undo:27982640(6620+27976020)
7.APPEND+NOLOGGING(TABLE)
SQL> alter system flush buffer_cache;
System altered.
Elapsed: 00:00:23.59
SQL> alter system flush shared_pool;
System altered.
Elapsed: 00:00:00.05
SQL> select a.name, b.value
2 from v$statname a, v$mystat b
3 where a.statistic# = b.statistic#
4 and lower(a.name) in
5 ('redo size','undo change vector size');
NAME VALUE
---------------------------------------------------------------- ----------
redo size 780
undo change vector size 136
Elapsed: 00:00:00.04
SQL> create table chf.t_xifenfei_move_INSERT_NA nologging tablespace users
2 as
3 select * from chf.t_xifenfei_move where 1=0;
Table created.
Elapsed: 00:00:00.42
SQL> select a.name, b.value
2 from v$statname a, v$mystat b
3 where a.statistic# = b.statistic#
4 and lower(a.name) in
5 ('redo size','undo change vector size');
NAME VALUE
---------------------------------------------------------------- ----------
redo size 22000
undo change vector size 6308
Elapsed: 00:00:00.00
SQL> INSERT /*+APPEND */ INTO t_xifenfei_move_INSERT_NA
2 SELECT * FROM T_XIFENFEI_MOVE;
7432085 rows created.
Elapsed: 00:01:08.92
SQL> select a.name, b.value
2 from v$statname a, v$mystat b
3 where a.statistic# = b.statistic#
4 and lower(a.name) in
5 ('redo size','undo change vector size');
NAME VALUE
---------------------------------------------------------------- ----------
redo size 602352
undo change vector size 82756
--redo
SQL> select 22000-780 "create redo size" from dual;
create redo size
----------------
21220
SQL> select 602352-22000 "insert redo size" from dual;
insert redo size
----------------
580352
--undo
SQL> select 6308-136 "create undo size" from dual;
create undo size
----------------
6172
SQL> select 82756-6308 "insert undo size" from dual;
insert undo size
----------------
76448
在这个试验中,create table产生redo:21220,undo:6172,insert into产生redo:580352,undo:76448;整个过程产生的redo:601572(
21220+580352),undo:82620(6172+76448)
8.CAST+NOLOGGING(TABLE)
SQL> alter system flush buffer_cache;
System altered.
Elapsed: 00:00:03.35
SQL> alter system flush shared_pool;
System altered.
Elapsed: 00:00:00.04
SQL>
SQL> select a.name, b.value
2 from v$statname a, v$mystat b
3 where a.statistic# = b.statistic#
4 and lower(a.name) in
5 ('redo size','undo change vector size');
NAME VALUE
---------------------------------------------------------------- ----------
redo size 732
undo change vector size 136
Elapsed: 00:00:00.05
SQL> create table chf.t_xifenfei_move_cast_N nologging tablespace users
2 as
3 select * from chf.t_xifenfei_move;
Table created.
Elapsed: 00:00:56.41
SQL> select a.name, b.value
2 from v$statname a, v$mystat b
3 where a.statistic# = b.statistic#
4 and lower(a.name) in
5 ('redo size','undo change vector size');
NAME VALUE
---------------------------------------------------------------- ----------
redo size 769848
undo change vector size 124944
SQL> select 769848-732 "redo size" from dual;
redo size
----------
769116
SQL> select 124944-136 "undo size" from dual;
undo size
----------
124808
这个试验产生的redo:769116,undo:124808
9.通过试验得出结论
1)sql hint中的nologgging无效
2)普通的cast(不含hint),其本质是append,无nologgging
3)nologgging(表级别)可以使得cast效率较高
4)nologgging(表级别)+append(hint)可以使得insert效率较高
MOVE和CAST比较(续)
本篇文章是MOVE和CAST比较的续篇,主要是对于第一篇中没有涉及到的redo和undo的情况加以叙述
1.查询move产生redo和undo量
SQL> alter system flush buffer_cache;
System altered.
SQL> alter system flush shared_pool;
System altered.
SQL> select a.name, b.value
2 from v$statname a, v$mystat b
3 where a.statistic# = b.statistic#
4 and lower(a.name) in
5 ('redo size','undo change vector size');
NAME VALUE
---------------------------------------- ----------
redo size 844
undo change vector size 136
SQL> ALTER TABLE CHF.T_XIFENFEI_MOVE MOVE TABLESPACE TEST_OCP;
Table altered.
SQL> select a.name, b.value
2 from v$statname a, v$mystat b
3 where a.statistic# = b.statistic#
4 and lower(a.name) in
5 ('redo size','undo change vector size');
NAME VALUE
---------------------------------------- ----------
redo size 873074928
undo change vector size 110748
--产生redo
SQL> select 873074928-844 "redo size" from dual;
redo size
----------
873074084
--产生undo
SQL> select 110748-136 "undo size" from dual;
undo size
----------
110612
2.查询cast产生redo和undo 大小
SQL> alter system flush buffer_cache;
System altered.
SQL> alter system flush shared_pool;
System altered.
SQL> select a.name, b.value
2 from v$statname a, v$mystat b
3 where a.statistic# = b.statistic#
and lower(a.name) in
4 5 ('redo size','undo change vector size');
NAME VALUE
---------------------------------------------------------------- ----------
redo size 776
undo change vector size 136
SQL> create table chf.t_xifenfei_move_new tablespace users
2 as
3 select * from chf.t_xifenfei_move;
Table created.
SQL> select a.name, b.value
2 from v$statname a, v$mystat b
3 where a.statistic# = b.statistic#
4 and lower(a.name) in
5 ('redo size','undo change vector size');
NAME VALUE
---------------------------------------------------------------- ----------
redo size 873017580
undo change vector size 115340
--产生redo
SQL> select 873017580-776 "redo size" from dual;
redo size
----------
873016804
--产生undo
SQL> select 115340-136 "undo size" from dual;
undo size
----------
115204
3.两次实验比较
--redo(分母使用cast操作产生redo)
SQL> select 873074084-873016804 "redo" from dual;
redo
----------
57280
SQL> select 57280/873074084 from dual;
57280/873074084
---------------
.000065607
--undo(分母使用cast操作产生undo)
SQL> select 110612-115204 undo from dual;
undo
-------------
-4592
SQL> select 4592/115204 from dual;
4592/115204
-----------
.039859727
通过这两个的比较可以知道move操作产生的redo多了万分之七不到,undo少了百分之四,这些也是在实验允许的误差范围内,再说move操作还包括了cast的一些后续步骤在其中,所以通过这个验证和上一篇试验(MOVE和CAST比较),基本上可以大胆操作move操作的本质就是全表扫描+append插入数据,操作过程中产生的redo大小几乎和表本身大小(872415232)相等,这个证明,move和cast都是以logging模式运行(数据库本身是非force logging模式)
MOVE和CAST比较
1.创建模拟表
SQL> select * from v$version;
BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
PL/SQL Release 11.2.0.3.0 - Production
CORE 11.2.0.3.0 Production
TNS for Linux: Version 11.2.0.3.0 - Production
NLSRTL Version 11.2.0.3.0 - Production
SQL> create table chf.t_xifenfei_move
2 as
3 select * from dba_objects;
Table created.
SQL> select count(*) from chf.t_xifenfei_move;
COUNT(*)
----------
73585
SQL> create table chf.t_xifenfei_move
2 as
3 select * from dba_objects;
Table created.
SQL> select count(*) from chf.t_xifenfei_move;
COUNT(*)
----------
73585
SQL> DECLARE
2 i NUMBER;
3 BEGIN
4 FOR i IN 1..100 LOOP
5 INSERT INTO chf.t_xifenfei_move
6 select * from dba_objects;
7 END LOOP;
8 COMMIT;
9 END;
10 /
PL/SQL procedure successfully completed.
SQL> exec dbms_stats.gather_table_stats('CHF','T_XIFENFEI_MOVE');
PL/SQL procedure successfully completed.
SQL> select bytes from dba_segments where segment_name='T_XIFENFEI_MOVE';
BYTES
----------
872415232
2.测试move
2.1)执行move操作,记录时间
SQL> alter system flush buffer_cache; System altered. SQL> SET TIMING ON; SQL> alter session set events 2 '10046 trace name context forever,level 1'; Session altered. Elapsed: 00:00:00.00 SQL> ALTER TABLE CHF.T_XIFENFEI_MOVE MOVE TABLESPACE USERS; Table altered. Elapsed: 00:02:11.77 SQL> alter session set events 2 '10046 trace name context off'; Session altered. Elapsed: 00:00:00.04 SQL> select d.value||'/'||lower(rtrim(i.instance,chr(0)))||'_ora_'||p.spid||'.trc' trace_file_name from 2 (select p.spid from v$mystat m, v$session s,v$process p where m.statistic# = 1 and s.sid = m.sid and p.addr = s.paddr ) p, 3 (select t.instance from v$thread t,v$parameter v where v.name = 'thread' and(v.value = 0 or t.thread# = to_number(v.value))) i, 4 (select value from v$parameter where name = 'user_dump_dest') d; TRACE_FILE_NAME -------------------------------------------------------------------------------- /opt/oracle/diag/rdbms/chf/chf/trace/chf_ora_4765.trc
从这里可以看出,move操作执行了00:02:11.77
2.2)查看trace内容
[oracle@node1 ~]$ tkprof /opt/oracle/diag/rdbms/chf/chf/trace/chf_ora_4765.trc /tmp/xifenfei_move.txt
TKPROF: Release 11.2.0.3.0 - Development on Tue Jan 10 10:57:59 2012
Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.
********************************************************************************
SQL ID: c1yk5pv0v1wg1 Plan Hash: 2931676921
ALTER TABLE CHF.T_XIFENFEI_MOVE MOVE TABLESPACE USERS
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.05 7 8 0 0
Execute 1 11.29 131.23 105584 106275 115654 7432085
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 2 11.30 131.29 105591 106283 115654 7432085
Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: SYS
Number of plan statistics captured: 1
Rows (1st) Rows (avg) Rows (max) Row Source Operation
---------- ---------- ---------- ---------------------------------------------------
0 0 0 LOAD AS SELECT (cr=117799 pr=105602 pw=105585 time=131351005 us)
7432085 7432085 7432085 TABLE ACCESS FULL T_XIFENFEI_MOVE (cr=105591 pr=105586 pw=0 time=4735560 us cost=23453 size=720912245 card=7432085)
********************************************************************************
从这里可以看出执行move,其实本质是全表扫描表,然后append方式插入数据,而不是真的数据块拷贝
3.测试CAST
3.1).CAST插入数据过程
SQL> alter system flush buffer_cache;
System altered.
SQL> SET TIMING ON;
SQL> alter session set events
2 ‘10046 trace name context forever,level 1’;
Session altered.
Elapsed: 00:00:00.01
SQL> create table chf.t_xifenfei_move_new tablespace test_ocp
2 as
3 select * from chf.t_xifenfei_move;
Table created.
Elapsed: 00:01:59.22
SQL> alter session set events
2 ‘10046 trace name context off’;
Session altered.
Elapsed: 00:00:00.00
SQL> select d.value||’/’||lower(rtrim(i.instance,chr(0)))||’_ora_’||p.spid||’.trc’ trace_file_name from
2 (select p.spid from v$mystat m, v$session s,v$process p where m.statistic# = 1 and s.sid = m.sid and p.addr = s.paddr ) p,
3 (select t.instance from v$thread t,v$parameter v where v.name = ‘thread’ and(v.value = 0 or t.thread# = to_number(v.value))) i,
4 (select value from v$parameter where name = ‘user_dump_dest’) d;
TRACE_FILE_NAME
——————————————————————————–
/opt/oracle/diag/rdbms/chf/chf/trace/chf_ora_5121.trc
从这里看出cast操作用时:00:01:59.22,比move稍微少,但是cast要实现move完全的功能,还需要表重命名,表授权,编译无效对象等。
3.2)查看trace内容
[oracle@node1 ~]$ tkprof /opt/oracle/diag/rdbms/chf/chf/trace/chf_ora_5121.trc /tmp/xifenfei_create.txt
TKPROF: Release 11.2.0.3.0 - Development on Tue Jan 10 11:08:19 2012
Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.
********************************************************************************
create table chf.t_xifenfei_move_new tablespace test_ocp
as
select * from chf.t_xifenfei_move
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.02 1 4 0 0
Execute 1 9.85 118.37 105587 106097 112387 7432085
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 2 9.85 118.40 105588 106101 112387 7432085
Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: SYS
Number of plan statistics captured: 1
Rows (1st) Rows (avg) Rows (max) Row Source Operation
---------- ---------- ---------- ---------------------------------------------------
0 0 0 LOAD AS SELECT (cr=106631 pr=105592 pw=105585 time=118338607 us)
7432085 7432085 7432085 TABLE ACCESS FULL T_XIFENFEI_MOVE (cr=105591 pr=105586 pw=0 time=2935008 us cost=23453 size=720912245 card=7432085)
********************************************************************************
通过这个可以看出,CAST其实本质也是全表扫描,然后append方式插入数据
4.比较move和cast
4.1)通过比较执行时间,cast稍微少,但是还有后续操作需要时间
4.2)通过比较执行计划,两者是一样的
4.3)move操作在整个过程中都会锁表,而cast不会锁住原表(select+where可以减少停业务时间)
4.4)move操作会一次性处理好权限,plsql/view等有效,而cast在rename之后,相关对象可能需要重新编译,重新授权等操作
4.5)cast操作index需要新建(create),而move操作index需要重建(rebuild)
4.6)cast完成后,需要对表重命名,删除原表操作操作,而这个操作move不用
5.选择使用谁
5.1)如果停业务时间够长,建议使用move操作
5.2)如果停业务时间不能太长,可以使用cast+where实现
5.3)如果数据库版本>=10g,且表空间使用local管理,那么可以考虑在不停业务的情况下使用shrink实现类此功能
至于MOVE和CAST在执行过程中,关于产生的redo和undo的比较,请见下篇:MOVE和CAST比较(续)
创建含sysdate的函数index
1.模拟环境
创建表插入数据库
[oracle@node1 ~]$ sqlplus chf/xifenfei
SQL*Plus: Release 11.2.0.3.0 Production on Mon Jan 9 16:27:19 2012
Copyright (c) 1982, 2011, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, Oracle Label Security, OLAP, Data Mining,
Oracle Database Vault and Real Application Testing options
SQL> create table t_xifenfei(id number,intime date);
Table created.
SQL> DECLARE
2 i NUMBER;
3 BEGIN
4 FOR i IN 1..1000 LOOP
5 INSERT INTO t_xifenfei VALUES(i,SYSDATE-i);
6 END LOOP;
7 COMMIT;
8 END;
9 /
PL/SQL procedure successfully completed.
SQL> select count(*) from t_xifenfei;
COUNT(*)
----------
1000
SQL> exec dbms_stats.gather_table_stats(USER,'T_XIFENFEI',cascade => TRUE);
PL/SQL procedure successfully completed.
2.无index查询
SQL> set autot trace exp stat
Execution Plan
----------------------------------------------------------
Plan hash value: 548923532
--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 10 | 120 | 3 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| T_XIFENFEI | 10 | 120 | 3 (0)| 00:00:01 |
--------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(NVL("INTIME",SYSDATE@!)>=TO_DATE(' 2011-12-31 00:00:00',
'syyyy-mm-dd hh24:mi:ss'))
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
7 consistent gets
0 physical reads
0 redo size
770 bytes sent via SQL*Net to client
519 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
9 rows processed
SQL> set autot off
这里只是做了一个简单的查询,因为这个nvl(intime,sysdate)的条件,无法使用正常的index,所以没有建立intime索引的测试。
3.尝试创建index
SQL> create index in_t_xifenfei on t_xifenfei (nvl(intime,sysdate)) online nologging;
create index in_t_xifenfei on t_xifenfei (nvl(intime,sysdate)) online nologging
*
ERROR at line 1:
ORA-01743: only pure functions can be indexed
SQL> !oerr ora 1743
01743, 00000, "only pure functions can be indexed"
// *Cause: The indexed function uses SYSDATE or the user environment.
// *Action: PL/SQL functions must be pure (RNDS, RNPS, WNDS, WNPS). SQL
// expressions must not use SYSDATE, USER, USERENV(), or anything
// else dependent on the session state. NLS-dependent functions
// are OK.
--因为含有sysdate创建函数index失败
SQL> CREATE OR REPLACE FUNCTION f_xifenfei (itime DATE)
2 RETURN DATE
3 IS
4 otime DATE;
5 BEGIN
6 otime:=NVL(itime,SYSDATE);
7 RETURN otime;
8 END;
9 /
Function created.
--想采用自定义函数屏蔽掉sysdate在创建index时候的影响
SQL> create index in_t_xifenfei on t_xifenfei (f_xifenfei(intime)) online nologging;
create index in_t_xifenfei on t_xifenfei (f_xifenfei(intime)) online nologging
*
ERROR at line 1:
ORA-30553: The function is not deterministic
SQL> !oerr ora 30553
30553, 00000, "The function is not deterministic"
// *Cause: The function on which the index is defined is not deterministic
// *Action: If the function is deterministic, mark it DETERMINISTIC. If it
// is not deterministic (it depends on package state, database state,
// current time, or anything other than the function inputs) then
// do not create the index. The values returned by a deterministic
// function should not change even when the function is rewritten or
// recompiled.
--因为函数缺少deterministic不能使用于index上
SQL> CREATE OR REPLACE FUNCTION f_xifenfei (itime DATE)
2 RETURN DATE deterministic
3 IS
4 otime DATE;
5 BEGIN
6 otime:=NVL(itime,SYSDATE);
7 RETURN otime;
8 END;
9 /
Function created.
SQL> create index in_t_xifenfei on t_xifenfei (f_xifenfei(intime)) online nologging;
Index created.
--创建函数index成功
SQL> exec dbms_stats.gather_table_stats(USER,'T_XIFENFEI',cascade => TRUE);
PL/SQL procedure successfully completed.
4.再次查询
确定已经使用函数index,达到在index中使用sysdate函数index的目的。
SQL> set autot on exp stat
SQL> select * from t_xifenfei where f_xifenfei(intime)>=to_date('2011-12-31','yyyy-mm-dd');
Execution Plan
----------------------------------------------------------
Plan hash value: 2005404611
---------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 10 | 200 | 3 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| T_XIFENFEI | 10 | 200 | 3 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | IN_T_XIFENFEI | 10 | | 2 (0)| 00:00:01 |
---------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("CHF"."F_XIFENFEI"("INTIME")>=TO_DATE(' 2011-12-31 00:00:00',
'syyyy-mm-dd hh24:mi:ss'))
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
5 consistent gets
0 physical reads
0 redo size
770 bytes sent via SQL*Net to client
519 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
9 rows processed
5.总结说明
5.1)通过函数屏蔽函数index的时候,不能使用sysdate
5.2)在创建函数时,需要指定deterministic关键字
DBA_HIST_TBSPC_SPACE_USAGE查询undo表空间异常BUG
1.数据库版本
SQL> select * from v$version; BANNER ---------------------------------------------------------------- Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bi PL/SQL Release 10.2.0.5.0 - Production CORE 10.2.0.5.0 Production TNS for Linux: Version 10.2.0.5.0 - Production NLSRTL Version 10.2.0.5.0 - Production
2.查询DBA_DATA_FILES视图
SQL> col tablespace_name for a15 SQL> SELECT TABLESPACE_NAME, 2 SUM(MAXBYTES) / 1024 5 FROM DBA_DATA_FILES 6 GROUP BY TABLESPACE_NAME 7 UNION 8 SELECT TABLESPACE_NAME, 9 SUM(MAXBYTES) / 1024 12 FROM DBA_TEMP_FILES 13 GROUP BY TABLESPACE_NAME; TABLESPACE_NAME SUM(MAXBYTES)/1024 --------------- ------------------ EXAMPLE 33554416 OGG 5242880 SYSAUX 33554416 SYSTEM 33554416 TEMP 33554416 TS_INDEX_BASE 15728640 TS_PUB_BASE 15728640 UNDOTBS1 33554416 USERS 33554416 9 rows selected.
3.查询DBA_HIST_TBSPC_SPACE_USAGE视图
SQL> col name for a15 SQL> SELECT NAME, TABLESPACE_MAXSIZE 2 FROM DBA_HIST_TBSPC_SPACE_USAGE A, V$TABLESPACE B 3 WHERE A.TABLESPACE_ID = B.TS# 4 AND SNAP_ID = (select MAX(snap_id) FROM DBA_HIST_TBSPC_SPACE_USAGE) 5 ORDER BY NAME ; NAME TABLESPACE_MAXSIZE --------------- ------------------ EXAMPLE 4194302 OGG 655360 SYSAUX 4194302 SYSTEM 4194302 TEMP 4194302 TS_INDEX_BASE 1966080 TS_PUB_BASE 1966080 UNDOTBS1 8388604 USERS 4194302 9 rows selected.
观察者两个视图的运行结果,DBA_HIST_TBSPC_SPACE_USAGE视图收集到的统计大小和实际大小都存在一定的误差,但是UNDO表空间出入太明显(UNDOTBS1),特别是最大值和当前值,几乎是真实大小的两倍
4.排除原因
4.1)收集信息是否是最新
SQL> select MAX(rtime) FROM DBA_HIST_TBSPC_SPACE_USAGE; MAX(RTIME) ------------------------- 01/09/2012 15:00:50
4.2)statistics_level是否被设置为basic
SQL> show parameter statistics_level; NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ statistics_level string TYPICAL
通过这两个查询证明,收集信息和statistics_level都是符合要求,那么为什么undo空间的空间信息还是正常的两倍呢?
5.怀疑bug,查询mos

6.查询11.2.0.3中DBA_HIST_TBSPC_SPACE_USAGE是否正常
SQL> select * from v$version; BANNER -------------------------------------------------------------------------------- Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production PL/SQL Release 11.2.0.3.0 - Production CORE 11.2.0.3.0 Production TNS for Linux: Version 11.2.0.3.0 - Production NLSRTL Version 11.2.0.3.0 - Production SQL> col name for a15 SQL> SELECT NAME, TABLESPACE_MAXSIZE 2 FROM DBA_HIST_TBSPC_SPACE_USAGE A, V$TABLESPACE B 3 WHERE A.TABLESPACE_ID = B.TS# 4 AND SNAP_ID = (select MAX(snap_id) FROM DBA_HIST_TBSPC_SPACE_USAGE) 5 ORDER BY NAME ; NAME TABLESPACE_MAXSIZE --------------- ------------------ DRSYS_1 4194302 EXAMPLE 4194302 ODU 8139262 SYSAUX 4194302 SYSTEM 4194302 TEMP 4194302 TEST_OCP 4194302 UNDOTBS01 3938560 USERS 4194302 9 rows selected. SQL> col tablespace_name for a15 SQL> SELECT TABLESPACE_NAME, 2 SUM(MAXBYTES) / 1024 3 5 FROM DBA_DATA_FILES 6 GROUP BY TABLESPACE_NAME 7 UNION 8 SELECT TABLESPACE_NAME, 9 SUM(MAXBYTES) / 1024 12 FROM DBA_TEMP_FILES 13 GROUP BY TABLESPACE_NAME; TABLESPACE_NAME SUM(MAXBYTES)/1024 --------------- ------------------ DRSYS_1 33554416 EXAMPLE 33554416 ODU 65114096 SYSAUX 33554416 SYSTEM 33554416 TEMP 33554416 TEST_OCP 33554416 UNDOTBS01 31457280 USERS 33554416 9 rows selected.
通过对比,发现基本误差不大,确定在该版本,bug7578292已经被修复
topas命令详解
1.AIX系统版本
-bash-3.00$ oslevel 5.3.0.0
2.topas命令
Topas Monitor for host: aix EVENTS/QUEUES FILE/TTY
Sun Jan 8 12:51:10 2012 Interval: 2 Cswitch 5529 Readch 11620
Syscall 6767 Writech 4357.3K
Kernel 1.7 |# | Reads 7 Rawin 1
User 1.6 |# | Writes 40 Ttyout 403
Wait 0.3 |# | Forks 0 Igets 0
Idle 96.4 |############################| Execs 0 Namei 6
Runqueue 0.0 Dirblk 0
Network KBPS I-Pack O-Pack KB-In KB-Out Waitqueue 0.0
en0 0.9 2.0 3.0 0.3 0.7
lo0 0.1 1.0 1.0 0.0 0.0 PAGING MEMORY
Faults 8 Real,MB 1904
Disk Busy% KBPS TPS KB-Read KB-Writ Steals 0 % Comp 46
hdisk0 11.5 4.2K 27.5 0.0 4.2K PgspIn 0 % Noncomp 11
hdisk1 0.0 0.0 0.0 0.0 0.0 PgspOut 0 % Client 11
cd0 0.0 0.0 0.0 0.0 0.0 PageIn 0
PageOut 1064 PAGING SPACE
Name PID CPU% PgSp Owner Sios 1064 Size,MB 512
python 2400440 1.3 6.0 sutl % Used 43
python 2433268 1.2 6.8 sutl NFS (calls/sec) % Free 57
trclogio 1650830 0.3 0.5 matthewH ServerV2 0
dtgreet 340036 0.1 1.3 root ClientV2 0 Press:
topas 1953858 0.1 1.7 xifenfei ServerV3 0 "h" for help
muxatmd 106652 0.0 0.6 root ClientV3 0 "q" to quit
gil 69666 0.0 0.9 root
xmgc 49176 0.0 0.4 root
sshd 2257092 0.0 2.8 pillow
sshd 1101952 0.0 1.6 xifenfei
ping 848098 0.0 0.3 sylaryy
X 331840 0.0 1.6 root
bash 2756856 0.0 0.6 pillow
ctlcallp 1966120 0.0 0.2 aug_gucc
ctlsmsno 1622246 0.0 0.2 aug_gucc
jobcente 1237188 0.0 0.2 aug_gucc
sshd 1744962 0.0 0.9 nanmingy
searchpl 1015886 0.0 0.2 aug_gucc
ctlinfos 254072 0.0 0.3 aug_gucc
msgcente 1511494 0.0 0.2 aug_gucc
3.topas参数列表
Usage: topas [-d number_of_monitored_hot_disks]
[-h show help information]
[-i monitoring_interval_in_seconds]
[-m Use monochrome mode - no colors]
[-n number_of_monitored_hot_network_interfaces]
[-p number_of_monitored_hot_processes]
[-w number_of_monitored_hot_WLM classes]
[-c number_of_monitored_hot_CPUs]
[-P show full-screen Process Display]
[-L show full-screen Logical Partition display]
[-U username - show username owned processes with -P]
[-D show full-screen Disk display]
[-W show full-screen WLM Display]
[-C show full-screen Cross-LPAR (CEC) Panel]
[-R background Cross-LPAR (CEC) Recording mode]
[-I remote monitoring interval in seconds]
[-o availmem=total memory size allocated in MB]
[-o unavailmem=total memory size unallocated in MB]
[-o availcpu=total cpus allocated on the HMC]
[-o unavailcpu=total cpus unallocated on the HMC]
[-o partitions=number of partitions defined on the HMC]
[-o reconfig=reconf interval in sec 30/60(default)/90/120/180/240/300]
[-o poolsize=defined Pool Size required if HMC PUA restricts access]
[-o managedsys=managed system name under which this partition is configured]
[-o hmc=HMC name under which this partition is configured]
HMC configuration and recording information documented in:
/usr/lpp/perfagent/README.perfagent.tools
4.主要参数说明
-d:指定要监控的磁盘数。如果屏幕显示的区域足够大的话,这也是磁盘可显示的最大数。当这个数值超出实际安装的磁盘数,仅仅监控并显示安装磁盘的信息,默认为20.
-i:设置监控时间的间隔(秒),默认为2秒。
-n:设置被监控的网络接口数。这也是在屏幕上显示的最大网络接口数,当这个值超出了实际安装的网络接口数,仅仅监控并显示安装的网络接口信息。参数缺省值为20,如果这个值设为0,将没有网络信息被监控。
-p:指定被监控的进程数。这也是在屏幕上显示的最大进程数。缺省值为20,如果这个值为0,没见进程信息被监控。得到的这些进程信息是topas的主要开销。如果进程信息不需要,通常使用这个参数指定那些不想要的进程信息。
-w:指明被监控的WLM类的值。这也是在屏幕上显示的最大WLM值,当这个值超出了实际安装的WLM数,仅仅显示那些安装的WLM类的信息,缺省值为20,如果这个值为0,将没有信息显示。
-c:指明被监控的CPU数。这也是在屏幕上显示的最大CPU值,当这个值超出了实际安装的CPU数,仅仅显示那些安装的CPU的信息,缺省值为20,如果这个值为0,将没有信息显示。
-P:显示整个全屏进程显示。显示一系列最繁忙的进程,类似于默认显示的一部分,每个进程仅仅显示几列,可以通过任何列排序。
-W:显示全屏WLM(负载管理器)类。顶部显示一系列WLM类,类似于WLM类的一部分显示。
其实topas命令等同于:topas -d20 -i2 -n20 -p20 -w20 -c20
5.一些子命令说明
a 这个键显示所有被监控的(CPU,network,disk,WLM,process)不同部分。
c 这个键在累积报告和最忙的CPU之间进行切换。
d 这个键在最忙的磁盘和系统总的磁盘活动之间进行切换。
h 显示帮助信息。
n 这个键在最忙的接口和系统总的网络活动之间进行切换。
w 这个键在WLM类的开关之间切换。
p 这个键进程的开关之间切换。
P 这个键用全屏进程显示替换默认的显示。
W 这个键用全屏WLM类显示替换默认显示。
f 移动光标到一个WLM类上,这个键可以在WLM屏幕底端显示一系列进程情况。
q 退出程序。
r 更新显示。
6.topas内容说明
6.1)反映CPU使用率和工作状况
User%:进程所执行的用户态的状态占用的CPU百分比;
Kern%:进程所执行的内核态所占用CPU百分比;
Wait%:显示等待IO操作所占用的CPU百分比;
Idel%:显示CPU空闲的百分比。
6.2)反映网络使用率的状况
Interf:网络接口的名称;
KBPS:在监控期间每秒钟通过的以M为单位的总流量,这部分包括每秒钟接收和发送的总和。
I-Pack:在监控期间每秒钟接收到的数据包数;
O-Pack:在监控期间每秒钟发送的数据包数;
KB-In:在监控期间每秒钟接收到的字节数(以K为单位);
KB-Out:在监控期间每秒钟发送的字节数(以K为单位)。
6.3)反映磁盘使用率的状况
Disk:物理磁盘的名称;
Busy%:磁盘繁忙的百分比,即磁盘能满足的最大IOPS(每秒IO操作数)和当前IO数量的比率
KBPS:在监控期间每秒钟读写的字节数(以K为单位);
TPS:每秒钟物理磁盘的数据传输量。一次传输指的是一次I/O请求;
KB-Read:每秒钟从物理磁盘读出的K字节数;
KB-Write:每秒钟向物理磁盘写入的K字节数。
6.4)反映进程的状况
NAME:可执行程序的名称;
Process ID:进程的ID号;
%CPU Utilization:进程的CPU平均使用率,这个值指的是进程在生命周期中的平均使用率;
Paging Space Used:分配给进程的分页空间大小;
Process Owner:拥有这个进程的用户名;
Workload Management(WLM)Class:进程属于哪个WLM class。
6.5)EVENTS/QUEUES 状况
Cswitch:在监控期间每秒钟上下文转换的次数;
Syscall:在监控期间每秒钟运行系统呼叫的总次数;
Reads:在监控期间每秒钟运行读系统呼叫的次数;
Writes:在监控期间每秒钟运行写系统呼叫的次数;
Forks:在监控期间每秒钟运行派生系统呼叫的次数;
Execs:在监控期间每秒钟运行执行系统呼叫的次数;
Runqueue:等待处理器空闲以便运行的线程平均数;
Waitqueue:等待分页完成的线程平均数;
6.6)FILE/TTY 状况
Readch:在监控期间通过读系统呼叫每秒钟读出的字节数;
Writech:在监控期间通过写系统呼叫每秒钟写入的字节数;
Rawin:在监控期间通过TTYs每秒钟读入的裸字节数;
Ttyout:在监控期间每秒钟写入TTYs的字节数;
Igets:在监控期间每秒钟到节点查找例行程序的呼叫数;
Namei:在监控期间每秒钟路径查找例行程序的呼叫数;
Dirblk:在监控期间通过目录搜索例行程序每秒钟扫描到的目录块数;
6.7)PAGING部分显示的是分页每秒钟次数的统计
Faults:在监控期间每秒钟发生的分页错误总数;
Steals:在监控期间被虚拟内存管理器每秒钟偷走的物理内存4k的帧;
PgspIn:在监控期间每秒钟从分页空间中读取的4k分页数;
PgspOut:在监控期间每秒钟写入分页空间的4k分页数;
PageIn:在监控期间每秒钟读取的4k分页数,这包括从文件系统读操作相关的分页活动,从这个值减去PgspIn的值就是每秒钟从文件系统读操作相关的分页活动数;
PageOut:在监控期间每秒钟写入的4k分页数,这包括向文件系统写操作相关的分页活动,从这个值减去PgspOut的值就是每秒钟向文件系统写操作相关的分页活动数;
Sios:在监控期间通过虚拟内存管理器得到的I/O请求数;
6.8)Momory部分显示的是实际内存和在使用中的内存
Real,MB:以M为单位的实际内存;
%Comp:当前分配给计算分页片断的内存占实际内存的百分比。计算分页片断由分页空间产生。
%Nocomp:当前分配非计算分页片断的内存占实际内存的百分比。非计算分页片断包括那些文件空间,数据文件、可执行文件、或者共享库文件。
%Client:当前分配给用来缓冲远程挂载文件的内存占实际内存的百分比。
6.9)PAING SPACE显示分页空间的大小和使用率
Size,MB:系统里所有分页空间的大小,以M为单位;
%Used:当前使用的分页空间占总空间的百分比;
%Free:当前空闲的分页空间占总空间的百分比。
6.10)NFS (calls/sec)状况
Display NFS stats in calls per second
在RAC中expdp 修改Service_Name
检查数据库日志文件,发现在执行expdp操作前后都有修改service_names操作
1.数据库版本信息
SQL> select instance_name from v$instance; INSTANCE_NAME ---------------- ora9i2 SQL> select * from v$version; BANNER ---------------------------------------------------------------- Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bi PL/SQL Release 10.2.0.4.0 - Production CORE 10.2.0.4.0 Production TNS for Linux IA64: Version 10.2.0.4.0 - Production NLSRTL Version 10.2.0.4.0 - Production spfile文件中,无service_names配置,证明都是在修改MEMORY中。
2.alert日志内容
Thu Jan 5 01:10:06 2012
The value (30) of MAXTRANS parameter ignored.
Thu Jan 5 01:10:09 2012
ALTER SYSTEM SET service_names='ora9i','SYS$SYS.KUPC$C_2_20120105011007.ORA9I' SCOPE=MEMORY SID='ora9i2';
Thu Jan 5 01:10:09 2012
ALTER SYSTEM SET service_names='SYS$SYS.KUPC$C_2_20120105011007.ORA9I','ora9i','SYS$SYS.KUPC$S_2_20120105011007.ORA9I' SCOPE=MEMORY SID='ora9i2';
kupprdp: master process DM00 started with pid=305, OS id=9526
to execute - SYS.KUPM$MCP.MAIN('SYS_EXPORT_TABLE_05', 'VAS', 'KUPC$C_2_20120105011007', 'KUPC$S_2_20120105011007', 0);
kupprdp: worker process DW01 started with worker id=1, pid=307, OS id=9641
to execute - SYS.KUPW$WORKER.MAIN('SYS_EXPORT_TABLE_05', 'VAS');
kupprdp: worker process DW02 started with worker id=2, pid=308, OS id=9964
to execute - SYS.KUPW$WORKER.MAIN('SYS_EXPORT_TABLE_05', 'VAS');
kupprdp: worker process DW03 started with worker id=3, pid=309, OS id=9966
to execute - SYS.KUPW$WORKER.MAIN('SYS_EXPORT_TABLE_05', 'VAS');
kupprdp: worker process DW04 started with worker id=4, pid=310, OS id=9968
to execute - SYS.KUPW$WORKER.MAIN('SYS_EXPORT_TABLE_05', 'VAS');
Thu Jan 5 01:13:15 2012
ALTER SYSTEM SET service_names='SYS$SYS.KUPC$S_2_20120105011007.ORA9I','ora9i' SCOPE=MEMORY SID='ora9i2';
Thu Jan 5 01:13:16 2012
ALTER SYSTEM SET service_names='ora9i' SCOPE=MEMORY SID='ora9i2';
3.MOS解决信息[ID 1269319.1]
Depending on the version of your database, Patch:8513146 may exist. As of Nov. 25th 2010, this patch exists for: - 10.2.0.4 / IBM AIX on POWER Systems (64-bit) - 10.2.0.4.3 / Linux x86-64 - 10.2.0.5 / Linux x86 and Linux x86-64
RAC中关于"Immediate Kill Session#" bug记录
今天在rac的一个节点上发现很多Immediate Kill Session#的错误,分析记录如下
1.alert日志内容
Sun Jan 1 02:12:28 2012 ALTER SYSTEM SET service_names='' SCOPE=MEMORY SID='ora9i1'; Sun Jan 1 02:12:28 2012 Immediate Kill Session#: 496, Serial#: 51199 Immediate Kill Session: sess: 0x406bfa26b78 OS pid: 12900 Immediate Kill Session#: 497, Serial#: 38504 Immediate Kill Session: sess: 0x406bfa280e0 OS pid: 12496 Immediate Kill Session#: 499, Serial#: 45296 Immediate Kill Session: sess: 0x406bfa2abb0 OS pid: 12467 Immediate Kill Session#: 502, Serial#: 18910 Immediate Kill Session: sess: 0x406bfa2ebe8 OS pid: 28887 Immediate Kill Session#: 503, Serial#: 26631 Immediate Kill Session: sess: 0x406bfa30150 OS pid: 20749 Immediate Kill Session#: 508, Serial#: 63586 Immediate Kill Session: sess: 0x406bfa36c58 OS pid: 27614 Immediate Kill Session#: 512, Serial#: 43388 Immediate Kill Session: sess: 0x406bfa3c1f8 OS pid: 4021 Immediate Kill Session#: 516, Serial#: 33975 Immediate Kill Session: sess: 0x406bfa41798 OS pid: 18481 Immediate Kill Session#: 517, Serial#: 24240 Immediate Kill Session: sess: 0x406bfa42d00 OS pid: 823 Immediate Kill Session#: 526, Serial#: 59767 Immediate Kill Session: sess: 0x406bfa4eda8 OS pid: 12529 Immediate Kill Session#: 527, Serial#: 45765 Immediate Kill Session: sess: 0x406bfa50310 OS pid: 6059 …………………… Sun Jan 1 02:22:29 2012 ALTER SYSTEM SET service_names='ora9i' SCOPE=MEMORY SID='ora9i1';
2.数据库配置
2.1)A节点相关配置
SQL> select instance_name from v$instance; INSTANCE_NAME ---------------- ora9i1 SQL> select * from v$version; BANNER ---------------------------------------------------------------- Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bi PL/SQL Release 10.2.0.4.0 - Production CORE 10.2.0.4.0 Production TNS for Linux IA64: Version 10.2.0.4.0 - Production NLSRTL Version 10.2.0.4.0 - Production SQL> show parameter name; NAME TYPE VALUE ------------------------------------ ---------- -------------------- db_file_name_convert string db_name string ora9i db_unique_name string ora9i global_names boolean FALSE instance_name string ora9i1 lock_name_space string log_file_name_convert string service_names string ora9i
2.2)B节点相关配置
SQL> select instance_name from v$instance;
INSTANCE_NAME
----------------
ora9i2
SQL> select * from v$version;
BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bi
PL/SQL Release 10.2.0.4.0 - Production
CORE 10.2.0.4.0 Production
TNS for Linux IA64: Version 10.2.0.4.0 - Production
NLSRTL Version 10.2.0.4.0 - Production
SQL> show parameter name;
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
db_file_name_convert string
db_name string ora9i
db_unique_name string ora9i
global_names boolean FALSE
instance_name string ora9i2
lock_name_space string
log_file_name_convert string
service_names string SYS$SYS.KUPC$C_2_2012010601100
6.ORA9I, ora9i, SYS$SYS.KUPC$S
_2_20120106011006.ORA9I
3.查看MOS,寻找解决方案
3.1)产生该问题原因
This is caused by unpublished Bug 6955040 ALL THE SESSIONS LOST CONNECTION AFTER KILLING CRSD.BIN. The problem is when CRSD is killed or crashed and restarted, CRSD will run resource check action but CRS resource status will not be available at that time. Then in instance check action, it fails to get the preferred node VIP resource status and considered the preferred node VIP resource is not running. Therefore, instance check action will remove the default database service name and disconnect sessions connected using default database service name. This causes messages "ALTER SYSTEM" and "Immediate Kill Session" printed in alert log.
3.2)解决方案
1) The fix is included in 10.2.0.5 patchset and 11.1.0.7 patchset.
Apply the patchset once they are available.
OR
2) Configure a service name other than the default one (same as db_name),
and get user to use the non-default service name for connection.