topas命令详解

1.AIX系统版本

-bash-3.00$ oslevel
5.3.0.0

2.topas命令

Topas Monitor for host:    aix                  EVENTS/QUEUES    FILE/TTY
Sun Jan  8 12:51:10 2012   Interval:  2         Cswitch    5529  Readch    11620
                                                Syscall    6767  Writech 4357.3K
Kernel    1.7   |#                           |  Reads         7  Rawin         1
User      1.6   |#                           |  Writes       40  Ttyout      403
Wait      0.3   |#                           |  Forks         0  Igets         0
Idle     96.4   |############################|  Execs         0  Namei         6
                                                Runqueue    0.0  Dirblk        0
Network  KBPS   I-Pack  O-Pack   KB-In  KB-Out  Waitqueue   0.0
en0       0.9      2.0     3.0     0.3     0.7
lo0       0.1      1.0     1.0     0.0     0.0  PAGING           MEMORY
                                                Faults        8  Real,MB    1904
Disk    Busy%     KBPS     TPS KB-Read KB-Writ  Steals        0  % Comp     46
hdisk0   11.5     4.2K   27.5     0.0     4.2K  PgspIn        0  % Noncomp  11
hdisk1    0.0     0.0     0.0     0.0     0.0   PgspOut       0  % Client   11
cd0       0.0     0.0     0.0     0.0     0.0   PageIn        0
                                                PageOut    1064  PAGING SPACE
Name            PID  CPU%  PgSp Owner           Sios       1064  Size,MB     512
python      2400440   1.3   6.0 sutl                             % Used     43
python      2433268   1.2   6.8 sutl            NFS (calls/sec)  % Free     57
trclogio    1650830   0.3   0.5 matthewH        ServerV2       0
dtgreet      340036   0.1   1.3 root            ClientV2       0   Press:
topas       1953858   0.1   1.7 xifenfei        ServerV3       0   "h" for help
muxatmd      106652   0.0   0.6 root            ClientV3       0   "q" to quit
gil           69666   0.0   0.9 root
xmgc          49176   0.0   0.4 root
sshd        2257092   0.0   2.8 pillow
sshd        1101952   0.0   1.6 xifenfei
ping         848098   0.0   0.3 sylaryy
X            331840   0.0   1.6 root
bash        2756856   0.0   0.6 pillow
ctlcallp    1966120   0.0   0.2 aug_gucc
ctlsmsno    1622246   0.0   0.2 aug_gucc
jobcente    1237188   0.0   0.2 aug_gucc
sshd        1744962   0.0   0.9 nanmingy
searchpl    1015886   0.0   0.2 aug_gucc
ctlinfos     254072   0.0   0.3 aug_gucc
msgcente    1511494   0.0   0.2 aug_gucc

3.topas参数列表

 Usage: topas [-d number_of_monitored_hot_disks]
              [-h show help information]
              [-i monitoring_interval_in_seconds]
              [-m Use monochrome mode - no colors]
              [-n number_of_monitored_hot_network_interfaces]
              [-p number_of_monitored_hot_processes]
              [-w number_of_monitored_hot_WLM classes]
              [-c number_of_monitored_hot_CPUs]
              [-P show full-screen Process Display]
              [-L show full-screen Logical Partition display]
              [-U username - show username owned processes with -P]
              [-D show full-screen Disk display]
              [-W show full-screen WLM Display]
              [-C show full-screen Cross-LPAR (CEC) Panel]
              [-R background Cross-LPAR (CEC) Recording mode]
              [-I remote monitoring interval in seconds]
              [-o availmem=total memory size allocated in MB]
              [-o unavailmem=total memory size unallocated in MB]
              [-o availcpu=total cpus allocated on the HMC]
              [-o unavailcpu=total cpus unallocated on the HMC]
              [-o partitions=number of partitions defined on the HMC]
              [-o reconfig=reconf interval in sec 30/60(default)/90/120/180/240/300]
              [-o poolsize=defined Pool Size required if HMC PUA restricts access]
              [-o managedsys=managed system name under which this partition is configured]
              [-o hmc=HMC name under which this partition is configured]
              HMC configuration and recording information documented in:
                  /usr/lpp/perfagent/README.perfagent.tools

4.主要参数说明
-d:指定要监控的磁盘数。如果屏幕显示的区域足够大的话,这也是磁盘可显示的最大数。当这个数值超出实际安装的磁盘数,仅仅监控并显示安装磁盘的信息,默认为20.
-i:设置监控时间的间隔(秒),默认为2秒。
-n:设置被监控的网络接口数。这也是在屏幕上显示的最大网络接口数,当这个值超出了实际安装的网络接口数,仅仅监控并显示安装的网络接口信息。参数缺省值为20,如果这个值设为0,将没有网络信息被监控。
-p:指定被监控的进程数。这也是在屏幕上显示的最大进程数。缺省值为20,如果这个值为0,没见进程信息被监控。得到的这些进程信息是topas的主要开销。如果进程信息不需要,通常使用这个参数指定那些不想要的进程信息。
-w:指明被监控的WLM类的值。这也是在屏幕上显示的最大WLM值,当这个值超出了实际安装的WLM数,仅仅显示那些安装的WLM类的信息,缺省值为20,如果这个值为0,将没有信息显示。
-c:指明被监控的CPU数。这也是在屏幕上显示的最大CPU值,当这个值超出了实际安装的CPU数,仅仅显示那些安装的CPU的信息,缺省值为20,如果这个值为0,将没有信息显示。
-P:显示整个全屏进程显示。显示一系列最繁忙的进程,类似于默认显示的一部分,每个进程仅仅显示几列,可以通过任何列排序。
-W:显示全屏WLM(负载管理器)类。顶部显示一系列WLM类,类似于WLM类的一部分显示。
其实topas命令等同于:topas -d20 -i2 -n20 -p20 -w20 -c20

5.一些子命令说明
a 这个键显示所有被监控的(CPU,network,disk,WLM,process)不同部分。
c 这个键在累积报告和最忙的CPU之间进行切换。
d 这个键在最忙的磁盘和系统总的磁盘活动之间进行切换。
h 显示帮助信息。
n 这个键在最忙的接口和系统总的网络活动之间进行切换。
w 这个键在WLM类的开关之间切换。
p 这个键进程的开关之间切换。
P 这个键用全屏进程显示替换默认的显示。
W 这个键用全屏WLM类显示替换默认显示。
f 移动光标到一个WLM类上,这个键可以在WLM屏幕底端显示一系列进程情况。
q 退出程序。
r 更新显示。

6.topas内容说明
6.1)反映CPU使用率和工作状况
User%:进程所执行的用户态的状态占用的CPU百分比;
Kern%:进程所执行的内核态所占用CPU百分比;
Wait%:显示等待IO操作所占用的CPU百分比;
Idel%:显示CPU空闲的百分比。

6.2)反映网络使用率的状况
Interf:网络接口的名称;
KBPS:在监控期间每秒钟通过的以M为单位的总流量,这部分包括每秒钟接收和发送的总和。
I-Pack:在监控期间每秒钟接收到的数据包数;
O-Pack:在监控期间每秒钟发送的数据包数;
KB-In:在监控期间每秒钟接收到的字节数(以K为单位);
KB-Out:在监控期间每秒钟发送的字节数(以K为单位)。

6.3)反映磁盘使用率的状况
Disk:物理磁盘的名称;
Busy%:磁盘繁忙的百分比,即磁盘能满足的最大IOPS(每秒IO操作数)和当前IO数量的比率
KBPS:在监控期间每秒钟读写的字节数(以K为单位);
TPS:每秒钟物理磁盘的数据传输量。一次传输指的是一次I/O请求;
KB-Read:每秒钟从物理磁盘读出的K字节数;
KB-Write:每秒钟向物理磁盘写入的K字节数。

6.4)反映进程的状况
NAME:可执行程序的名称;
Process ID:进程的ID号;
%CPU Utilization:进程的CPU平均使用率,这个值指的是进程在生命周期中的平均使用率;
Paging Space Used:分配给进程的分页空间大小;
Process Owner:拥有这个进程的用户名;
Workload Management(WLM)Class:进程属于哪个WLM class。

6.5)EVENTS/QUEUES 状况
Cswitch:在监控期间每秒钟上下文转换的次数;
Syscall:在监控期间每秒钟运行系统呼叫的总次数;
Reads:在监控期间每秒钟运行读系统呼叫的次数;
Writes:在监控期间每秒钟运行写系统呼叫的次数;
Forks:在监控期间每秒钟运行派生系统呼叫的次数;
Execs:在监控期间每秒钟运行执行系统呼叫的次数;
Runqueue:等待处理器空闲以便运行的线程平均数;
Waitqueue:等待分页完成的线程平均数;

6.6)FILE/TTY 状况
Readch:在监控期间通过读系统呼叫每秒钟读出的字节数;
Writech:在监控期间通过写系统呼叫每秒钟写入的字节数;
Rawin:在监控期间通过TTYs每秒钟读入的裸字节数;
Ttyout:在监控期间每秒钟写入TTYs的字节数;
Igets:在监控期间每秒钟到节点查找例行程序的呼叫数;
Namei:在监控期间每秒钟路径查找例行程序的呼叫数;
Dirblk:在监控期间通过目录搜索例行程序每秒钟扫描到的目录块数;

6.7)PAGING部分显示的是分页每秒钟次数的统计
Faults:在监控期间每秒钟发生的分页错误总数;
Steals:在监控期间被虚拟内存管理器每秒钟偷走的物理内存4k的帧;
PgspIn:在监控期间每秒钟从分页空间中读取的4k分页数;
PgspOut:在监控期间每秒钟写入分页空间的4k分页数;
PageIn:在监控期间每秒钟读取的4k分页数,这包括从文件系统读操作相关的分页活动,从这个值减去PgspIn的值就是每秒钟从文件系统读操作相关的分页活动数;
PageOut:在监控期间每秒钟写入的4k分页数,这包括向文件系统写操作相关的分页活动,从这个值减去PgspOut的值就是每秒钟向文件系统写操作相关的分页活动数;
Sios:在监控期间通过虚拟内存管理器得到的I/O请求数;

6.8)Momory部分显示的是实际内存和在使用中的内存
Real,MB:以M为单位的实际内存;
%Comp:当前分配给计算分页片断的内存占实际内存的百分比。计算分页片断由分页空间产生。
%Nocomp:当前分配非计算分页片断的内存占实际内存的百分比。非计算分页片断包括那些文件空间,数据文件、可执行文件、或者共享库文件。
%Client:当前分配给用来缓冲远程挂载文件的内存占实际内存的百分比。

6.9)PAING SPACE显示分页空间的大小和使用率
Size,MB:系统里所有分页空间的大小,以M为单位;
%Used:当前使用的分页空间占总空间的百分比;
%Free:当前空闲的分页空间占总空间的百分比。

6.10)NFS (calls/sec)状况
Display NFS stats in calls per second

在RAC中expdp 修改Service_Name

检查数据库日志文件,发现在执行expdp操作前后都有修改service_names操作
1.数据库版本信息

SQL>  select instance_name from v$instance;
INSTANCE_NAME
----------------
ora9i2
SQL>  select * from v$version;
BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bi
PL/SQL Release 10.2.0.4.0 - Production
CORE    10.2.0.4.0      Production
TNS for Linux IA64: Version 10.2.0.4.0 - Production
NLSRTL Version 10.2.0.4.0 - Production
spfile文件中,无service_names配置,证明都是在修改MEMORY中。

2.alert日志内容

Thu Jan  5 01:10:06 2012
The value (30) of MAXTRANS parameter ignored.
Thu Jan  5 01:10:09 2012
ALTER SYSTEM SET service_names='ora9i','SYS$SYS.KUPC$C_2_20120105011007.ORA9I' SCOPE=MEMORY SID='ora9i2';
Thu Jan  5 01:10:09 2012
ALTER SYSTEM SET service_names='SYS$SYS.KUPC$C_2_20120105011007.ORA9I','ora9i','SYS$SYS.KUPC$S_2_20120105011007.ORA9I' SCOPE=MEMORY SID='ora9i2';
kupprdp: master process DM00 started with pid=305, OS id=9526
         to execute - SYS.KUPM$MCP.MAIN('SYS_EXPORT_TABLE_05', 'VAS', 'KUPC$C_2_20120105011007', 'KUPC$S_2_20120105011007', 0);
kupprdp: worker process DW01 started with worker id=1, pid=307, OS id=9641
         to execute - SYS.KUPW$WORKER.MAIN('SYS_EXPORT_TABLE_05', 'VAS');
kupprdp: worker process DW02 started with worker id=2, pid=308, OS id=9964
         to execute - SYS.KUPW$WORKER.MAIN('SYS_EXPORT_TABLE_05', 'VAS');
kupprdp: worker process DW03 started with worker id=3, pid=309, OS id=9966
         to execute - SYS.KUPW$WORKER.MAIN('SYS_EXPORT_TABLE_05', 'VAS');
kupprdp: worker process DW04 started with worker id=4, pid=310, OS id=9968
         to execute - SYS.KUPW$WORKER.MAIN('SYS_EXPORT_TABLE_05', 'VAS');
Thu Jan  5 01:13:15 2012
ALTER SYSTEM SET service_names='SYS$SYS.KUPC$S_2_20120105011007.ORA9I','ora9i' SCOPE=MEMORY SID='ora9i2';
Thu Jan  5 01:13:16 2012
ALTER SYSTEM SET service_names='ora9i' SCOPE=MEMORY SID='ora9i2';

3.MOS解决信息[ID 1269319.1]

Depending on the version of your database, Patch:8513146 may exist.
As of Nov. 25th 2010, this patch exists for:
- 10.2.0.4 / IBM AIX on POWER Systems (64-bit)
- 10.2.0.4.3 / Linux x86-64
- 10.2.0.5 / Linux x86 and Linux x86-64

RAC中关于"Immediate Kill Session#" bug记录

今天在rac的一个节点上发现很多Immediate Kill Session#的错误,分析记录如下
1.alert日志内容

Sun Jan  1 02:12:28 2012
ALTER SYSTEM SET service_names='' SCOPE=MEMORY SID='ora9i1';
Sun Jan  1 02:12:28 2012
Immediate Kill Session#: 496, Serial#: 51199
Immediate Kill Session: sess: 0x406bfa26b78  OS pid: 12900
Immediate Kill Session#: 497, Serial#: 38504
Immediate Kill Session: sess: 0x406bfa280e0  OS pid: 12496
Immediate Kill Session#: 499, Serial#: 45296
Immediate Kill Session: sess: 0x406bfa2abb0  OS pid: 12467
Immediate Kill Session#: 502, Serial#: 18910
Immediate Kill Session: sess: 0x406bfa2ebe8  OS pid: 28887
Immediate Kill Session#: 503, Serial#: 26631
Immediate Kill Session: sess: 0x406bfa30150  OS pid: 20749
Immediate Kill Session#: 508, Serial#: 63586
Immediate Kill Session: sess: 0x406bfa36c58  OS pid: 27614
Immediate Kill Session#: 512, Serial#: 43388
Immediate Kill Session: sess: 0x406bfa3c1f8  OS pid: 4021
Immediate Kill Session#: 516, Serial#: 33975
Immediate Kill Session: sess: 0x406bfa41798  OS pid: 18481
Immediate Kill Session#: 517, Serial#: 24240
Immediate Kill Session: sess: 0x406bfa42d00  OS pid: 823
Immediate Kill Session#: 526, Serial#: 59767
Immediate Kill Session: sess: 0x406bfa4eda8  OS pid: 12529
Immediate Kill Session#: 527, Serial#: 45765
Immediate Kill Session: sess: 0x406bfa50310  OS pid: 6059
……………………
Sun Jan  1 02:22:29 2012
ALTER SYSTEM SET service_names='ora9i' SCOPE=MEMORY SID='ora9i1';

2.数据库配置
2.1)A节点相关配置

SQL> select instance_name from v$instance;
INSTANCE_NAME
----------------
ora9i1
SQL> select * from v$version;
BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bi
PL/SQL Release 10.2.0.4.0 - Production
CORE    10.2.0.4.0      Production
TNS for Linux IA64: Version 10.2.0.4.0 - Production
NLSRTL Version 10.2.0.4.0 - Production
SQL> show parameter name;
NAME                                 TYPE       VALUE
------------------------------------ ---------- --------------------
db_file_name_convert                 string
db_name                              string     ora9i
db_unique_name                       string     ora9i
global_names                         boolean    FALSE
instance_name                        string     ora9i1
lock_name_space                      string
log_file_name_convert                string
service_names                        string     ora9i

2.2)B节点相关配置

SQL>  select instance_name from v$instance;
INSTANCE_NAME
----------------
ora9i2
SQL>  select * from v$version;
BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bi
PL/SQL Release 10.2.0.4.0 - Production
CORE    10.2.0.4.0      Production
TNS for Linux IA64: Version 10.2.0.4.0 - Production
NLSRTL Version 10.2.0.4.0 - Production
SQL> show parameter name;
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
db_file_name_convert                 string
db_name                              string      ora9i
db_unique_name                       string      ora9i
global_names                         boolean     FALSE
instance_name                        string      ora9i2
lock_name_space                      string
log_file_name_convert                string
service_names                        string      SYS$SYS.KUPC$C_2_2012010601100
                                                 6.ORA9I, ora9i, SYS$SYS.KUPC$S
                                                 _2_20120106011006.ORA9I

3.查看MOS,寻找解决方案
3.1)产生该问题原因

This is caused by unpublished Bug 6955040 ALL THE SESSIONS LOST CONNECTION AFTER KILLING CRSD.BIN.
The problem is when CRSD is killed or crashed and restarted,
CRSD will run resource check action but CRS resource status will not be available at that time.
Then in instance check action,
it fails to get the preferred node VIP resource status and considered the preferred node VIP resource is not running.
Therefore, instance check action will remove the default database service name
and disconnect sessions connected using default database service name.
This causes messages "ALTER SYSTEM" and "Immediate Kill Session" printed in alert log.

3.2)解决方案

1) The fix is included in 10.2.0.5 patchset and 11.1.0.7 patchset.
    Apply the patchset once they are available.
OR
2) Configure a service name other than the default one (same as db_name),
and get user to use the non-default service name for connection.

记录AUTO_SPACE_ADVISOR_JOB导致负载异常

早上上班,检查数据库,发现监控日志中在晚上1点到4点钟服务器异常负载现象,查看awr日志发现AUTO_SPACE_ADVISOR_JOB运行异常
0.数据库版本

SQL> select * from v$version;
BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bi
PL/SQL Release 10.2.0.4.0 - Production
CORE    10.2.0.4.0      Production
TNS for Linux: Version 10.2.0.4.0 - Production
NLSRTL Version 10.2.0.4.0 - Production

1.资源使用情况
cpu情况

逻辑读情况

物理读情况

通过这些可以看出,在晚上的时间点,AUTO_SPACE_ADVISOR_JOB占用了系统的资源的大部分,导致系统负载过高,也许是出现告警的主要原因。

2.查看运行情况

SQL> col status FOR A10
SQL> COL RUN_DURATION FOR A20
SQL> COL start_date FOR A20
SQL> COL log_date FOR A20
SQL> SELECT status,TO_CHAR(ACTUAL_START_DATE,'YYYY-MM-DD HH24:MI:SS') start_date,
  2  TO_CHAR (log_date, 'YYYY-MM-DD HH24:MI:SS') log_date,RUN_DURATION
  3  FROM dba_scheduler_job_run_details
  4  WHERE job_name = 'AUTO_SPACE_ADVISOR_JOB' order by 3;
STATUS     START_DATE           LOG_DATE             RUN_DURATION
---------- -------------------- -------------------- --------------------
SUCCEEDED  2011-12-31 00:00:02  2011-12-31 00:03:05  +000 00:03:03
SUCCEEDED  2012-01-02 00:00:03  2012-01-02 00:03:05  +000 00:03:03
SUCCEEDED  2012-01-03 00:00:02  2012-01-03 00:02:17  +000 00:02:15
SUCCEEDED  2012-01-04 00:00:02  2012-01-04 00:01:41  +000 00:01:39
SUCCEEDED  2012-01-05 00:01:14  2012-01-05 04:02:05  +000 04:00:51

从这里看出,平时AUTO_SPACE_ADVISOR_JOB运行时间只有3分钟左右的时间就可以结束,昨天异常的运行了4个小时。

3.了解AUTO_SPACE_ADVISOR_JOB作用

SQL> select COMMENTS from dba_scheduler_jobs
  2  where job_name='AUTO_SPACE_ADVISOR_JOB';
COMMENTS
-------------------------------------------
auto space advisor maintenance job

从这里看出,该job的主要作用是是用于segment advisor,如果不使用该功能,可以暂时使用下面语句关闭该job

execute dbms_scheduler.disable('AUTO_SPACE_ADVISOR_JOB');

4.对于AUTO_SPACE_ADVISOR_JOB总结
因为该job的在10.2的某些版本中出现类此bug情况,查看mos发现在10.2.0.4中已经修复,但是我这里因为只是出现了一次,暂时不能定位是bug还是数据库偶尔异常,继续观察,如果再出现类此现象,可以采取临时关闭该job的方式处理。

Liunx系统中Oracle使用HugePages配置

一.系统当前状态

[root@node1 ora11g]#  cat /proc/meminfo | grep -i hugepages
HugePages_Total:     0
HugePages_Free:      0
HugePages_Rsvd:      0
Hugepagesize:     2048 kB
[root@node1 ora11g]# uname -sr
Linux 2.6.18-238.19.1.el5
cat /proc/meminfo|grep PageT
PageTables:      44748 kB

二.计算nr_hugepages值

#!/bin/bash
#
# hugepages_settings.sh
#
# Linux bash script to compute values for the
# recommended HugePages/HugeTLB configuration
#
# Note: This script does calculation for all shared memory
# segments available when the script is run, no matter it
# is an Oracle RDBMS shared memory segment or not.
#
# This script is provided by Doc ID 401749.1 from My Oracle Support
# http://support.oracle.com
# Welcome text
echo "
This script is provided by Doc ID 401749.1 from My Oracle Support
(http://support.oracle.com) where it is intended to compute values for
the recommended HugePages/HugeTLB configuration for the current shared
memory segments. Before proceeding with the execution please make sure
that:
 * Oracle Database instance(s) are up and running
 * Oracle Database 11g Automatic Memory Management (AMM) is not setup
   (See Doc ID 749851.1)
 * The shared memory segments can be listed by command:
     # ipcs -m
Press Enter to proceed..."
read
# Check for the kernel version
KERN=`uname -r | awk -F. '{ printf("%d.%d\n",$1,$2); }'`
# Find out the HugePage size
HPG_SZ=`grep Hugepagesize /proc/meminfo | awk '{print $2}'`
# Initialize the counter
NUM_PG=0
# Cumulative number of pages required to handle the running shared memory segments
for SEG_BYTES in `ipcs -m | awk '{print $5}' | grep "[0-9][0-9]*"`
do
   MIN_PG=`echo "$SEG_BYTES/($HPG_SZ*1024)" | bc -q`
   if [ $MIN_PG -gt 0 ]; then
      NUM_PG=`echo "$NUM_PG+$MIN_PG+1" | bc -q`
   fi
done
RES_BYTES=`echo "$NUM_PG * $HPG_SZ * 1024" | bc -q`
# An SGA less than 100MB does not make sense
# Bail out if that is the case
if [ $RES_BYTES -lt 100000000 ]; then
   echo "***********"
   echo "** ERROR **"
   echo "***********"
   echo "Sorry! There are not enough total of shared memory segments allocated for
HugePages configuration. HugePages can only be used for shared memory segments
that you can list by command:
   # ipcs -m
of a size that can match an Oracle Database SGA. Please make sure that:
 * Oracle Database instance is up and running
 * Oracle Database 11g Automatic Memory Management (AMM) is not configured"
   exit 1
fi
# Finish with results
case $KERN in
   '2.4') HUGETLB_POOL=`echo "$NUM_PG*$HPG_SZ/1024" | bc -q`;
          echo "Recommended setting: vm.hugetlb_pool = $HUGETLB_POOL" ;;
   '2.6') echo "Recommended setting: vm.nr_hugepages = $NUM_PG" ;;
    *) echo "Unrecognized kernel version $KERN. Exiting." ;;
esac
# End

1.Configuring the Script

Create a text file named hugepages_settings.sh
Copy the contents below in the file
Run
	$ chmod +x hugepages_settings.sh

2.Running the Script

Be sure that all applications that are meant to use HugePage / HugeTLB are running at the time the script is to be run.
This includes the Oracle RDBMS instances and ASM instances in addition to other applications.
Be sure that you have /bin and /usr/bin in $PATH
Run
	$ ./hugepages_settings.sh

三.修改系统配置

1./etc/sysctl.conf
vm.nr_hugepages = 770
2./etc/security/limits.conf
oracle soft memlock 102400
oracle hard memlock 102400
计算公式为:>=HugePages_Total*Hugepagesize

四.重启系统检查配置是否生效

[root@node1 ~]# cat /proc/meminfo | grep -i hugepages
HugePages_Total:   770
HugePages_Free:    770
HugePages_Rsvd:      0
Hugepagesize:     2048 kB

五.启动数据库并验证

[oracle@node1 ~]$ sqlplus / as sysdba
SQL*Plus: Release 10.2.0.5.0 - Production on Wed Jan 4 16:06:48 2012
Copyright (c) 1982, 2010, Oracle.  All Rights Reserved.
Connected to an idle instance.
SQL> startup
ORACLE instance started.
Total System Global Area 1610612736 bytes
Fixed Size                  2096632 bytes
Variable Size             486539784 bytes
Database Buffers         1107296256 bytes
Redo Buffers               14680064 bytes
Database mounted.
Database opened.
SQL> exit
[oracle@node1 ~]$ grep Huge /proc/meminfo
HugePages_Total:   770
HugePages_Free:    591
HugePages_Rsvd:    590
Hugepagesize:     2048 kB
[oracle@node1 ~]$ cat /proc/meminfo|grep PageT
PageTables:      13216 kB

HugePages_Free<>HugePages_Total,PageTables变小
证明HugePages配置成功

六.补充说明
1.在数据库内存服务器内存比较大的时候,可以考虑使用该功能,提高系统性能
2.对于Oracl 11g,如果需要使用此功能,需要关闭AMM功能(需要衡量)

ORA-01555 caused by SQL statement below

一.发现ORA-01555

Mon Dec 26 10:08:22 2011
ORA-01555 caused by SQL statement below (Query Duration=49146 sec, SCN: 0x0b4b.17f5ae42):
Mon Dec 26 10:08:22 2011
SELECT COMPANY_ID,
       COMPANY_MOBILE,
       TO_CHAR(NVL(REG_DATE, SYSDATE - 100), 'yyyymmddhh24miss'),
       BAK_FIELD2
  FROM TAB_XN_COMPANY
 WHERE (COMPANY_STATUS = 1 OR
       (COMPANY_STATUS = 3 AND
       NVL(UNREG_DATE, SYSDATE + 100) >=
       TO_DATE('20111226094500', 'yyyymmddhh24miss')))
   AND NVL(REG_DATE, SYSDATE - 100) <=
       TO_DATE('20111226095959', 'yyyymmddhh24miss')
   AND PAY_TYPE > 0

二.数据库状态

[oracle@ora02 ~]$ sqlplus "/ as sysdba"
SQL*Plus: Release 9.2.0.4.0 - Production on Wed Jan 4 10:48:17 2012
Copyright (c) 1982, 2002, Oracle Corporation.  All rights reserved.
Connected to:
Oracle9i Enterprise Edition Release 9.2.0.4.0 - Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.4.0 - Production
SQL> show parameter undo;
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
undo_management                      string      AUTO
undo_retention                       integer     10800
undo_suppress_errors                 boolean     FALSE
undo_tablespace                      string      UNDOTBS1
SQL> select sum(maxbytes)/1024/1024/1024,
   2 SUM(USER_BYTES)/1024/1024/1024 FROM dba_data_files where tablespace_NAME='UNDOTBS1';
SUM(MAXBYTES)/1024/1024/1024 SUM(USER_BYTES)/1024/1024/1024
---------------------------- ------------------------------
                  61.9999847                     32.6834106
SQL> SELECT DISTINCT STATUS "状态",
  2                  COUNT(*) "EXTENT数量",
  3                  SUM(BYTES) / 1024 / 1024 / 1024 "UNDO大小"
  4    FROM DBA_UNDO_EXTENTS
  5   GROUP BY STATUS;
状态      EXTENT数量   UNDO大小
--------- ---------- ----------
ACTIVE             1 .000976563
EXPIRED         2549 31.2333298
UNEXPIRED          3 .000175476

通过undo_retention保留时间为10800秒,而该sql执行了49146秒,在这49146秒钟,TAB_XN_COMPANY表中的数据被修改,而且被修改的undo数据在10800秒后被覆盖导致,导致原查询语句不能获取到scn小于或者等于查询时候的数据块内容(在undo中),所以出现ORA-01555。从这里也可以看出来,在undo空间还剩余的情况下,如果超过了undo_retention限制,undo内容还是有可能被覆盖,而不是使用未使用的undo

三.出现ORA-1555原因
The ORA-1555 errors can happen when a query is unable to access enough undo to build a copy of the data at the time the query started. Committed “versions” of blocks are maintained along with newer uncommitted “versions” of those blocks so that queries can access data as it existed in the database at the time of the query. These are referred to as “consistent read” blocks and are maintained using Oracle undo management.
就是一个查询要访问某个数据块,而这个数据块在这个查询执行过程中修改过,那么该查询需要查询undo中数据块,而undo中该数据块已经不存在,从而出现ORA-1555

四.ORA-1555解决方法
Case 1 – Rollback Overwritten
1.缩短sql运行时间
2.增加undo_retention,这个同时需要考虑undo空间大小
3.减少commit(rollback)次数
4.在一条sql中尽量使数据块访问一次
4.1)Using a full table scan rather than an index lookup
4.2)Introducing a dummy sort so that we retrieve all the data, sort it and then sequentially visit these data blocks.

Case 2 – Rollback Transaction Slot Overwritten
这种问题,主要是延迟块清理导致,一般建议在进行大批量的dml操作后,使用全表(全index)扫描执行一遍,或者收集全部统计信息

恢复被rm意外删除数据文件

一.模拟数据文件删除

[oracle@node1 ~]$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.3.0 Production on Sat Dec 31 22:00:52 2011
Copyright (c) 1982, 2011, Oracle.  All rights reserved.
--数据库版本
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, Oracle Label Security, OLAP, Data Mining,
Oracle Database Vault and Real Application Testing options
--所有数据文件
SQL> select name from v$datafile;
NAME
--------------------------------------------------------------------------------
/opt/oracle/oradata/ora11g/system01.dbf
/opt/oracle/oradata/ora11g/sysaux01.dbf
/opt/oracle/oradata/ora11g/undotbs01.dbf
/opt/oracle/oradata/ora11g/users01.dbf
/opt/oracle/oradata/ora11g/example01.dbf
--删除example01.dbf数据文件
SQL> !rm /opt/oracle/oradata/ora11g/example01.dbf
SQL> !ls -l /opt/oracle/oradata/ora11g/example01.dbf
ls: /opt/oracle/oradata/ora11g/example01.dbf: 没有那个文件或目录
--因为数据文件被删除,创建表失败
SQL> create table t_xifenfei tablespace example
  2  as select * from dba_tables;
as select * from dba_tables
                 *
ERROR at line 2:
ORA-01116: error in opening database file 5
ORA-01110: data file 5: '/opt/oracle/oradata/ora11g/example01.dbf'
ORA-27041: unable to open file
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3

二.找回数据文件

--查找dbw进程spid
[oracle@node1 ~]$ ps -ef|grep dbw|grep -v grep
oracle   18387     1  0 Dec22 ?        00:00:12 ora_dbw0_ora11g
--查看该进程所有文件句柄
[oracle@node1 ~]$ ll /proc/18387/fd
总计 0
lr-x------ 1 oracle oinstall 64 12-31 22:03 0 -> /dev/null
l-wx------ 1 oracle oinstall 64 12-31 22:03 1 -> /dev/null
lr-x------ 1 oracle oinstall 64 12-31 22:03 10 -> /dev/zero
lr-x------ 1 oracle oinstall 64 12-31 22:03 11 -> /dev/zero
lr-x------ 1 oracle oinstall 64 12-31 22:03 12 -> /opt/oracle/product/11.2.0/db_1/rdbms/mesg/orazhs.msb
lrwx------ 1 oracle oinstall 64 12-31 22:03 13 -> /opt/oracle/product/11.2.0/db_1/dbs/hc_ora11g.dat
lr-x------ 1 oracle oinstall 64 12-31 22:03 14 -> /proc/18387/fd
lr-x------ 1 oracle oinstall 64 12-31 22:03 15 -> /dev/zero
lr-x------ 1 oracle oinstall 64 12-31 22:03 16 -> /opt/oracle/product/11.2.0/db_1/rdbms/mesg/oraus.msb
lrwx------ 1 oracle oinstall 64 12-31 22:03 17 -> /opt/oracle/product/11.2.0/db_1/dbs/hc_ora11g.dat
lrwx------ 1 oracle oinstall 64 12-31 22:03 18 -> /opt/oracle/product/11.2.0/db_1/dbs/lkORA11G
lr-x------ 1 oracle oinstall 64 12-31 22:03 19 -> /opt/oracle/product/11.2.0/db_1/rdbms/mesg/orazhs.msb
l-wx------ 1 oracle oinstall 64 12-31 22:03 2 -> /dev/null
lr-x------ 1 oracle oinstall 64 12-31 22:03 20 -> /opt/oracle/product/11.2.0/db_1/rdbms/mesg/oraus.msb
lrwx------ 1 oracle oinstall 64 12-31 22:03 21 -> socket:[441562]
lrwx------ 1 oracle oinstall 64 12-31 22:03 256 -> /opt/oracle/oradata/ora11g/control01.ctl
lrwx------ 1 oracle oinstall 64 12-31 22:03 257 -> /opt/oracle/oradata/ora11g/system01.dbf
lrwx------ 1 oracle oinstall 64 12-31 22:03 258 -> /opt/oracle/oradata/ora11g/sysaux01.dbf
lrwx------ 1 oracle oinstall 64 12-31 22:03 259 -> /opt/oracle/oradata/ora11g/undotbs01.dbf
lrwx------ 1 oracle oinstall 64 12-31 22:03 260 -> /opt/oracle/oradata/ora11g/users01.dbf
lrwx------ 1 oracle oinstall 64 12-31 22:03 261 -> /opt/oracle/oradata/ora11g/example01.dbf (deleted)
lrwx------ 1 oracle oinstall 64 12-31 22:03 262 -> /opt/oracle/oradata/ora11g/temp01.dbf
lr-x------ 1 oracle oinstall 64 12-31 22:03 3 -> /dev/null
lr-x------ 1 oracle oinstall 64 12-31 22:03 4 -> /dev/null
lrwx------ 1 oracle oinstall 64 12-31 22:03 5 -> /opt/oracle/product/11.2.0/db_1/dbs/hc_ora11g.dat
lr-x------ 1 oracle oinstall 64 12-31 22:03 6 -> /dev/null
lr-x------ 1 oracle oinstall 64 12-31 22:03 7 -> /dev/null
lr-x------ 1 oracle oinstall 64 12-31 22:03 8 -> /dev/null
lr-x------ 1 oracle oinstall 64 12-31 22:03 9 -> /dev/null
--通过句柄恢复数据文件[被删除数据文件会被标示(deleted)]
[oracle@node1 ~]$ cp /proc/18387/fd/261 /opt/oracle/oradata/ora11g/example01.dbf
--确认该数据文件已经恢复成功
[oracle@node1 ~]$ ll /opt/oracle/oradata/ora11g/example01.dbf
-rw-r----- 1 oracle oinstall 362422272 12-31 22:05 /opt/oracle/oradata/ora11g/example01.dbf

三.数据文件online

SQL> alter database datafile 5 offline;
Database altered.
SQL> recover datafile 5;
Media recovery complete.
SQL> alter database datafile 5 online;
Database altered.
SQL> create table t_xifenfei tablespace example
  2  as select * from dba_tables;
Table created.

四.补充说明
在意外使用os命令删除掉数据文件时,千万不要慌张重启数据库或者操作系统,可以通过dbwn进程相关句柄找回数据文件

在UltraEdit中定位数据文件内容

一、定位数据块
1.bbed查看数据块

--第一个数据块
BBED> set block 1
        BLOCK#          1
BBED> dump count 16
 File: /opt/oracle/oradata/chf/example01.dbf (0)
 Block: 1                Offsets:    0 to   15           Dba:0x00000000
------------------------------------------------------------------------
 0ba20000 01004001 00000000 00000104
 <32 bytes per line>
BBED> dump count 128
 File: /opt/oracle/oradata/chf/example01.dbf (0)
 Block: 1                Offsets:    0 to  127           Dba:0x00000000
------------------------------------------------------------------------
 0ba20000 01004001 00000000 00000104 387a0000 00000000 0003200b 74684acd
 43484600 00000000 7d4a0000 00320000 00200000 05000300 00000000 00000000
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
 00000000 d28a0900 00000000 1ab5d72c f7b4d72c c5320900 00000000 00000000
 <32 bytes per line>
--第二个数据块
BBED> set block 2
        BLOCK#          2
BBED> dump count 128
 File: /opt/oracle/oradata/chf/example01.dbf (0)
 Block: 2                Offsets:    0 to  127           Dba:0x00000000
------------------------------------------------------------------------
 1da20000 02004001 08dee400 00000204 f6b80000 05000000 08000000 00320000
 09000000 50000000 feff3f00 07000000 00320000 7f000000 fa010000 07dee400
 00000000 00000000 00000000 00000000 b1220000 08000000 00000000 00000000
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
 <32 bytes per line>
--第100个数据块
BBED> set block 100
        BLOCK#          100
BBED> dump count 128
 File: /opt/oracle/oradata/chf/example01.dbf (0)
 Block: 100              Offsets:    0 to  127           Dba:0x00000000
------------------------------------------------------------------------
 06a20000 64004001 343f0900 00000104 a38d0000 02002000 e1c90000 343f0900
 0000e81f 021f3200 61004001 00000000 00000000 00000000 00000000 00000000
 00000000 05000900 2c010000 52008000 c8005900 00800000 a3350900 00000000
 00000000 00008001 00000000 6b00fa00 5e1a6419 00000000 00000000 00000000
 <32 bytes per line>

2.UltraEdit查看数据块

--第0个数据块
--(为了便于和bbed的block一致,称为0比较合适,因为这个块在bbed中看不到)
00000000h: 00 A2 00 00 00 00 C0 FF 00 00 00 00 00 00 00 00 ; .?...?........
00000010h: 66 C8 00 00 00 20 00 00 00 32 00 00 7D 7C 7B 7A ; f?.. ...2..}|{z
00000020h: A0 81 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; 爜..............
00000030h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................
……………………………………………………
00001fa0h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................
00001fb0h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................
00001fc0h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................
00001fd0h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................
00001fe0h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................
00001ff0h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................
--第一个数据块(bbed中的block 1)
00002000h: 0B A2 00 00 01 00 40 01 00 00 00 00 00 00 01 04 ; .?...@.........
00002010h: 38 7A 00 00 00 00 00 00 00 03 20 0B 74 68 4A CD ; 8z........ .thJ?
……………………………………………………………
000021c0h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................
000021d0h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................
--第二个数据块(bbed中的block 2)
00004000h: 1D A2 00 00 02 00 40 01 08 DE E4 00 00 00 02 04 ; .?...@..掬.....
00004010h: F6 B8 00 00 05 00 00 00 08 00 00 00 00 32 00 00 ; 龈...........2..
00004020h: 09 00 00 00 50 00 00 00 FE FF 3F 00 07 00 00 00 ; ....P...??.....
00004030h: 00 32 00 00 7F 00 00 00 FA 01 00 00 07 DE E4 00 ; .2.....?...掬.
00004040h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................
--第100个数据块(bbed中的block 100)
000c8000h: 06 A2 00 00 64 00 40 01 34 3F 09 00 00 00 01 04 ; .?.d.@.4?......
000c8010h: A3 8D 00 00 02 00 20 00 E1 C9 00 00 34 3F 09 00 ; .... .嵘..4?..
000c8020h: 00 00 E8 1F 02 1F 32 00 61 00 40 01 00 00 00 00 ; ..?..2.a.@.....
000c8030h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................
000c8040h: 00 00 00 00 05 00 09 00 2C 01 00 00 52 00 80 00 ; ........,...R.€.
000c8050h: C8 00 59 00 00 80 00 00 A3 35 09 00 00 00 00 00 ; ?Y..€..?......
000c8060h: 00 00 00 00 00 00 80 01 00 00 00 00 6B 00 FA 00 ; ......€.....k.?
000c8070h: 5E 1A 64 19 00 00 00 00 00 00 00 00 00 00 00 00 ; ^.d.............

3.结论:因为2000(16进制)=8192(10进制),而每行又是16,所以UltraEdit定位块的规则是
SELECT to_char(‘2*块数’,’xxxxxxx’) xff FROM dual;(xff*1000)

SQL> SELECT TO_number('2000','xxxxxx') FROM dual;
TO_NUMBER('2000','XXXXXX')
--------------------------
                      8192

二、UltraEdit定位块内内容
1.bbed查看内容

--第一块偏移量为484数据
BBED> set block 1
        BLOCK#          1
BBED> set offset 484
        OFFSET          484
BBED> dump count 128
 File: /opt/oracle/oradata/chf/example01.dbf (0)
 Block: 1                Offsets:  484 to  611           Dba:0x00000000
------------------------------------------------------------------------
 c052e700 00000000 ec82ba2d 01000000 2e060000 39020000 10008984 02000000
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
 <32 bytes per line>
--第一块偏移量为8188数据
BBED> dump offset 8188 count 128
 File: /opt/oracle/oradata/chf/example01.dbf (0)
 Block: 1                Offsets: 8188 to 8191           Dba:0x00000000
------------------------------------------------------------------------
 010b0000
 <32 bytes per line>

2.UltraEdit中内容

--offset 484(kscnbas)
000021e0h: 00 00 00 00 C0 52 E7 00 00 00 00 00 EC 82 BA 2D ; ....繰?....靷?
000021f0h: 01 00 00 00 2E 06 00 00 39 02 00 00 10 00 89 84 ; ........9.....墑
00002200h: 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ; ................
--offset 8188(tailchk)
00003ff0h: 00 00 00 00 00 00 00 00 00 00 00 00 01 0B 00 00 ; ................

3.结论:10进制的便宜量转换为16进制,然后对照你块开始行号+偏移量(16进制),得到对应偏移量开始位置

SQL> SELECT to_char('484','xxxxxxx') FROM dual;
TO_CHAR('484','X
----------------
     1e4
SQL> SELECT to_char('8188','xxxxxxx') FROM dual;
TO_CHAR('8188','
----------------
    1ffc

三.补充说明
1.本实验是拷贝linux下11g数据文件到win上使用UltraEdit操作得出
2.使用UltraEdit,需要熟悉对一些关键数据的偏移量比较清楚

记一次含AND-EQUAL执行计划调优

1.数据库版本

SQL> select * from v$version;
BANNER
----------------------------------------------------------------
Oracle9i Enterprise Edition Release 9.2.0.4.0 - Production
PL/SQL Release 9.2.0.4.0 - Production
CORE    9.2.0.3.0       Production
TNS for Linux: Version 9.2.0.4.0 - Production
NLSRTL Version 9.2.0.4.0 - Production

2.发现含有AND-EQUAL执行计划

SQL> SELECT
  2  COUNT(TABXNPRESM1_.DETAIL_ID) AS X0_0_
  3    FROM QXTDEV1.TAB_XN_PRESMS TABXNPRESM0_, QXTDEV1.TAB_XN_PRESMS_DETAIL TABXNPRESM1_
  4  WHERE  (TABXNPRESM0_.COMPANY_ID = 346240)
  5     AND (TABXNPRESM0_.EMPLOYEE_ID = 0)
  6     AND (TABXNPRESM0_.PRE_TIME >= TO_DATE('2011/12/25', 'yyyy/mm/dd'))
  7     AND (TABXNPRESM0_.PRE_TIME < TO_DATE('2011/12/26', 'yyyy/mm/dd') + 1)
  8     AND ((TABXNPRESM0_.SEND_TYPE = 1) OR (TABXNPRESM0_.SEND_TYPE = 0))
  9     AND ((TABXNPRESM1_.RESULT = -1) OR (TABXNPRESM1_.RESULT = 0))
 10     AND (TABXNPRESM0_.PRE_ID = TABXNPRESM1_.PRE_ID);
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=11 Card=1 Bytes=40)
   1    0   SORT (AGGREGATE)
   2    1     TABLE ACCESS (BY INDEX ROWID) OF 'TAB_XN_PRESMS_DETAIL'(Cost=3 Card=2 Bytes=30)
   3    2       NESTED LOOPS (Cost=11 Card=1 Bytes=40)
   4    3         TABLE ACCESS (BY INDEX ROWID) OF 'TAB_XN_PRESMS' (Cost=8 Card=1 Bytes=25)
   5    4           AND-EQUAL
   6    5             INDEX (RANGE SCAN) OF 'IDX_XN_PRESMS_2' (NON-UNIQUE)
   7    5             INDEX (RANGE SCAN) OF 'IDX_XN_PRESMS_1' (NON-UNIQUE) (Cost=4 Card=638)
   8    3         INDEX (RANGE SCAN) OF 'IDX_XN_PRESMS_DETAIL' (NON-UNIQUE) (Cost=2 Card=2)
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
     272188  consistent gets
          0  physical reads
          0  redo size
        375  bytes sent via SQL*Net to client
        503  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

3.关于AND-EQUAL解释
If the WHERE clauses uses columns of many single-column indexes, then Oracle executes the statement by performing a range scan on each index to retrieve the rowids of the rows that satisfy each condition. Oracle then merges the sets of rowids to obtain a set of rowids of rows that satisfy all conditions. Oracle then accesses the table using these rowids.
Oracle can merge up to five indexes. If the WHERE clause uses columns of more than five single-column indexes, then Oracle merges five of them, accesses the table by rowid, and then tests the resulting rows to determine whether they satisfy the remaining conditions before returning them.
大概的意思是当where条件后面含有多个列的单列索引时(不超过5个),会先得到每个条件的rowid,然后这些rowid进行merges,得到一个rowid的结果集,最后根据这些rowid取表中记录。

4.表/列/索引相关信息

--index和列信息
SQL> SELECT INDEX_NAME, TABLE_NAME, COLUMN_NAME
  2    FROM USER_IND_COLUMNS
  3   WHERE TABLE_NAME IN ('TAB_XN_PRESMS', 'TAB_XN_PRESMS_DETAIL');
INDEX_NAME                     TABLE_NAME                     COLUMN_NAME
------------------------------ ------------------------------ --------------------
IDX_XN_PRESMS_1                TAB_XN_PRESMS                  COMPANY_ID
IDX_XN_PRESMS_2                TAB_XN_PRESMS                  EMPLOYEE_ID
IDX_XN_PRESMS_3                TAB_XN_PRESMS                  PRE_TIME
PK_TAB_XN_PRESMS               TAB_XN_PRESMS                  PRE_ID
IDX_XN_PRESMS_4                TAB_XN_PRESMS                  SEND_TYPE
IDX_XN_PRESMS_DETAIL           TAB_XN_PRESMS_DETAIL           PRE_ID
IDX_XN_PRESMS_DETAIL_2         TAB_XN_PRESMS_DETAIL           SEND_TIME
PK_TAB_XN_PRESMS_DETAIL        TAB_XN_PRESMS_DETAIL           DETAIL_ID
8 rows selected
--index的统计信息
SQL> SELECT TABLE_NAME,
  2         INDEX_NAME,
  3         TO_CHAR(LAST_ANALYZED, 'yyyy-mm-dd hh24:mi:ss')
  4    FROM USER_INDEXES
  5   WHERE TABLE_NAME IN ('TAB_XN_PRESMS', 'TAB_XN_PRESMS_DETAIL');
TABLE_NAME                     INDEX_NAME                     TO_CHAR(LAST_ANALYZED,'YYYY-MM
------------------------------ ------------------------------ ------------------------------
TAB_XN_PRESMS                  IDX_XN_PRESMS_1                2011-12-29 09:25:32
TAB_XN_PRESMS                  IDX_XN_PRESMS_2                2011-12-29 09:25:35
TAB_XN_PRESMS                  IDX_XN_PRESMS_3                2011-12-29 09:25:39
TAB_XN_PRESMS                  IDX_XN_PRESMS_4                2011-12-29 09:25:21
TAB_XN_PRESMS_DETAIL           IDX_XN_PRESMS_DETAIL           2011-12-29 09:20:03
TAB_XN_PRESMS_DETAIL           IDX_XN_PRESMS_DETAIL_2         2011-12-29 09:20:01
TAB_XN_PRESMS                  PK_TAB_XN_PRESMS               2011-12-29 09:25:46
TAB_XN_PRESMS_DETAIL           PK_TAB_XN_PRESMS_DETAIL        2011-12-29 09:20:02
8 rows selected
--列的唯一度情况
SQL>  SELECT TABLE_NAME, COLUMN_NAME, NUM_DISTINCT
  2     FROM USER_TAB_COLS
  3    WHERE (TABLE_NAME, COLUMN_NAME) IN
  4          (SELECT TABLE_NAME, COLUMN_NAME
  5             FROM USER_IND_COLUMNS
  6            WHERE TABLE_NAME IN ('TAB_XN_PRESMS', 'TAB_XN_PRESMS_DETAIL'))
  7            ORDER BY table_name,NUM_DISTINCT DESC;
TABLE_NAME                     COLUMN_NAME                    NUM_DISTINCT
------------------------------ ------------------------------ ------------
TAB_XN_PRESMS                  PRE_ID                              1999270
TAB_XN_PRESMS                  PRE_TIME                            1342594
TAB_XN_PRESMS                  EMPLOYEE_ID                           10676
TAB_XN_PRESMS                  COMPANY_ID                             3136
TAB_XN_PRESMS                  SEND_TYPE                                10
TAB_XN_PRESMS_DETAIL           DETAIL_ID                           3863184
TAB_XN_PRESMS_DETAIL           PRE_ID                              1996872
TAB_XN_PRESMS_DETAIL           SEND_TIME                            437526
8 rows selected

通过这些信息可以得出:
1)统计信息是最新收集过的
2)因为有多个单列index,数据库为了使得cost最小,可能选择了不合适的index(IDX_XN_PRESMS_2[EMPLOYEE_ID]/IDX_XN_PRESMS_1[COMPANY_ID]),使得出现AND-EQUAL,从而逻辑读偏高。对于这个sql,应该使用唯一度比较高的IDX_XN_PRESMS_3[PRE_TIME]
3)也可以通过修改index,实现程序高效,但是考虑到会影响启动程序,在没有十足的把握之前遵守hint优先原则

5.增加hint提示

SQL> SELECT /*+ index(TABXNPRESM0_ IDX_XN_PRESMS_3) */
  2  COUNT(TABXNPRESM1_.DETAIL_ID) AS X0_0_
  3    FROM QXTDEV1.TAB_XN_PRESMS TABXNPRESM0_, QXTDEV1.TAB_XN_PRESMS_DETAIL TABXNPRESM1_
  4  WHERE  (TABXNPRESM0_.COMPANY_ID = 346240)
  5     AND (TABXNPRESM0_.EMPLOYEE_ID = 0)
  6     AND (TABXNPRESM0_.PRE_TIME >= TO_DATE('2011/12/25', 'yyyy/mm/dd'))
  7     AND (TABXNPRESM0_.PRE_TIME < TO_DATE('2011/12/26', 'yyyy/mm/dd') + 1)
  8     AND ((TABXNPRESM0_.SEND_TYPE = 1) OR (TABXNPRESM0_.SEND_TYPE = 0))
  9     AND ((TABXNPRESM1_.RESULT = -1) OR (TABXNPRESM1_.RESULT = 0))
 10     AND (TABXNPRESM0_.PRE_ID = TABXNPRESM1_.PRE_ID);
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=641 Card=1 Bytes=40)
   1    0   SORT (AGGREGATE)
   2    1     TABLE ACCESS (BY INDEX ROWID) OF 'TAB_XN_PRESMS_DETAIL'(Cost=3 Card=2 Bytes=30)
   3    2       NESTED LOOPS (Cost=641 Card=1 Bytes=40)
   4    3         TABLE ACCESS (BY INDEX ROWID) OF 'TAB_XN_PRESMS' (Cost=638 Card=1 Bytes=25)
   5    4           INDEX (RANGE SCAN) OF 'IDX_XN_PRESMS_3' (NON-UNIQUE) (Cost=63 Card=22286)
   6    3         INDEX (RANGE SCAN) OF 'IDX_XN_PRESMS_DETAIL' (NON-UNIQUE) (Cost=2 Card=2)
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       1422  consistent gets
          0  physical reads
          0  redo size
        375  bytes sent via SQL*Net to client
        503  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

从这里可以看出,程序的逻辑读下降了很多(从272188下降到1422),得到了优化效果,提高了程序执行效率
结合上篇:BITMAP CONVERSION FROM ROWIDS,总结一个经验,如果同时使用到了一个表的多个index,效率一般情况下不会太高。同时也给各位提个醒,index并非越多越好,有时候会适得其反(建立index时需要考虑情况)

清理表部分数据方法

最近两天,开发要求,清理掉某些表的部分数据,因为不能停业务,不能采用cast+rman方式实现.只能自己写脚本删除,在这些删除数据中,经过总结,共有三种类型:
1.删除单个表数据
删除a表以dealdate为范围的部分数据

DECLARE
  P_SQL   VARCHAR2(300) :=
  'DELETE FROM a WHERE dealdate<to_date('||''''||
'2010-11-01 00:00:00'||''''||','||''''||'yyyy-mm-dd hh24:mi:ss'||''''||')';
  P_COUNT NUMBER := 1000;
BEGIN
  WHILE 1 = 1 LOOP
    EXECUTE IMMEDIATE P_SQL || ' and rownum <= :nu'
      USING P_COUNT;
    IF SQL%NOTFOUND THEN
      EXIT;
    END IF;
    COMMIT;
  END LOOP;
  COMMIT;
END;

2.删除两个关联表数据
tab_a,tab_b两个表通过共有的PRE_ID列关联,然后按照tab_a.ACCEPT_TIME列为条件删除两个表中数据,tab_b表中数据比tab_a多很多

DECLARE
  CURSOR CUR IS
    SELECT B.ROWID BID, A.ROWID AID
      FROM tab_a A, tab_b B
     WHERE A.PRE_ID = B.PRE_ID
       AND A.ACCEPT_TIME <
           TO_DATE('2010-11-01 00:00:00', 'yyyy-mm-dd hh24:mi:ss')
     ORDER BY B.ROWID;
  V_COUNTER NUMBER;
BEGIN
  V_COUNTER := 0;
  FOR ROW IN CUR LOOP
    DELETE FROM tab_a WHERE ROWID = ROW.AID;
    DELETE FROM tab_b WHERE ROWID = ROW.BID;
    V_COUNTER := V_COUNTER + 1;
    IF (V_COUNTER >= 1000) THEN
      COMMIT;
      V_COUNTER := 0;
    END IF;
  END LOOP;
  COMMIT;
END;

3.删除某个表中重复列
删除tab_a表中的COMPANY_ID/PY_DES/PY_DES_Q/PY_TYPE/RELATE_ID列重复数据

DECLARE
  CURSOR CUR IS
    SELECT A.ROWID AID
      FROM tab_a A
     WHERE ROWID NOT IN (SELECT MAX(B.ROWID)
                           FROM tab_a B
                          WHERE A.COMPANY_ID = B.COMPANY_ID
                            AND A.PY_DES = B.PY_DES
                            AND A.PY_DES_Q = B.PY_DES_Q
                            AND A.PY_TYPE = B.PY_TYPE
                            AND A.RELATE_ID = B.RELATE_ID)
     ORDER BY A.ROWID;
  V_COUNTER NUMBER;
BEGIN
  V_COUNTER := 0;
  FOR ROW IN CUR LOOP
    DELETE FROM tab_a WHERE ROWID = ROW.AID;
    V_COUNTER := V_COUNTER + 1;
    IF (V_COUNTER >= 1000) THEN
      COMMIT;
      V_COUNTER := 0;
    END IF;
  END LOOP;
  COMMIT;
END;