Root-cause analysis of an ORA-19502/ORA-27063 error during an RMAN backup

Contact: phone/WeChat (+86 17813235971), QQ (107644445)

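Author: 惜分飞 © All rights reserved. [No reproduction in any form without the author's consent; the author reserves the right to pursue legal action.]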

ORA-19502/ORA-27063 errors during the RMAN backup

RMAN> 2> 3> 4> 5> 6> 7> 8> 9> 10> 11> 12> 13> 14> 15> 16> 17> 18> 19> 20>
allocated channel: t11
channel t11: sid=824 instance=ncdb1 devtype=DISK
allocated channel: t12
channel t12: sid=838 instance=ncdb1 devtype=DISK
allocated channel: t13
channel t13: sid=809 instance=ncdb1 devtype=DISK
allocated channel: t14
channel t14: sid=886 instance=ncdb1 devtype=DISK
allocated channel: t15
channel t15: sid=620 instance=ncdb1 devtype=DISK
allocated channel: t16
channel t16: sid=599 instance=ncdb1 devtype=DISK
allocated channel: t17
channel t17: sid=482 instance=ncdb1 devtype=DISK
allocated channel: t18
channel t18: sid=506 instance=ncdb1 devtype=DISK
-- Eight channels were allocated in total
channel t12: starting full datafile backupset
channel t12: specifying datafile(s) in backupset
input datafile fno=00008 name=/dev/rnc32g_39
input datafile fno=00016 name=/dev/rnc32g_47
input datafile fno=00024 name=/dev/rnc32g_57
input datafile fno=00032 name=/dev/rnc32g_25
input datafile fno=00040 name=/dev/rnc32g_33
input datafile fno=00048 name=/dev/rnc32g_3
input datafile fno=00056 name=/dev/rnc32g_11
input datafile fno=00064 name=/dev/rnc32g_19
input datafile fno=00072 name=/dev/rnc32g_67
input datafile fno=00080 name=/dev/rnc32g_106
input datafile fno=00088 name=/dev/rnc32g_114
input datafile fno=00096 name=/dev/rnc32g_87
input datafile fno=00104 name=/dev/rnc32g_95
input datafile fno=00112 name=/dev/rnc32g_103
input datafile fno=00120 name=/dev/rnc32g_75
input datafile fno=00003 name=/dev/rnc50_sysaux
input datafile fno=00130 name=/dev/rnc32g_119
channel t12: starting piece 1 at 14-MAY-12
-- Channel t12 starts backing up its datafiles
channel t17: starting full datafile backupset
channel t17: specifying datafile(s) in backupset
input datafile fno=00002 name=/dev/rnc32g_22
input datafile fno=00013 name=/dev/rnc32g_44
input datafile fno=00021 name=/dev/rnc32g_54
input datafile fno=00029 name=/dev/rnc32g_62
input datafile fno=00037 name=/dev/rnc32g_30
input datafile fno=00045 name=/dev/rnc32g_38
input datafile fno=00053 name=/dev/rnc32g_8
input datafile fno=00061 name=/dev/rnc32g_16
input datafile fno=00069 name=/dev/rnc32g_64
input datafile fno=00077 name=/dev/rncundo_33g_4
input datafile fno=00085 name=/dev/rnc32g_111
input datafile fno=00093 name=/dev/rnc32g_84
input datafile fno=00101 name=/dev/rnc32g_92
input datafile fno=00109 name=/dev/rnc32g_100
input datafile fno=00117 name=/dev/rnc32g_72
input datafile fno=00006 name=/dev/rnc50_4g_1
channel t17: starting piece 1 at 14-MAY-12
-- Channel t17 starts backing up its datafiles
channel t15: finished piece 1 at 15-MAY-12
piece handle=/rman/db_mpnb04jl_1_1 tag=TAG20120514T204954 comment=NONE
channel t15: backup set complete, elapsed time: 06:07:59
channel t11: finished piece 1 at 15-MAY-12
piece handle=/rman/db_mlnb04jk_1_1 tag=TAG20120514T204954 comment=NONE
channel t11: backup set complete, elapsed time: 06:17:25
channel t16: finished piece 1 at 15-MAY-12
piece handle=/rman/db_mqnb04jm_1_1 tag=TAG20120514T204954 comment=NONE
channel t16: backup set complete, elapsed time: 06:34:49
channel t14: finished piece 1 at 15-MAY-12
piece handle=/rman/db_monb04jl_1_1 tag=TAG20120514T204954 comment=NONE
channel t14: backup set complete, elapsed time: 06:40:05
channel t18: finished piece 1 at 15-MAY-12
piece handle=/rman/db_msnb04jn_1_1 tag=TAG20120514T204954 comment=NONE
channel t18: backup set complete, elapsed time: 06:43:38
channel t13: finished piece 1 at 15-MAY-12
piece handle=/rman/db_mnnb04jl_1_1 tag=TAG20120514T204954 comment=NONE
channel t13: backup set complete, elapsed time: 07:40:56
-- At this point channels t11/t13/t14/t15/t16/t18 have finished their backup sets; only t12 and t17 are still running.
RMAN-03009: failure of backup command on t12 channel at 05/15/2012 04:39:58
ORA-19502: write error on file "/rman/db_mmnb04jl_1_1", blockno 30481025 (blocksize=8192)
ORA-27063: number of bytes read/written is incorrect
IBM AIX RISC System/6000 Error: 28: No space left on device
Additional information: -1
Additional information: 1048576
ORA-19502: write error on file "/rman/db_mmnb04jl_1_1", blockno 30480897 (blocksize=8192)
ORA-27063: number of bytes read/written is incorrect
IBM AIX RISC System/6000 Error: 28: No space left on device
channel t12 disabled, job failed on it will be run on another channel
-- Channel t12 fails (out of disk space)
channel t11: starting full datafile backupset
channel t11: specifying datafile(s) in backupset
input datafile fno=00008 name=/dev/rnc32g_39
input datafile fno=00016 name=/dev/rnc32g_47
input datafile fno=00024 name=/dev/rnc32g_57
input datafile fno=00032 name=/dev/rnc32g_25
input datafile fno=00040 name=/dev/rnc32g_33
input datafile fno=00048 name=/dev/rnc32g_3
input datafile fno=00056 name=/dev/rnc32g_11
input datafile fno=00064 name=/dev/rnc32g_19
input datafile fno=00072 name=/dev/rnc32g_67
input datafile fno=00080 name=/dev/rnc32g_106
input datafile fno=00088 name=/dev/rnc32g_114
input datafile fno=00096 name=/dev/rnc32g_87
input datafile fno=00104 name=/dev/rnc32g_95
input datafile fno=00112 name=/dev/rnc32g_103
input datafile fno=00120 name=/dev/rnc32g_75
input datafile fno=00003 name=/dev/rnc50_sysaux
input datafile fno=00130 name=/dev/rnc32g_119
channel t11: starting piece 1 at 15-MAY-12
-- After t12 failed, t11 (which had already finished its own backup set) picked up the datafiles that t12 had failed on
RMAN-03009: failure of backup command on t17 channel at 05/15/2012 04:39:58
ORA-19502: write error on file "/rman/db_mrnb04jm_1_1", blockno 30753793 (blocksize=8192)
ORA-27063: number of bytes read/written is incorrect
IBM AIX RISC System/6000 Error: 28: No space left on device
Additional information: -1
Additional information: 1048576
ORA-19502: write error on file "/rman/db_mrnb04jm_1_1", blockno 30753665 (blocksize=8192)
ORA-27063: number of bytes read/written is incorrect
IBM AIX RISC System/6000 Error: 28: No space left on device
channel t17 disabled, job failed on it will be run on another channel
-- Channel t17 also fails for lack of disk space
channel t13: starting full datafile backupset
channel t13: specifying datafile(s) in backupset
input datafile fno=00002 name=/dev/rnc32g_22
input datafile fno=00013 name=/dev/rnc32g_44
input datafile fno=00021 name=/dev/rnc32g_54
input datafile fno=00029 name=/dev/rnc32g_62
input datafile fno=00037 name=/dev/rnc32g_30
input datafile fno=00045 name=/dev/rnc32g_38
input datafile fno=00053 name=/dev/rnc32g_8
input datafile fno=00061 name=/dev/rnc32g_16
input datafile fno=00069 name=/dev/rnc32g_64
input datafile fno=00077 name=/dev/rncundo_33g_4
input datafile fno=00085 name=/dev/rnc32g_111
input datafile fno=00093 name=/dev/rnc32g_84
input datafile fno=00101 name=/dev/rnc32g_92
input datafile fno=00109 name=/dev/rnc32g_100
input datafile fno=00117 name=/dev/rnc32g_72
input datafile fno=00006 name=/dev/rnc50_4g_1
channel t13: starting piece 1 at 15-MAY-12
-- Channel t13 likewise retries the datafiles that t17 failed on
RMAN-03009: failure of backup command on t11 channel at 05/15/2012 04:39:59
ORA-19504: failed to create file "/rman/db_mtnb104u_1_1"
ORA-27044: unable to write the header block of file
IBM AIX RISC System/6000 Error: 28: No space left on device
Additional information: 3
Addition
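-- With no free space left at that moment, channel t11 fails as well.
-- RMAN then terminated abnormally, so the termination record for channel t13 was never written to the log.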

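Reading the RMAN log, the story is easy to follow: the filesystem that holds the RMAN backups ran out of space, and that triggered the whole chain of errors.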
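The failures in a log like this can be pulled out mechanically. The following is a minimal sketch in Python, assuming the log lines have the same shape as the excerpt above; `summarize` and the two regular expressions are hypothetical helpers written for this illustration, not part of RMAN.

```python
import re

# A few lines copied from the RMAN log above.
LOG = """\
RMAN-03009: failure of backup command on t12 channel at 05/15/2012 04:39:58
ORA-19502: write error on file "/rman/db_mmnb04jl_1_1", blockno 30481025 (blocksize=8192)
IBM AIX RISC System/6000 Error: 28: No space left on device
RMAN-03009: failure of backup command on t17 channel at 05/15/2012 04:39:58
ORA-19502: write error on file "/rman/db_mrnb04jm_1_1", blockno 30753793 (blocksize=8192)
IBM AIX RISC System/6000 Error: 28: No space left on device
RMAN-03009: failure of backup command on t11 channel at 05/15/2012 04:39:59
ORA-19504: failed to create file "/rman/db_mtnb104u_1_1"
IBM AIX RISC System/6000 Error: 28: No space left on device
"""

# Assumed line shapes, based only on the excerpt above.
FAILURE = re.compile(r"RMAN-03009: failure of \w+ command on (\w+) channel at (.+)")
OS_ERROR = re.compile(r"Error: (\d+): (.+)")


def summarize(log_text):
    """Return one (channel, time, os_error) tuple per RMAN-03009 failure."""
    results = []
    current = None
    for line in log_text.splitlines():
        m = FAILURE.search(line)
        if m:
            # Start a new failure record; the OS error is filled in later.
            current = [m.group(1), m.group(2).strip(), None]
            results.append(current)
            continue
        m = OS_ERROR.search(line)
        if m and current is not None and current[2] is None:
            # Keep only the first OS error that follows each failure line.
            current[2] = m.group(2).strip()
    return [tuple(r) for r in results]


for channel, when, os_error in summarize(LOG):
    print(channel, when, os_error)
```

All three failures carry the same operating-system error, which is what points at the backup destination rather than at the datafiles.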

Checking free disk space

Filesystem    GB blocks      Free %Used    Iused %Iused Mounted on
/dev/hd4          10.00      9.75    3%     6548     1% /
/dev/hd2          10.00      4.55   55%    84383     8% /usr
/dev/hd9var        5.00      4.04   20%     6290     1% /var
/dev/hd3           5.00      3.87   23%     1551     1% /tmp
/dev/hd1          10.00      9.91    1%      382     1% /home
/proc                 -         -    -         -     -  /proc
/dev/hd10opt       5.00      4.89    3%     3502     1% /opt
/dev/archalv      99.00     82.98   17%       96     1% /archa
/dev/fslv01       40.00     19.49   52%    72324     2% /ora10
/dev/fslv00     1800.00    467.25   75%       10     1% /rman

This is puzzling: /rman still has 467.25 GB free, so why did the backup fail with a no-space error?

Analysis

RMAN-03009: failure of backup command on t12 channel at 05/15/2012 04:39:58
ORA-19502: write error on file "/rman/db_mmnb04jl_1_1", blockno 30481025 (blocksize=8192)
ORA-27063: number of bytes read/written is incorrect
IBM AIX RISC System/6000 Error: 28: No space left on device
Additional information: -1
Additional information: 1048576
ORA-19502: write error on file "/rman/db_mmnb04jl_1_1", blockno 30480897 (blocksize=8192)
ORA-27063: number of bytes read/written is incorrect
IBM AIX RISC System/6000 Error: 28: No space left on device
channel t12 disabled, job failed on it will be run on another channel
RMAN-03009: failure of backup command on t17 channel at 05/15/2012 04:39:58
ORA-19502: write error on file "/rman/db_mrnb04jm_1_1", blockno 30753793 (blocksize=8192)
ORA-27063: number of bytes read/written is incorrect
IBM AIX RISC System/6000 Error: 28: No space left on device
Additional information: -1
Additional information: 1048576
ORA-19502: write error on file "/rman/db_mrnb04jm_1_1", blockno 30753665 (blocksize=8192)
ORA-27063: number of bytes read/written is incorrect
IBM AIX RISC System/6000 Error: 28: No space left on device
channel t17 disabled, job failed on it will be run on another channel

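Both channels ran out of space at 05/15/2012 04:39:58 while writing their backup pieces to disk. They failed when trying to write blocks 30480897 and 30753665 respectively, so at that moment they had written 30480896 and 30753664 blocks. The combined size already written was (30480896+30753664)*8192/1024/1024/1024 = 467.1826171875 GB. In other words, the 467.25 GB shown as free afterwards had been almost entirely occupied by these two pieces (467.1826171875 GB) when the next write failed. To keep the backup consistent, RMAN then automatically deleted the two incomplete pieces. That is why, once the backup had ended, the filesystem showed plenty of free space even though RMAN had reported that it ran out.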
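The arithmetic can be checked with a short sketch. The block numbers and the 8192-byte block size come from the two ORA-19502 messages, and the variable names are only illustrative. Treating "failing block number minus one" as the number of blocks already written follows the reasoning in the text; it is an assumption, not a documented RMAN guarantee.

```python
# Rough check of how much the two failed backup pieces had already written.
BLOCK_SIZE = 8192  # bytes, from "(blocksize=8192)" in the ORA-19502 messages

# Lowest failing block number reported for each channel.
failed_block = {"t12": 30480897, "t17": 30753665}

# Assumption taken from the text: every block before the failing one was written.
blocks_written = {ch: blk - 1 for ch, blk in failed_block.items()}

total_bytes = sum(blocks_written.values()) * BLOCK_SIZE
total_gb = total_bytes / 1024 ** 3

free_gb_after = 467.25  # "Free" column for /rman in the df output

print(f"written by t12 + t17: {total_gb:.10f} GB")
print(f"free space reported afterwards: {free_gb_after} GB")
print(f"difference: {free_gb_after - total_gb:.4f} GB")
```

The two figures differ by well under 0.1 GB, which is consistent with the explanation that the "free" space is what the deleted pieces had occupied.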

ORA-07445 [ACCESS_VIOLATION] [UNABLE_TO_READ] []

ORA-07445 in the alert log
ORA-07445: exception encountered: core dump [PC:0x7FFF65D0] [ACCESS_VIOLATION] [ADDR:0xFFFFFFFF] [PC:0x7FFF65D0] [UNABLE_TO_READ] [] appeared in the alert log and brought the database down.

Mon May 14 14:34:34 2012
Exception [type: ACCESS_VIOLATION, UNABLE_TO_READ] [ADDR:0xFFFFFFFF] [PC:0x7FFF65D0, {empty}]
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_p001_1280.trc:
ORA-07445: exception encountered: core dump [PC:0x7FFF65D0] [ACCESS_VIOLATION]
[ADDR:0xFFFFFFFF] [PC:0x7FFF65D0] [UNABLE_TO_READ] []
Mon May 14 14:34:35 2012
Trace dumping is performing id=[cdmp_20120514143435]
Mon May 14 14:35:10 2012
Exception [type: ACCESS_VIOLATION, UNABLE_TO_READ] [ADDR:0xFFFFFFFF] [PC:0x7FFF65D0, {empty}]
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_smon_1072.trc  (incident=164712):
ORA-07445: exception encountered: core dump [PC:0x7FFF65D0] [ACCESS_VIOLATION]
[ADDR:0xFFFFFFFF] [PC:0x7FFF65D0] [UNABLE_TO_READ] []
ORA-12080: Buffer cache miss for IOQ batching
Incident details in: d:\app\administrator\diag\rdbms\orcl\orcl\incident\incdir_164712\orcl_smon_1072_i164712.trc

Analyzing the trace file

Oracle Database 11g Enterprise Edition Release 11.1.0.6.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Windows Server 2003 Version V5.2 Service Pack 2
CPU                 : 8 - type 586, 4 Physical Cores
Process Affinity    : 0x00000000
Memory (Avail/Total): Ph:4892M/8189M, Ph+PgF:5638M/9795M, VA:925M/4095M
Instance name: orcl
Redo thread mounted by this instance: 1
Oracle process number: 12
Windows thread id: 1072, image: ORACLE.EXE (SMON)
-- From the header above: Windows Server 2003 SP2 with Oracle 11g (11.1.0.6, 32-bit)
Dump continued from file: d:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_smon_1072.trc
ORA-07445: exception encountered: core dump [PC:0x7FFF65D0] [ACCESS_VIOLATION]
[ADDR:0xFFFFFFFF] [PC:0x7FFF65D0] [UNABLE_TO_READ] []
ORA-12080: Buffer cache miss for IOQ batching
========= Dump for incident 164712 (ORA 7445 [PC:0x7FFF65D0]) ========
----- Beginning of Customized Incident Dump(s) -----
Exception [type: ACCESS_VIOLATION, UNABLE_TO_READ] [ADDR:0xFFFFFFFF] [PC:0x7FFF65D0, {empty}]
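-- Based on experience, and given the 32-bit environment, this ORA-07445 [ACCESS_VIOLATION] [UNABLE_TO_READ]
-- is probably caused by the SGA using too much memory, so that Oracle cannot read SGA-related memory.
-- The relevant parameter settings found in the trace follow.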
[0004]: processes=300
[0004]: sessions=335
[0004]: __shared_pool_size=1124073472
[0004]: __large_pool_size=8388608
[0004]: __java_pool_size=16777216
[0004]: __streams_pool_size=251658240
[0004]: streams_pool_size=251658240
[0004]: sga_target=0
[0004]: __sga_target=1887436800
[0004]: memory_target=3145728000
[0004]: memory_max_target=4722786304
[0004]: db_block_size=8192
[0004]: __db_cache_size=478150656
[0004]: __shared_io_pool_size=0
[0004]: compatible=11.1.0.0.0
[0004]: log_buffer=8851456
[0004]: __pga_aggregate_target=780140544
-- __sga_target is 1887436800 bytes = 1.7578125 GB
-- __pga_aggregate_target is 780140544 bytes = 0.7265625 GB
-- Together they exceed 2 GB, the default limit for 32-bit Oracle
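The sizes quoted in these comments can be reproduced with a few lines of Python. This is only a sketch: the two values are copied from the parameter dump above, the 2 GB figure is the default 32-bit limit mentioned in the text, and the variable names are illustrative.

```python
# Values copied from the parameter dump in the trace file (bytes).
sga_target = 1887436800           # __sga_target
pga_aggregate_target = 780140544  # __pga_aggregate_target

GB = 1024 ** 3
LIMIT_32BIT = 2 * GB  # default limit for a 32-bit process, as stated in the text

total = sga_target + pga_aggregate_target

print(f"SGA  : {sga_target / GB:.7f} GB")
print(f"PGA  : {pga_aggregate_target / GB:.7f} GB")
print(f"total: {total / GB:.6f} GB, over the 2 GB limit: {total > LIMIT_32BIT}")
```

The total comes to roughly 2.48 GB, so the configured memory cannot fit in a 2 GB address space.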

A search on MOS turned up note [1341681.1]
Cause of the error

This is a resource issue (memory in particular). 32-bit windows systems,
are limited to 2GB of addressable memory so if you are on this platform
it's likely you are simply exceeding the capabilities of the 32bit operating system.

Recommended fix

First recommendation :
If you have not already done so, add the /3GB switch to your boot.ini file and reboot the server. The
boot.ini will be located in the root directory on the drive where windows is installed. The switch, /3GB,
is placed at the end of the line that executes the WinNT loading process.
This will allow applications such as oracle access to 3Gb or memory instead of 2Gb.
Example:
[operating systems] multi(0)disk(0)rdisk(0)partition(2)\WINNT="Windows NT Server Version 4.00" /3GB
 Second recommendation :
You do not want to increase memory target. If anything, this should be decreased.
You are limited to under 2GB of addressable memory on 32bit windows (the limit is actually about 1.85GB).
This is for both SGA and PGA memory for all instances; you have to reduce the SGA size for the instance.
The recommendation is to reduce sga_target, memory_target, and memory_max_target.

ORA-01595/ORA-01594 on a 9i database

ORA-01595/ORA-01594 errors found in the alert log

Sat May 12 21:54:17 2012
Errors in file /oracle/app/admin/prmdb/bdump/prmdb2_smon_483522.trc:
ORA-01595: error freeing extent (2) of rollback segment (19))
ORA-01594: attempt to wrap into rollback segment (19) extent (2) which is being freed

Analyzing the trace file

Oracle9i Enterprise Edition Release 9.2.0.8.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP and Oracle Data Mining options
JServer Release 9.2.0.8.0 - Production
ORACLE_HOME = /oracle/app/product/9.2.0
System name:    AIX
Node name:      prmsvr2
Release:        3
Version:        5
Machine:        0008585FD600
Instance name: prmdb2
Redo thread mounted by this instance: 2
Oracle process number: 14
Unix process pid: 483522, image: oracle@prmsvr2 (SMON)
*** 2011-05-03 15:28:47.858
*** SESSION ID:(17.1) 2011-05-03 15:28:47.843
*** 2011-05-03 15:28:47.858
SMON: Parallel transaction recovery tried
*** 2011-07-11 17:13:52.028
SMON: Restarting fast_start parallel rollback
*** 2011-07-11 17:28:39.705
SMON: Parallel transaction recovery tried
*** 2012-05-12 21:54:17.246   --time of the current problem
SMON: following errors trapped and ignored:
ORA-01595: error freeing extent (2) of rollback segment (19))
ORA-01594: attempt to wrap into rollback segment (19) extent (2) which is being freed
-- The trace file gives no further useful information about this error

Related MOS note [280151.1]
Cause of the error

This is a known problem and there is an Internal Bug:2181139 for this Issue.
The error is signaled because smon is shrinking a rollback segment and this fails
because we need an extent to store some rollback information. This is a failure message
for the shrinking. Subsequently smon would succeed in doing that.
-- That is, SMON fails while shrinking a rollback segment because it needs an extent to store rollback information; a later attempt by SMON succeeds

Recommended fix

Ignore these error messages.
Normally adding more undo space should solve the problem,
but if space is not correcting the problem, please file an SR for assistance.
This error message logging is fixed in 10g.
-- Ignore the error, or upgrade to 10g

ORA-07445[dbgrmqmqpk_query_pick_key()+0f88]

The alert log shows the following ORA-07445 [dbgrmqmqpk_query_pick_key()+0f88] error

Exception [type: SIGSEGV, Address not mapped to object] [ADDR:0xB38F0000000049]
[PC:0x100213C08, dbgrmqmqpk_query_pick_key()+0f88]
Errors in file /oracle/diag/rdbms/sgerp5/sgerp5/trace/sgerp5_m000_7602504.trc  (incident=579300):
ORA-07445: exception encountered: core dump [dbgrmqmqpk_query_pick_key()+0f88] [SIGSEGV] [ADDR:0xB38F0000000049]
[PC:0x100213C08] [Address not mapped to object] []
Incident details in: /oracle/diag/rdbms/sgerp5/sgerp5/incident/incdir_579300/sgerp5_m000_7602504_i579300.trc

Excerpt from the trace file

Dump file /oracle/diag/rdbms/sgerp5/sgerp5/incident/incdir_579300/sgerp5_m000_7602504_i579300.trc
Oracle Database 11g Enterprise Edition Release 11.1.0.6.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
ORACLE_HOME = /oracle/product/11.1.0/db_1
System name:	AIX
Node name:	sgerp5
Release:	1
Version:	6
Machine:	00C8F0564C00
Instance name: sgerp5
Redo thread mounted by this instance: 1
Oracle process number: 138
Unix process pid: 7602504, image: oracle@sgerp5 (m000)
-- The failing process is m000, a slave of MMON, the background process that collects AWR statistics
*** 2012-05-11 03:52:35.200
*** SESSION ID:(752.5029) 2012-05-11 03:52:35.200
*** CLIENT ID:() 2012-05-11 03:52:35.200
*** SERVICE NAME:(SYS$BACKGROUND) 2012-05-11 03:52:35.200
*** MODULE NAME:(MMON_SLAVE) 2012-05-11 03:52:35.200
*** ACTION NAME:(Auto-Purge Slave Action) 2012-05-11 03:52:35.200
Dump continued from file: /oracle/diag/rdbms/sgerp5/sgerp5/trace/sgerp5_m000_7602504.trc
ORA-07445: exception encountered: core dump [dbgrmqmqpk_query_pick_key()+0f88] [SIGSEGV] [ADDR:0xB38F0000000049]
[PC:0x100213C08] [Address not mapped to object] []
----- Incident Context Dump -----
Address: 0x1104bdb68
Incident ID: 579300
Problem Key: ORA 7445 [dbgrmqmqpk_query_pick_key()+0f88]
Error: ORA-7445 [dbgrmqmqpk_query_pick_key()+0f88] [SIGSEGV] [ADDR:0xB38F0000000049] [PC:0x100213C08]
[Address not mapped to object] [] [] []
[00]: dbgexExplicitEndInc [diag_dde]
[01]: dbgeEndDDEInvocationImpl [diag_dde]
[02]: dbgeEndDDEInvocation [diag_dde]
[03]: ssexhd []
[04]: 47dc []<-- Signaling
[05]: dbgrmqmfs_fetch_setup [ams_comp]
[06]: dbgrmqmf_fetch_real [ams_comp]
[07]: dbgrmqmf_fetch [ams_comp]
[08]: dbgrip_fetch_record [ami_comp]
[09]: dbgrip_relation_iterator [ami_comp]
[10]: dbgripricm_rltniter_wcbf_mt [ami_comp]
[11]: dbgripdrm_dmldrv_mt [ami_comp]
[12]: dbghmm_delete_info_records []
[13]: dbghmo_purge_hm_schema []
[14]: dbgrupipscb_hm_pgsvc_cbf [diag_adr]
[15]: dbgruppm_purge_main [diag_adr]
[16]: dbkrapg_auto_purge [rdbms_adr]
[17]: kewraps_auto_purge_slave []
[18]: kebm_slave_main []
[19]: ksvrdp [ksv_trace]
[20]: opirip []
[21]: opidrv []
[22]: sou2o []
[23]: opimai_real []
[24]: main []
[25]: __start []
MD [00]: 'SID'='752.5029' (0x3)
MD [01]: 'ProcId'='138.46' (0x3)
MD [02]: 'PQ'='(0, 1336679546)' (0x7)
MD [03]: 'Client ProcId'='oracle@sgerp5.7602504_1' (0x0)

MOS notes on this issue
Cause

This is due to Bug 9390347 fixed in 12.1 & 11.2.0.2, where a core dump can occur
in module dbgrmqmqpk_query_pick_key() whilst purging HM contents from ADR.

Solution

- Either install our 11.2.0.2 patchset
- Or download and apply Patch 9390347 if available for your version/platform.
- On Windows, you can also install Bundle Patch 11.1.0.7.31 or above.
There is no workaround to this error, however the error is not serious and does not cause any harm to your database.

ORA-00001 Unique Constraint SYS.I_JOB_JOB Violated

An impdp import fails with ORA-00001 Unique Constraint SYS.I_JOB_JOB Violated

ORA-39083: Object type JOB failed to create with error:
ORA-00001: unique constraint (SYS.I_JOB_JOB) violated
Failing sql is:
 BEGIN DBMS_JOB.ISUBMIT( JOB=> 63, NEXT_DATE=> TO_DATE('2012-04-27 00:00:00',
'YYYY-MM-DD:HH24:MI:SS'), INTERVAL=> 'TRUNC(SYSDATE+1)', WHAT=> 'GBEAS1.UPDATE_EMP_INFO;',
NO_PARSE=> TRUE); END;
Job "GBEAS3"."SYS_IMPORT_FULL_01" completed with 8 error(s) at 16:05:58

Cause (job 63 already exists in the target database)

select job, what from   dba_jobs where job=63;
JOB     WHAT
-----   --------
63      proc_xifenfei

Note: if the job is currently running, you may also need to query DBA_JOBS_RUNNING.

Solution

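1. Create the job manually, specifying a job number that is not yet in use
declare
  m_job number;
begin
  -- pick a number above every existing job (1 if dba_jobs is empty)
  select nvl(max(job), 0) + 1
  into   m_job
  from   dba_jobs;
  -- same call as the failing SQL, but with the free job number
  dbms_job.isubmit(job       => m_job,
                   next_date => to_date('2012-04-27 00:00:00', 'YYYY-MM-DD:HH24:MI:SS'),
                   interval  => 'TRUNC(SYSDATE+1)',
                   what      => 'GBEAS1.UPDATE_EMP_INFO;',
                   no_parse  => true);
  commit;  -- dbms_job changes only take effect after commit
end;
/
2. Remove the job that already exists
exec dbms_job.remove(63);
commit;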

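This typically happens when an expdp export includes jobs (for example a full or schema-mode export) and the dump is then imported into a target database where the same job number already exists.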
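When several jobs are imported, the same "pick an unused number" idea can be planned up front. The sketch below is plain Python with no database access; `plan_job_numbers` is a hypothetical helper, and the job lists in the example are made up for illustration (in practice the existing numbers would come from dba_jobs in the target).

```python
def plan_job_numbers(existing, imported):
    """Map each imported job number to a number that is free in the target.

    existing -- job numbers already present in the target database
    imported -- job numbers contained in the dump file
    """
    used = set(existing)
    mapping = {}
    next_free = max(used, default=0) + 1
    for job in sorted(imported):
        if job in used:
            # Collision: move the job above the current maximum.
            mapping[job] = next_free
            used.add(next_free)
            next_free += 1
        else:
            # No collision: the job keeps its original number.
            mapping[job] = job
            used.add(job)
            next_free = max(next_free, job + 1)
    return mapping


# Hypothetical example: the target already has jobs 21, 63 and 64.
print(plan_job_numbers(existing=[21, 63, 64], imported=[10, 63]))
```

Only the colliding job is renumbered; jobs whose numbers are free in the target are left alone.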

OER 7451 in Load Indicator : Error Code = OSD-04500: illegal option specified

Errors in the alert log
OER 7451 in Load Indicator : Error Code = OSD-04500: illegal option specified

Sun Apr 22 11:15:51 2012
OER 7451 in Load Indicator : Error Code = OSD-04500: illegal option specified
O/S-Error: (OS 1) Incorrect function. !
OER 7451 in Load Indicator : Error Code = OSD-04500: illegal option specified
O/S-Error: (OS 1) Incorrect function. !
Sun Apr 22 11:16:01 2012
OER 7451 in Load Indicator : Error Code = OSD-04500: illegal option specified
O/S-Error: (OS 1) Incorrect function. !
OER 7451 in Load Indicator : Error Code = OSD-04500: illegal option specified
O/S-Error: (OS 1) Incorrect function. !
Sun Apr 22 11:16:11 2012
OER 7451 in Load Indicator : Error Code = OSD-04500: illegal option specified
O/S-Error: (OS 1) Incorrect function. !

Error message description

07451, 00000, "slskstat: unable to obtain load information."
// *Cause:  kstat library returned an error. Possible OS failure
// *Action: Check result code in sercose[0] for more information.

Database version

SQL> select * from v$version;
BANNER
------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - Production  <<== 32-bit database
PL/SQL Release 11.2.0.3.0 - Production
CORE    11.2.0.3.0      Production
TNS for 32-bit Windows: Version 11.2.0.3.0 - Production
NLSRTL Version 11.2.0.3.0 - Production

Operating system information

C:\Users\XIFENFEI>systeminfo
Host Name:                 XIFENFEI-PC
OS Name:                   Microsoft Windows 7 Ultimate
OS Version:                6.1.7601 Service Pack 1 Build 7601
OS Manufacturer:           Microsoft Corporation
OS Configuration:          Standalone Workstation
OS Build Type:             Multiprocessor Free
Registered Owner:          XIFENFEI
Registered Organization:   Microsoft
Product ID:                00426-068-8452196-86428
Original Install Date:     2012/2/28, 20:37:08
System Boot Time:          2012/4/22, 9:16:07
System Manufacturer:       Dell Inc.
System Model:              Inspiron N4050
System Type:               x64-based PC       <<== the operating system is 64-bit Windows 7
Processor(s):              1 Processor(s) Installed.
                           [01]: Intel64 Family 6 Model 42 Stepping 7 GenuineIntel ~2300 Mhz
BIOS Version:              Dell Inc. A06, 2011/11/14
Windows Directory:         C:\Windows
System Directory:          C:\Windows\system32
Boot Device:               \Device\HarddiskVolume1
System Locale:             zh-cn;Chinese (China)
Input Locale:              zh-cn;Chinese (China)
Time Zone:                 (UTC+08:00) Beijing, Chongqing, Hong Kong, Urumqi
Total Physical Memory:     8,100 MB
Available Physical Memory: 5,196 MB
Virtual Memory: Max Size:  9,122 MB
Virtual Memory: Available: 5,315 MB
Virtual Memory: In Use:    3,807 MB
Page File Location(s):     D:\pagefile.sys
Domain:                    WORKGROUP
Logon Server:              \\XIFENFEI-PC

Cause

Installed 32-bit Oracle database software on a 64-bit MS Windows OS which is not supported.
Note: For the Database software, you can ONLY install the x64 version on MS Windows (x64).
          You can NOT install the 32-bit version Database software on MS Windows (x64).

Solution

Install 32-bit Oracle database software only on 32-bit MS Windows OS.
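The mismatch described above can be spotted from the two outputs already shown. The sketch below is only a rough illustration: `oracle_bits` and `os_bits` are hypothetical helpers, and the string tests are assumptions based on the banners in this document (a "64bit"/"64-bit" marker on 64-bit builds, "x64" in the systeminfo System Type value). Other platforms, such as Itanium, are not handled.

```python
def oracle_bits(banner_lines):
    """Guess the Oracle word size from v$version banner lines.

    Assumption: 64-bit builds carry a "64bit" or "64-bit" marker;
    anything without one is treated as 32-bit.
    """
    text = " ".join(banner_lines)
    return 64 if ("64bit" in text or "64-bit" in text) else 32


def os_bits(system_type):
    """Guess the Windows word size from the systeminfo System Type value."""
    return 64 if "x64" in system_type.lower() else 32


# Values copied from the v$version and systeminfo output above.
banner = [
    "Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - Production",
    "TNS for 32-bit Windows: Version 11.2.0.3.0 - Production",
]
system_type = "x64-based PC"

mismatch = oracle_bits(banner) != os_bits(system_type)
print(f"Oracle: {oracle_bits(banner)}-bit, OS: {os_bits(system_type)}-bit, mismatch: {mismatch}")
```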

ORA-00600[KSSADP1]

A database check turned up ORA-00600 [KSSADP1] errors

Thu Apr 19 21:16:45 2012
Errors in file /oracle9/app/admin/crm/udump/crm1_ora_442896.trc:
ORA-00600: internal error code, arguments: [KSSADP1], [], [], [], [], [], [], []
Thu Apr 19 21:16:45 2012
Errors in file /oracle9/app/admin/crm/udump/crm1_ora_442896.trc:
ORA-00600: internal error code, arguments: [KSSADP1], [], [], [], [], [], [], []
Thu Apr 19 21:16:45 2012
Trace dumping is performing id=[cdmp_20120419211645]
Thu Apr 19 21:16:46 2012
Errors in file /oracle9/app/admin/crm/udump/crm1_ora_442896.trc:
ORA-00600: internal error code, arguments: [KSSADP1], [], [], [], [], [], [], []
Thu Apr 19 21:16:47 2012
Errors in file /oracle9/app/admin/crm/udump/crm1_ora_442896.trc:
ORA-00600: internal error code, arguments: [KSSADP1], [], [], [], [], [], [], []

Analyzing crm1_ora_442896.trc

Oracle9i Enterprise Edition Release 9.2.0.8.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP and Oracle Data Mining options
JServer Release 9.2.0.8.0 - Production
ORACLE_HOME = /oracle9/app/product/9.2.0
System name:    AIX
Node name:      zwq_crm1
Release:        3
Version:        5
Machine:        00C420B44C00
Instance name: crm1
Redo thread mounted by this instance: 1
Oracle process number: 2354
Unix process pid: 442896, image: oracle@zwq_crm1 (TNS V1-V3)
*** SESSION ID:(927.39278) 2012-04-19 21:16:45.317
*** 2012-04-19 21:16:45.317
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [KSSADP1], [], [], [], [], [], [], []
----- Call Stack Trace -----
calling              call     entry                argument values in hex
location             type     point                (? means dubious value)
-------------------- -------- -------------------- ----------------------------
ksedmp+0148          bl       ksedst               1029746FC ?
ksfdmp+0018          bl       01FD4014
kgerinv+00e8         bl       _ptrgl
kgesinv+0020         bl       kgerinv              9001000A02B56F8 ?
                                                   9001000A02B9450 ?
                                                   FFFFFFFFFFF8430 ? 000000458 ?
                                                   900000000CBAFA4 ?
ksesin+005c          bl       kgesinv              FFFFFFFFFFF88E0 ? 1101FAF78 ?
                                                   900000000C0ECC0 ? 000010000 ?
                                                   000000002 ?
kssadpm_stage+00c4   bl       ksesin               102973C84 ? 000000000 ?
                                                   00000001E ? 000000000 ?
                                                   000000069 ? 00000000C ?
                                                   000000000 ? 000000000 ?
ksqgel+0138          bl       kssadpm_stage        000000000 ? 000000000 ?
                                                   000000000 ? 000000000 ?
kcftis+003c          bl       ksqgel               000000000 ? 4029C61E0 ?
                                                   000000002 ? 0FFFFC16C ?
                                                   102A7977C ? 000000000 ?
                                                   000000003 ? 002A36408 ?
kcfhis+001c          bl       kcftis
krbbcc+0238          bl       kcfhis               11043B590 ?
krbpgc+001c          bl       krbbcc
ksmupg+0074          bl       _ptrgl
ksuded+00b8          bl       ksmupg               102924988 ? 000000020 ?
ksupucg+10ec         bl       ksuded               700000C376F5740 ? 000000000 ?
                                                   000000000 ?
opiodr+0474          bl       ksupucg              100000001 ?
ttcpip+0cc4          bl       _ptrgl
opitsk+0d60          bl       ttcpip               11000CF90 ?
                                                   442442216B736800 ?
                                                   FFFFFFFFFFFBF00 ? 1102E04BC ?
                                                   1102D7D20 ? 0000006A0 ?
                                                   1102D83C0 ? 0000006A0 ?
opiino+0758          bl       opitsk               000000000 ? 000000000 ?
opiodr+08cc          bl       _ptrgl
opidrv+032c          bl       opiodr               3C00000018 ? 4101FAF78 ?
                                                   FFFFFFFFFFFF840 ? 0A000F350 ?
sou2o+0028           bl       opidrv               3C0C000000 ? 4A00E8B50 ?
                                                   FFFFFFFFFFFF840 ?
main+0138            bl       01FD3A28
__start+0098         bl       main                 000000000 ? 000000000 ?
--------------------- Binary Stack Dump ---------------------
Cursor Dump:
----------------------------------------
Cursor 1 (110360418): CURROW  curiob: 110369b78
 curflg: 46 curpar: 0 curusr: 0 curses 700000c376f5740
 cursor name: select nvl(max(cpmid),0) from x$kcccp                                        where cpsta = 2
 child pin: 0, child lock: 700000d9b9c5bb8, parent lock: 700000d088e0fa0
 xscflg: 1100024, parent handle: 70000031d588d88, xscfl2: 4040401
 bhp size: 160/600
----------------------------------------
Cursor 2 (110360468): CURBOUND  curiob: 1103656f0
 curflg: c7 curpar: 0 curusr: 0 curses 700000c376f5740
 cursor name: SELECT SUBSTR(VERSION,1,INSTR(VERSION,'.') - 1 )   FROM V$INSTANCE
 child pin: 0, child lock: 700000d21e60930, parent lock: 700000327837ce0
 xscflg: 141024, parent handle: 700000304e2f020, xscfl2: 4000401
 bhp size: 160/600
----------------------------------------
Cursor 3 (1103604b8): CURBOUND  curiob: 1103b6aa8
 curflg: c7 curpar: 0 curusr: 0 curses 700000c376f5740
 cursor name: SELECT SUBSTR(VERSION,1 + INSTR(VERSION,'.',1,1) ,INSTR(VERSION,'.',1,2) -
 INSTR(VERSION,'.',1,1)  - 1 )   FROM V$INSTANCE
 child pin: 0, child lock: 700000d5e382ee8, parent lock: 700000c93581d40
 xscflg: 141024, parent handle: 700000d73daa1c0, xscfl2: 4000401
 bhp size: 160/600
----------------------------------------
Cursor 4 (110360508): CURBOUND  curiob: 1103b66b8
 curflg: c7 curpar: 0 curusr: 0 curses 700000c376f5740
 cursor name: SELECT SUBSTR(VERSION,1 + INSTR(VERSION,'.',1,2) ,INSTR(VERSION,'.',1,3) -
 INSTR(VERSION,'.',1,2)  - 1 )   FROM V$INSTANCE
 child pin: 0, child lock: 700000d16de7978, parent lock: 700000c44059d30
 xscflg: 141024, parent handle: 700000259c4a700, xscfl2: 4000401
 bhp size: 160/600
----------------------------------------
Cursor 5 (110360558): CURBOUND  curiob: 1103b3868
 curflg: 46 curpar: 0 curusr: 0 curses 700000c376f5740
 cursor name: SELECT SYSDATE   FROM SYS.DUAL
 child pin: 0, child lock: 700000d589cea48, parent lock: 70000026b311fb0
 xscflg: 100024, parent handle: 700000d2eaee328, xscfl2: 4600409
 bhp size: 280/632
----------------------------------------
Cursor 6 (1103605a8): CURBOUND  curiob: 1103b3408
 curflg: 46 curpar: 0 curusr: 0 curses 700000c376f5740
 cursor name: SELECT TO_CHAR(SYSDATE,'YYYY','NLS_CALENDAR=Gregorian'),TO_CHAR(SYSDATE,'MM','NLS_CALENDAR=Gregorian'),
TO_CHAR(SYSDATE,'DD','NLS_CALENDAR=Gregorian') FROM X$DUAL
 child pin: 0, child lock: 70000033f1753c8, parent lock: 700000db8c6dd18
 xscflg: 100024, parent handle: 700000cbc6ad8b0, xscfl2: 4600409
 bhp size: 160/600
End of cursor dump
ksedmp: no current context area
----- Dump of the Fixed PGA -----

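The relevant document is Note:262996.1. According to the analysis, the error is caused by a bug in SGA management in this database version; it neither corrupts data nor affects performance.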

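Recommendations
1. Oracle no longer provides patches for this version; upgrading to a later release is recommended (an SR states that the problem is fixed in 10.2).
2. Since the error neither corrupts data nor affects performance, it can be ignored.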

ORA-01075: you are currently logged on

Errors in the alert log after datafiles were deleted with rm

Mon Apr 16 21:36:59 2012
Errors in file /home/oracle/oracle/admin/XGS/bdump/xgs_j000_1349.trc:
ORA-00603: ORACLE server session terminated by fatal error
ORA-01116: error in opening database file 3
ORA-01110: data file 3: '/home/oracle/oracle/oradata/XGS/sysaux01.dbf'
ORA-27041: unable to open file
Linux Error: 2: No such file or directory
Additional information: 3
ORA-01116: error in opening database file 3
ORA-01110: data file 3: '/home/oracle/oracle/oradata/XGS/sysaux01.dbf'
ORA-27041: unable to open file
Linux Error: 2: No such file or directory
Additional information: 3
ORA-01116: error in opening database file 6
ORA-01110: data file 6: '/home/oracle/oracle/oradata/XGS/undotbs02.dbf'
ORA-27041: unable to open file
Linux Error: 2: No such file or directory
Additional information: 3

The database processes are still running

oracle     779     1  0 21:21 ?        00:00:01 ora_pmon_XGS
oracle     781     1  0 21:21 ?        00:00:10 ora_psp0_XGS
oracle     783     1  0 21:21 ?        00:00:00 ora_mman_XGS
oracle     785     1  0 21:21 ?        00:00:00 ora_dbw0_XGS
oracle     787     1  0 21:21 ?        00:00:00 ora_lgwr_XGS
oracle     789     1  0 21:21 ?        00:00:00 ora_ckpt_XGS
oracle     791     1  0 21:21 ?        00:00:00 ora_smon_XGS
oracle     793     1  0 21:21 ?        00:00:00 ora_reco_XGS
oracle     795     1  0 21:21 ?        00:00:00 ora_cjq0_XGS
oracle     797     1  0 21:21 ?        00:00:01 ora_mmon_XGS
oracle     799     1  0 21:21 ?        00:00:00 ora_mmnl_XGS
oracle     801     1  0 21:21 ?        00:00:00 ora_d000_XGS
oracle     803     1  0 21:21 ?        00:00:00 ora_s000_XGS

Attempt to log in to the database

[oracle@dbtest ~]$ sqlplus / as sysdba
SQL*Plus: Release 10.2.0.1.0 - Production on Mon Apr 16 21:40:06 2012
Copyright (c) 1982, 2005, Oracle.  All rights reserved.
ERROR:
ORA-01075: you are currently logged on
Enter user-name: sys
Enter password:
ERROR:
ORA-00604: error occurred at recursive SQL level 2
ORA-01116: error in opening database file 1
ORA-01110: data file 1: '/home/oracle/oracle/oradata/XGS/system01.dbf'
ORA-27041: unable to open file
Linux Error: 2: No such file or directory
Additional information: 3
ORA-00604: error occurred at recursive SQL level 1
ORA-01116: error in opening database file 6
ORA-01110: data file 6: '/home/oracle/oracle/oradata/XGS/undotbs02.dbf'
ORA-27041: unable to open file
Linux Error: 2: No such file or directory
Additional information: 3

Cause

Internal triggers are trying to fire but one or more datafiles for the SYSAUX tablespace is offline,
this is preventing the database from allowing new connections.
NOTE: At this point, you cannot connect to verify the status in V$DATAFILE,
but you may find an indication of the offline datafile(s) in the alert.log file.
For example:
In one case, a media problem occurred which made disks unavailable.
This caused several files to be taken offline automatically including a SYSAUX datafile.
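Since no new connection can query V$DATAFILE at this point, the affected files have to be read out of the alert log. A minimal Python sketch of that step follows; the helper name affected_datafiles and the abridged excerpt are illustrative assumptions, and only the ORA-01110 line format is taken from the log shown earlier.

```python
import re

# Abridged copy of the alert log excerpt shown earlier in this post.
ALERT_EXCERPT = """\
ORA-01116: error in opening database file 3
ORA-01110: data file 3: '/home/oracle/oracle/oradata/XGS/sysaux01.dbf'
ORA-27041: unable to open file
Linux Error: 2: No such file or directory
ORA-01116: error in opening database file 6
ORA-01110: data file 6: '/home/oracle/oracle/oradata/XGS/undotbs02.dbf'
ORA-27041: unable to open file
"""

# ORA-01110 carries both the file number and the full path.
ORA_01110 = re.compile(r"ORA-01110: data file (\d+): '([^']+)'")


def affected_datafiles(alert_text):
    """Return {file#: path} for every ORA-01110 line found in the text."""
    return {int(num): path for num, path in ORA_01110.findall(alert_text)}


if __name__ == "__main__":
    for file_no, path in sorted(affected_datafiles(ALERT_EXCERPT).items()):
        print(file_no, path)
```

Repeated errors for the same file collapse into one entry, so the result is a short list of files to check on disk before deciding how to recover.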

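Solution
Kill the instance processes, restart the database to the mount state, and then, depending on the situation, recover the database or bring the affected files back online.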

Analysis of an ORA-00600 [kdsgrp1] error

Database version

SQL> select * from v$version;
BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bi
PL/SQL Release 10.2.0.4.0 - Production
CORE    10.2.0.4.0      Production
TNS for IBM/AIX RISC System/6000: Version 10.2.0.4.0 - Productio
NLSRTL Version 10.2.0.4.0 - Production

Identify the object that raised the error

--Method 1
*** SESSION ID:(795.16405) 2012-04-05 09:36:11.958
            row 080095ee.26 continuation at
            file# 32 block# 38382 slot 39 not found
**************************************************
KDSTABN_GET: 0 ..... ntab: 1
curSlot: 39 ..... nrows: 19
**************************************************
SQL> SELECT OWNER, SEGMENT_NAME, SEGMENT_TYPE, TABLESPACE_NAME, A.PARTITION_NAME
  2    FROM DBA_EXTENTS A
  3   WHERE FILE_ID = &FILE_ID
  4     AND &BLOCK_ID BETWEEN BLOCK_ID AND BLOCK_ID + BLOCKS - 1;
Enter value for file_id: 32
old   3:  WHERE FILE_ID = &FILE_ID
new   3:  WHERE FILE_ID = 32
Enter value for block_id: 38382
old   4:    AND &BLOCK_ID BETWEEN BLOCK_ID AND BLOCK_ID + BLOCKS - 1
new   4:    AND 38382 BETWEEN BLOCK_ID AND BLOCK_ID + BLOCKS - 1
OWNER
------------------------------
SEGMENT_NAME
--------------------------------------------------------------------------------
SEGMENT_TYPE       TABLESPACE_NAME                PARTITION_NAME
------------------ ------------------------------ ------------------------------
AHV8
TBL_IVR_LOG
TABLE PARTITION    CSS_PARTITION                  IVR_LOG_2012_MONTH04
--Method 2
*** 2012-04-05 09:36:11.965
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [kdsgrp1], [], [], [], [], [], [], []
Current SQL statement for this session:
INSERT INTO TBL_CONTACT_INFO_FAILED_TMP
select * from TBL_IVR_LOG
SO: 70000017f954f50, type: 4, owner: 70000017f65a840, flag: INIT/-/-/0x00
(session) sid: 795 trans: 70000017464a1e8, creator: 70000017f65a840, flag: (40110041) USR/- BSY/-/-/-/-/-
              DID: 0002-0067-000305BD, short-term DID: 0002-0067-000305BE
              txn branch: 0
              oct: 2, prv: 0, sql: 70000015180ee98, psql: 700000180d67550, user: 49/AHV8
service name: SYS$USERS
O/S info: user: oracle10, term: UNKNOWN, ospid: 12976218, machine: zwq_kfdb2
              program: oracle@zwq_kfdb2 (J002)
last wait for 'db file sequential read' blocking sess=0x0 seq=226 wait_time=17071 seconds since wait started=1
                file#=20, block#=95ee, blocks=1
--Method 3
Block header dump:  0x080095ee
 Object id on Block? Y
 seg/obj: 0x11eeb  csc: 0x6f2.848e814  itc: 2  flg: E  typ: 1 - DATA
     brn: 1  bdba: 0x7c09c89 ver: 0x01 opc: 0
     inc: 0  exflg: 0
SQL> select to_number('11eeb','xxxxxxxx') from dual;
TO_NUMBER('11EEB','XXXXXXXX')
-----------------------------
                        73451
SQL> select owner,object_name,subobject_name,object_type from dba_objects where data_object_id='73451';
OWNER
------------------------------
OBJECT_NAME
--------------------------------------------------------------------------------
SUBOBJECT_NAME                 OBJECT_TYPE
------------------------------ -------------------
AHV8
TBL_IVR_LOG
IVR_LOG_2012_MONTH04           TABLE PARTITION
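The three methods agree because the trace values are the same address in different notations. The Python sketch below decodes them; it assumes the common smallfile layout of a relative data block address (upper 10 bits are the relative file number, lower 22 bits are the block number), and the helper name decode_rdba is hypothetical.

```python
def decode_rdba(rdba):
    """Split a 32-bit relative data block address into (file#, block#).

    Assumption: smallfile tablespace layout, where the upper 10 bits hold
    the relative file number and the lower 22 bits hold the block number.
    """
    return rdba >> 22, rdba & 0x3FFFFF


# "Block header dump:  0x080095ee" and "row 080095ee.26" from the trace.
file_no, block_no = decode_rdba(0x080095EE)

# The wait event prints its parameters in hex: file#=20, block#=95ee.
wait_file_no = int("20", 16)
wait_block_no = int("95ee", 16)

# "seg/obj: 0x11eeb" is the data object id in hex.
data_object_id = int("11eeb", 16)

print(file_no, block_no)            # file# and block# used in method 1
print(wait_file_no, wait_block_no)  # same pair, read from the wait event
print(data_object_id)               # value looked up in DBA_OBJECTS
```

Doing the conversion outside the database is handy when only the trace file is available and no session can be opened to run to_number.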

Verify whether the block is really corrupt

SQL> select name from v$datafile where file#=32;
NAME
------------------------------------------------------
/dev/rdb1_data27
[zwq_kfdb2:/home/oraeye]dbv file='/dev/rdb1_data27' blocksize=8192
DBVERIFY: Release 10.2.0.4.0 - Production on Fri Apr 13 15:33:10 2012
Copyright (c) 1982, 2007, Oracle.  All rights reserved.
DBVERIFY - Verification starting : FILE = /dev/rdb1_data27
DBVERIFY - Verification complete
Total Pages Examined         : 1048448
Total Pages Processed (Data) : 947357
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 0
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 4756
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 96335
Total Pages Marked Corrupt   : 0
Total Pages Influx           : 0
Highest block SCN            : 297329920 (1778.297329920)
SQL> select count(*) from AHV8.TBL_IVR_LOG partition(IVR_LOG_2012_MONTH04);
  COUNT(*)
----------
   8798030
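The dbv summary above can also be checked mechanically, which helps when many datafiles have to be verified. The Python sketch below is only an illustration: the function names are assumptions, and the cross-check that processed plus empty pages equals the examined total is something that holds for this particular run, not a documented guarantee.

```python
import re

# Counter lines copied from the dbv run above.
DBV_OUTPUT = """\
Total Pages Examined         : 1048448
Total Pages Processed (Data) : 947357
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 0
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 4756
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 96335
Total Pages Marked Corrupt   : 0
Total Pages Influx           : 0
"""

LINE_RE = re.compile(r"Total Pages (.+?)\s*:\s*(\d+)$")


def parse_dbv(text):
    """Return {counter name: value}, with runs of spaces collapsed."""
    counters = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            name = " ".join(match.group(1).split())
            counters[name] = int(match.group(2))
    return counters


def problem_pages(counters):
    """Sum every 'Failing (...)' counter plus 'Marked Corrupt'."""
    return sum(
        value
        for name, value in counters.items()
        if name.startswith("Failing") or name == "Marked Corrupt"
    )


counters = parse_dbv(DBV_OUTPUT)
accounted = counters["Empty"] + sum(
    value for name, value in counters.items() if name.startswith("Processed")
)
print(problem_pages(counters), accounted == counters["Examined"])
```

A zero from problem_pages matches the conclusion drawn here: the file on disk shows no failing or corrupt pages.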

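Summary: this problem was evidently caused by a corrupt copy of the block in memory. After some time the bad copy aged out of the buffer cache, so the error can no longer be reproduced, and no action was needed. If a corrupt block is in memory and has not yet aged out, you can flush the buffer cache. If the block is corrupt on disk, decide how to handle it based on the actual situation.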

ORA-00600 [729]: analysis and handling

ORA-00600 [729] in the alert log

Fri Apr  6 04:30:04 2012
Errors in file /oracle9/app/admin/crm/udump/crm2_ora_2548236.trc:
ORA-00600: internal error code, arguments: [729], [1067976], [space leak], [], [], [], [], []

a. the first bracketed number [729] is the common argument for space leak problems.
b. the second number [1067976] is the number of bytes leaked by the error.
c. the third argument is always [space leak].

Analyzing the trace file

*** 2012-04-06 04:30:04.656
*** SESSION ID:(1361.35607) 2012-04-06 04:30:04.648
******** ERROR: UGA memory leak detected 1067976 ********
******************************************************
HEAP DUMP heap name="session heap"  desc=1103a05f0

a. the memory was leaked from the UGA area
b. the amount leaked is reported again in the text (1067976 bytes).
c. the above few lines describe this dump as SESSION HEAP with the descriptor 0x1103a05f0.

Calculating the size of the leaked memory

******************************************************
HEAP DUMP heap name="session heap"  desc=1103a05f0
 extent sz=0xff80 alt=32767 het=32767 rec=0 flg=3 opc=3
 parent=110009628 owner=700000c3b6f5620 nex=0 xsz=0xff80
EXTENT 0 addr=1107dbf50
  Chunk        1107dbf60 sz=    65392    free      "               "
EXTENT 1 addr=1107cbf50
  Chunk        1107cbf60 sz=    65392    free      "               "
EXTENT 2 addr=110541da0
  Chunk        110541db0 sz=    61312    free      "               "
EXTENT 3 addr=11062ae88
  Chunk        11062ae98 sz=   266264    freeable  "kllcqgf:kllsltb"
EXTENT 4 addr=1105dae88
  Chunk        1105dae98 sz=   266264    freeable  "kllcqgf:kllsltb"
EXTENT 5 addr=110550d48
  Chunk        110550d58 sz=   266264    freeable  "kllcqgf:kllsltb"
EXTENT 6 addr=110500d48
  Chunk        110500d58 sz=   266264    freeable  "kllcqgf:kllsltb"
EXTENT 7 addr=1104e1df0
  Chunk        1104e1e00 sz=      200    perm      "perm           "  alo=200
  Chunk        1104e1ec8 sz=    65192    free      "               "
EXTENT 8 addr=1104c1df0
  Chunk        1104c1e00 sz=    40720    perm      "perm           "  alo=40720
  Chunk        1104cbd10 sz=       56    free      "               "
  Chunk        1104cbd48 sz=      408    freeable  "kcbl_structure_"
  Chunk        1104cbee0 sz=     6952    free      "               "
  Chunk        1104cda08 sz=     2424    freeable  "kllcqc:kllcqslt"
  Chunk        1104ce380 sz=    14832    free      "               "
EXTENT 9 addr=1104d1df0
  Chunk        1104d1e00 sz=    65392    free      "               "
EXTENT 10 addr=1104b1df0
  Chunk        1104b1e00 sz=      544    free      "               "
  Chunk        1104b2020 sz=       88    freeable  "kllcqc:kllcq   "
  Chunk        1104b2078 sz=    64760    free      "               "
EXTENT 11 addr=110427390
  Chunk        1104273a0 sz=    65392    free      "               "
EXTENT 12 addr=110417390
  Chunk        1104173a0 sz=    65392    free      "               "
EXTENT 13 addr=110407390
  Chunk        1104073a0 sz=    65392    free      "               "
EXTENT 14 addr=1103f7390
  Chunk        1103f73a0 sz=    65392    free      "               "
EXTENT 15 addr=1103e7390
  Chunk        1103e73a0 sz=    65392    free      "               "
EXTENT 16 addr=1103d7390
  Chunk        1103d73a0 sz=    65392    free      "               "
EXTENT 17 addr=1103c7390
  Chunk        1103c73a0 sz=      408    free      "               "
  Chunk        1103c7538 sz=     2232    perm      "perm           "  alo=2232
  Chunk        1103c7df0 sz=    62752    free      "               "
EXTENT 18 addr=1103b7390
  Chunk        1103b73a0 sz=    65392    free      "               "
EXTENT 19 addr=110370080
  Chunk        110370090 sz=     2008    perm      "perm           "  alo=2008
  Chunk        110370868 sz=    63384    free      "               "
EXTENT 20 addr=110360098
  Chunk        1103600a8 sz=    20424    perm      "perm           "  alo=20424
  Chunk        110365070 sz=    44944    free      "               "
Total heap size    =  2172616
FREE LISTS:
 Bucket 0 size=56
  Chunk        1104cbd10 sz=       56    free      "               "
 Bucket 1 size=88
 Bucket 2 size=152
 Bucket 3 size=168
 Bucket 4 size=280
  Chunk        1103c73a0 sz=      408    free      "               "
 Bucket 5 size=432
 Bucket 6 size=536
  Chunk        1104b1e00 sz=      544    free      "               "
 Bucket 7 size=1048
 Bucket 8 size=2072
 Bucket 9 size=4120
  Chunk        1104cbee0 sz=     6952    free      "               "
 Bucket 10 size=8216
  Chunk        1104ce380 sz=    14832    free      "               "
 Bucket 11 size=16408
 Bucket 12 size=32792
  Chunk        110365070 sz=    44944    free      "               "
  Chunk        110370868 sz=    63384    free      "               "
  Chunk        1104d1e00 sz=    65392    free      "               "
  Chunk        1103b73a0 sz=    65392    free      "               "
  Chunk        1103c7df0 sz=    62752    free      "               "
  Chunk        1103d73a0 sz=    65392    free      "               "
  Chunk        1103f73a0 sz=    65392    free      "               "
  Chunk        1104073a0 sz=    65392    free      "               "
  Chunk        1104b2078 sz=    64760    free      "               "
  Chunk        1103e73a0 sz=    65392    free      "               "
  Chunk        1104e1ec8 sz=    65192    free      "               "
  Chunk        1104273a0 sz=    65392    free      "               "
  Chunk        1104173a0 sz=    65392    free      "               "
  Chunk        1107cbf60 sz=    65392    free      "               "
  Chunk        110541db0 sz=    61312    free      "               "
  Chunk        1107dbf60 sz=    65392    free      "               "
 Bucket 13 size=65560
 Bucket 14 size=131096
 Bucket 15 size=262168
 Bucket 16 size=524312
 Bucket 17 size=2097176
Total free space   =  1039056
UNPINNED RECREATABLE CHUNKS (lru first):
PERMANENT CHUNKS:
  Chunk        1104e1e00 sz=      200    perm      "perm           "  alo=200
  Chunk        1104c1e00 sz=    40720    perm      "perm           "  alo=40720
  Chunk        1103c7538 sz=     2232    perm      "perm           "  alo=2232
  Chunk        110370090 sz=     2008    perm      "perm           "  alo=2008
  Chunk        1103600a8 sz=    20424    perm      "perm           "  alo=20424
Permanent space    =    65584
******************************************************

The FREEABLE and RECREATABLE chunks add up to 1067976 bytes, which is the leaked memory.
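The sum can be reproduced from the heap dump itself. The following Python sketch copies the chunks that are neither free nor perm, totals them by their comment tag, and cross-checks the result against the totals printed in the dump. The function name and the grouping by tag are illustrative choices, not part of the trace.

```python
import re
from collections import defaultdict

# Chunks from the session heap dump that are neither "free" nor "perm".
LEAKED_CHUNKS = """\
  Chunk        11062ae98 sz=   266264    freeable  "kllcqgf:kllsltb"
  Chunk        1105dae98 sz=   266264    freeable  "kllcqgf:kllsltb"
  Chunk        110550d58 sz=   266264    freeable  "kllcqgf:kllsltb"
  Chunk        110500d58 sz=   266264    freeable  "kllcqgf:kllsltb"
  Chunk        1104cbd48 sz=      408    freeable  "kcbl_structure_"
  Chunk        1104cda08 sz=     2424    freeable  "kllcqc:kllcqslt"
  Chunk        1104b2020 sz=       88    freeable  "kllcqc:kllcq   "
"""

# Totals printed by the dump itself.
TOTAL_HEAP_SIZE = 2172616
TOTAL_FREE_SPACE = 1039056
PERMANENT_SPACE = 65584

CHUNK_RE = re.compile(r'Chunk\s+[0-9a-f]+\s+sz=\s*(\d+)\s+(\w+)\s+"([^"]*)"')


def leaked_by_comment(dump_text):
    """Sum chunk sizes per comment tag, skipping free and perm chunks."""
    totals = defaultdict(int)
    for size, kind, comment in CHUNK_RE.findall(dump_text):
        if kind not in ("free", "perm"):
            totals[comment.strip()] += int(size)
    return dict(totals)


by_comment = leaked_by_comment(LEAKED_CHUNKS)
leaked = sum(by_comment.values())

print(leaked)  # compare with the second ORA-00600 argument
print(leaked + TOTAL_FREE_SPACE + PERMANENT_SPACE == TOTAL_HEAP_SIZE)
for comment, size in sorted(by_comment.items(), key=lambda item: -item[1]):
    print(comment, size)
```

Grouping by tag shows that the four "kllcqgf:kllsltb" chunks account for almost the entire leak, which is the detail worth quoting when searching for a matching known bug.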

Session state analysis

*** 2012-04-06 04:30:04.658
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [729], [1067976], [space leak], [], [], [], [], []
----- Call Stack Trace -----
calling              call     entry                argument values in hex
location             type     point                (? means dubious value)
-------------------- -------- -------------------- ----------------------------
ksedmp+0148          bl       ksedst               1029746FC ?
ksfdmp+0018          bl       01FD4014
kgeriv+0118          bl       _ptrgl
kgesiv+0080          bl       kgeriv               000000001 ? 000000002 ?
                                                   1100610D0 ? 000000000 ?
                                                   00000000A ?
ksesic2+005c         bl       kgesiv               FFFFFFFFFFF9320 ? 1101FAF78 ?
                                                   110006308 ? 1103A0818 ?
                                                   000000009 ?
ksmuhe+026c          bl       ksesic2              2D9000002D9 ? 000000000 ?
                                                   000104BC8 ? 000000001 ?
                                                   00000000A ? 103164968 ?
                                                   12E0BE826D694B2F ?
                                                   000000000 ?
ksmugf+0214          bl       ksmuhe               110002A20 ? 110061238 ?
                                                   000000009 ? 102975DE8 ?
ksuxds+170c          bl       ksmugf               000000000 ? 020000000 ?
                                                   1029754D0 ?
ksudel+006c          bl       ksuxds               700000C3B6F5620 ? 100000001 ?
opilof+03dc          bl       01FD427C             <-- indicates logoff
opiodr+08cc          bl       _ptrgl
ttcpip+0cc4          bl       _ptrgl
opitsk+0d60          bl       ttcpip               11000CF90 ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   000000000 ? 000000000 ?
opiino+0758          bl       opitsk               000000000 ? 000000000 ?
opiodr+08cc          bl       _ptrgl
opidrv+032c          bl       opiodr               3C00000018 ? 4101FAF78 ?
                                                   FFFFFFFFFFFF7B0 ? 0A000F350 ?
sou2o+0028           bl       opidrv               3C0C000000 ? 4A00E8B50 ?
                                                   FFFFFFFFFFFF7B0 ?
main+0138            bl       01FD3A28
__start+0098         bl       main                 000000000 ? 000000000 ?
--------------------- Binary Stack Dump ---------------------
………………
 ----------------------------------------
    SO: 700000c3b6f5620, type: 4, owner: 700000c3c987a18, flag: INIT/-/-/0x00
--flag: (41) USR/- BSY/-/-/DEL/-/- shows that the session has been deleted
    (session) trans: 0, creator: 700000c3c987a18, flag: (41) USR/- BSY/-/-/DEL/-/-
              DID: 0002-0927-01D67CAD, short-term DID: 0000-0000-00000000
              txn branch: 0
              oct: 0, prv: 0, sql: 700000caf2c0e30, psql: 700000caf2c0e30, user: 52/MONITOR
    O/S info: user: oracrm, term: , ospid: 1490968, machine: zwq_crm2
              program: exp@zwq_crm2 (TNS V1-V3)
    last wait for 'SQL*Net message from client' blocking sess=0x0 seq=59222 wait_time=1537
                driver id=54435000, #bytes=1, =0
    temporary object counter: 0
    ----------------------------------------

a. The memory leak occurred in the session heap of the UGA at logoff.
b. The process was an exp database export, and its session had already been released.

Cause of ORA-00600 [729]

Memory leak problems generally occur when Oracle is trying to free memory allocated to a process.
The memory leak dump is generally discovered during session logoff,
when Oracle frees the heaps that are allocated for the user process.
When a user connects to Oracle, a user process is created and at that time the heap is allocated.
Every process will have its own memory heap.
The memory is organized into heaps and every heap consists of one or more extents.
Each extent contains a series of contiguous memory chunks, and these chunks can be
either FREE or ALLOCATED. The Generic Heap Manager takes care of allocating and deallocating
 the memory chunks, with the help of FREE LISTS and LRU LISTS.
Chunk types are as follows:
1. FREE
2. FREEABLE
3. RECREATABLE
4. PERMANENT
5. FREEABLE WITH MARK
It is not mandatory that each extent contain only one type of chunk.
Extents can contain various types of chunks. When processes require memory chunks,
they are allocated as needed. Oracle keeps track of the amount of memory allocated for the process internally.
When the process terminates, all of the memory that has been allocated for the process is automatically released.
When the memory is released the allocated heaps are freed. Generally,
when the heap is freed the only chunks that the process should identify
as allocated are the PERMANENT chunks and FREE chunks on the freelist.
If the process finds there are still FREEABLE or RECREATABLE chunks remaining,
then the process has not properly deallocated the memory.
This situation is considered a space leak.

Handling ORA-00600 [729]

1. If there are no other errors reported at the same time,
this may be a case where the error was a rare occurrence and can be safely ignored.
As a rule of thumb, leaks less than 90,000 bytes in size are considered to be of low significance.
The solution in this case is to set event 10262 (see below).
a. Set the following event in init.ora parameter file.
   This example disables reporting for space leaks less than 90000 bytes:
event = "10262 trace name context forever, level 90000"
b. Stop and restart the database.
If the level is set to 1, space leak checking is disabled.
This is not advised because large memory leaks will be missed.
If the event is set to a value greater than 1,
any space leak up to the number specified in the event is ignored.
2. Is the leak in the SGA? The alert.log should be reviewed for additional
errors such as ORA-4030 and ORA-4031 to ensure there are no additional
problems with the shared pool or operating system memory.
3. Does the error reproduce with a given task? If so, this is
a case that should be investigated further because the leak could be a known bug.
See Note 31056.1 ORA-600 [729] UGA Space Leak for a list of known bugs and fixes.
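The threshold rules for event 10262 in item 1 can be written down as a small decision function. This Python sketch only models the behaviour as the note describes it and is not Oracle code. Two points are assumptions: an unset event reports every leak, and a leak exactly equal to the level counts as ignored (the note says both "less than" and "up to").

```python
def leak_is_reported(leak_bytes, event_level=None):
    """Model of event 10262 as described in the note quoted above.

    event_level None: event not set, every leak is reported (assumption).
    event_level 1:    space leak checking is disabled.
    event_level > 1:  leaks up to event_level bytes are ignored
                      (treating the boundary as inclusive is an assumption).
    """
    if event_level is None:
        return leak_bytes > 0
    if event_level == 1:
        return False
    return leak_bytes > event_level


# The leak from this incident, checked against the example level 90000.
print(leak_is_reported(1067976, event_level=90000))
# A small leak under the same setting.
print(leak_is_reported(50000, event_level=90000))
```

Under this model the 1067976-byte leak analysed here would still be reported with the example level of 90000, so setting the event alone would not silence this particular error.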

Reference: Understanding and Diagnosing ORA-600 [729] Space Leak Errors [ID 403584.1]