帮朋友分析日志的过程中发现在两个节点上多次出现类此错误
节点1日志分析
Mon Mar 5 08:19:20 2012
Errors in file /u01/app/oracle/admin/hddms/udump/hddms1_ora_18556.trc:
ORA-07445: exception encountered: core dump [__milli_memcpy()+2448] [SIGSEGV] [Address not mapped to object] [0x9FFFFFFFBF580000] [] []
Mon Mar 5 08:19:21 2012
Errors in file /u01/app/oracle/admin/hddms/udump/hddms1_ora_18556.trc:
ORA-07445: exception encountered: core dump [kghalf()+961] [SIGSEGV] [Invalid permissions for mapped object] [0x000000000] [] []
ORA-07445: exception encountered: core dump [__milli_memcpy()+2448] [SIGSEGV] [Address not mapped to object] [0x9FFFFFFFBF580000] [] []
Mon Mar 5 08:19:22 2012
Errors in file /u01/app/oracle/admin/hddms/udump/hddms1_ora_18556.trc:
ORA-07445: exception encountered: core dump [kghalf()+464] [SIGSEGV] [Address not mapped to object] [0xFFFFFFFFFFFFFFF0] [] []
ORA-07445: exception encountered: core dump [kghalf()+961] [SIGSEGV] [Invalid permissions for mapped object] [0x000000000] [] []
ORA-07445: exception encountered: core dump [__milli_memcpy()+2448] [SIGSEGV] [Address not mapped to object] [0x9FFFFFFFBF580000] [] []
对应trace文件
/u01/app/oracle/admin/hddms/udump/hddms1_ora_18556.trc
Oracle9i Enterprise Edition Release 9.2.0.8.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP and Oracle Data Mining options
JServer Release 9.2.0.8.0 - Production
ORACLE_HOME = /u01/app/oracle/product/9.2.0
System name: HP-UX
Node name: HDDB1
Release: B.11.23
Version: U
Machine: ia64
Instance name: hddms1
Redo thread mounted by this instance: 1
Oracle process number: 42
Unix process pid: 18556, image: oracle@HDDB1 (TNS V1-V3)
*** 2012-03-05 08:19:20.675
*** SESSION ID:(100.40701) 2012-03-05 08:19:20.602
Exception signal: 11 (SIGSEGV), code: 1 (Address not mapped to object), addr: 0x9fffffffbf580000, PC: [0x400000000182a9f0, __milli_memcpy()+2448]
r1: 6000000000524900 r20: 8 br5: 0
r2: 0 r21: 9fffffffbf4c21ec br6: 40000000018ad000
r3: 37 r22: 8 br7: e000000120001400
r4: 0 r23: c00000005e3bac98 ip: 400000000182a9f0
r5: 60000000005311d8 r24: 9fffffffbf4c2268 iipa: 0
r6: 40000004 r25: 9fffffffbf4c2268 cfm: 2a54295428
r7: 9fffffffffff1ba8 r26: 9fffffffbf580000 um: 1a
r8: 9fffffffbf4c2268 r27: 9fffffffbf57ff00 rsc: 1f
r9: 8000 r28: 9fffffffbf580008 bsp: 9fffffffbf802150
r10: 9fffffffbf4ba260 r29: 9fffffffbf57ff08 bspstore: 9fffffffbf802150
r11: 0 r30: 0 rnat: 0
r12: 9ffffffffffe5240 r31: 0 ccv: 20000000
r13: 9fffffffbf5a1420 NaTs: 0 unat: 0
r14: 9fffffffbf4c23e8 PRs: ffffffffffff4837 fpsr: 9804c8274433f
r15: 9fffffffbf580028 br0: 400000000202a7e0 pfs: c000000000001b3e
r16: 3fffffffffffdff br1: 0 lc: 3ffffffffffce89
r17: 9fffffffbf4c2258 br2: 0 ec: 5
r18: 0 br3: 0 isr: 9fffffffbf802150
r19: 80 br4: 0 ifa: 0
Reason code: 0008
*** 2012-03-05 08:19:20.735
ksedmp: internal or fatal error
ORA-07445: exception encountered: core dump [__milli_memcpy()+2448] [SIGSEGV] [Address not mapped to object] [0x9FFFFFFFBF580000] [] []
Current SQL statement for this session:
update pmis.PM_BZ_DC_DWRFHSJ set ZGFDFH=:v_result where PARINST_ID=:v_pid and T_ID>:v_mdtbegin and T_ID<=:v_mdtend
----- PL/SQL Call Stack -----
object line object
handle number name
c00000008975ebc0 49 procedure PMIS.PM_DC_SJJGGC_EXECINF
c0000000971ef530 2 anonymous block
----- Call Stack Trace -----
calling call entry argument values in hex
location type point (? means dubious value)
-------------------- -------- -------------------- ----------------------------
ksedmp()+528 call ksedst() 000000001 ?
C050000000000996 ?
4000000002A53E60 ?
ssexhd()+1184 call ksedmp() 000000003 ?
C0500000000010A9 ?
40000000025E1420 ?
FFFFFFFFFFFF9075 ?
9FFFFFFFFFFDF2E0 ?
6000000000514428 ?
6000000000514420 ?
60000000005178E0 ?
<kernel> call ssexhd() 400000000086EFB0 ?
000000015 ? 000000007 ?
C050000000000085 ?
__milli_memcpy()+24 call <kernel>
48
$cold_updgrh()+1088 call __milli_memcpy() 000000000 ? 000000000 ?
000000000 ? 000000000 ?
9FFFFFFFBF564348 ?
000000000 ? 000000000 ?
000000000 ?
upduaw()+224 call $cold_updgrh() C0000000A2665550 ?
9FFFFFFFBF558FA0 ?
C000000000000C1E ?
4000000001916600 ?
000000000 ? 00000E137 ?
kdusru()+720 call upduaw() C0000000A2665550 ?
C000000000005BBF ?
4000000001912AC0 ?
00000E135 ?
9FFFFFFFFFFE52B0 ?
C0000000A2665660 ?
9FFFFFFFFFFE63F0 ?
9FFFFFFFBF5590D8 ?
kauupd()+544 call kdusru() 9FFFFFFFBF4C5D24 ?
000000000 ?
9FFFFFFFBF558FA0 ?
000000000 ?
C00000008B5D1108 ?
C00000008B5D1138 ?
000000018 ?
9FFFFFFFBF5591E0 ?
updrow()+3024 call kauupd() 9FFFFFFFFFFE65D0 ?
000000000 ?
6000000000530010 ?
000000000 ?
C00000009202CF38 ?
00000010B ? 00001AA7A ?
C000000086F2B778 ?
qerupFetch()+608 call updrow() 9FFFFFFFBF4C89F0 ?
C0000000A2665550 ?
C000000000000C1D ?
40000000018F5BF0 ?
000000000 ?
updaul()+1008 call qerupFetch() C0000000A0DB24D0 ?
000000000 ?
C0000000A2665550 ?
000007FFF ?
updThreePhaseExe()+ call updaul() C000000091A57FA0 ?
432 000000006 ? 000000000 ?
C0000000000019BB ?
40000000019DF9B0 ?
000008E23 ?
60000000001D30F0 ?
0000000BD ?
updexe()+624 call updThreePhaseExe() C000000091A57FA0 ?
000000000 ?
9FFFFFFFBF558FA0 ?
9FFFFFFFFFFF1AF0 ?
C000000000000FA6 ?
40000000019DFCA0 ?
000000000 ?
opiexe()+6944 call updexe() C000000091A57FA0 ?
9FFFFFFFBF4C8A00 ?
C000000000002858 ?
4000000001857620 ?
000008E25 ?
9FFFFFFFFFFF1AD0 ?
9FFFFFFFBF558FA0 ?
9FFFFFFFBF4C88D4 ?
opipls()+3488 call opiexe() 000000004 ?
9FFFFFFFFFFF2F50 ?
9FFFFFFFFFFF1D30 ?
C000000000002C60 ?
40000000019852D0 ?
000000000 ? 000007123 ?
9FFFFFFFFFFF1B40 ?
opiodr()+3088 call opipls() 9FFFFFFFFFFF2F50 ?
9FFFFFFFFFFF2F58 ?
000000004 ?
C00000000000214A ?
40000000018747B0 ?
00000E123 ?
9FFFFFFFFFFF2E70 ?
9FFFFFFFFFFF2DE0 ?
rpidrus()+304 call opiodr() 9FFFFFFFFFFF3810 ?
9FFFFFFFFFFF37E0 ?
9FFFFFFFFFFF36F8 ?
9FFFFFFFFFFF36F4 ?
skgmstack()+288 call rpidrus() 9FFFFFFFFFFF4DA0 ?
C000000000000716 ?
4000000001862FE0 ?
9FFFFFFFFFFF47F0 ?
600000000004F3C0 ?
9FFFFFFFFFFF4DB8 ?
6000000000524900 ?
9FFFFFFFFFFF4DA0 ?
rpidru()+256 call skgmstack() 9FFFFFFFFFFF4DC0 ?
600000000004EDF0 ?
00000F618 ?
400000000129E400 ?
9FFFFFFFFFFF4DA0 ?
rpiswu2()+784 call rpidru() 00000057B ?
C000000000001329 ?
4000000001870CE0 ?
rpidrv()+2096 call rpiswu2() C00000007D7B3708 ?
9FFFFFFFFFFF5510 ?
6000000000515CE8 ?
60000000000502E8 ?
9FFFFFFFFFFF5520 ?
000000205 ?
6000000000530010 ?
6000000000531090 ?
psddr0()+256 call rpidrv() 000000000 ? 000000066 ?
9FFFFFFFFFFF6630 ?
00000003A ?
psdnal()+672 call psddr0() 000000000 ? 000000066 ?
9FFFFFFFFFFF6630 ?
000000030 ?
pevm_EXIM()+400 call psdnal() 9FFFFFFFFFFF84A0 ?
9FFFFFFFFFFF8440 ?
6000000000531A04 ?
600000000064BAA0 ?
C000000090880CC8 ?
600000000063EA80 ?
pfrrun()+37120 call pevm_EXIM() 9FFFFFFFBF567AD8 ?
C000000090880CC8 ?
C000000000001532 ?
400000000173DAD0 ?
000006269 ?
4000000002190B40 ?
9FFFFFFFBF4BA068 ?
9FFFFFFFBF5C08A0 ?
peicnt()+544 call pfrrun() 9FFFFFFFBF567CDC ?
C000000000000D1D ?
4000000001676FE0 ?
kkxexe()+832 call peicnt() 9FFFFFFFFFFF84A0 ?
9FFFFFFFBF567AD8 ?
C000000000000818 ?
40000000021A3A80 ?
00000FEAB ?
9FFFFFFFFFFF7EB0 ?
9FFFFFFFBF567CDC ?
000000000 ?
opiexe()+12592 call kkxexe() 9FFFFFFFFFFF8430 ?
C000000000002858 ?
4000000001858C30 ?
0000080AB ?
9FFFFFFFFFFF8420 ?
9FFFFFFFBF4F3A6C ?
9FFFFFFFBF4F3A68 ?
9FFFFFFFFFFF84A0 ?
opiall0()+3456 call opiexe() 000000004 ?
9FFFFFFFFFFF9A80 ?
9FFFFFFFFFFF8830 ?
C0000000000024D1 ?
40000000017A3350 ?
000000000 ? 00000C82B ?
9FFFFFFFFFFF8640 ?
kpoal8()+2272 call opiall0() 000000002 ?
9FFFFFFFFFFF9A10 ?
9FFFFFFFFFFFA2B4 ?
000000000 ?
9FFFFFFFFFFF9954 ?
9FFFFFFFFFFFA2BC ?
000000000 ?
FFFFFFFFFFFFDFFF ?
opiodr()+3088 call kpoal8() 9FFFFFFFFFFFA0B0 ?
000000000 ? 000000000 ?
C00000000000214A ?
40000000018747B0 ?
00000F82F ?
9FFFFFFFFFFFA130 ?
9FFFFFFFFFFFA1D0 ?
ttcpip()+1888 call opiodr() 9FFFFFFFFFFFA460 ?
9FFFFFFFFFFFA430 ?
9FFFFFFFFFFFA348 ?
9FFFFFFFFFFFA344 ?
C0000000000018B9 ?
400000000172CB70 ?
000000000 ? 00000C0AB ?
opitsk()+1920 call ttcpip() 6000000000052C40 ?
000000002 ?
9FFFFFFFFFFFB950 ?
6000000000273AA0 ?
9FFFFFFFFFFFBAB0 ?
9FFFFFFFFFFFB8C4 ?
9FFFFFFFBF75A660 ?
4000000000D2C7AA ?
opiino()+2656 call opitsk() 000000000 ? 000000000 ?
C000000000000D1F ?
400000000236CE90 ?
000000000 ?
opiodr()+3088 call opiino() 60000000005DEC18 ?
6000000000531170 ?
9FFFFFFFFFFFF4C0 ?
C00000000000214A ?
40000000018747B0 ?
00000E825 ?
9FFFFFFFFFFFD7C0 ?
9FFFFFFFFFFFF4D0 ?
opidrv()+1088 call opiodr() 9FFFFFFFFFFFDF10 ?
9FFFFFFFFFFFDEE0 ?
9FFFFFFFFFFFDDF8 ?
9FFFFFFFFFFFDDF4 ?
sou2o()+48 call opidrv() 9FFFFFFFFFFFEF60 ?
000000004 ?
9FFFFFFFFFFFF4C0 ?
main()+352 call sou2o() 9FFFFFFFFFFFF4E0 ?
00000003C ? 000000004 ?
9FFFFFFFFFFFF4C0 ?
main_opd_entry()+80 call main() 000000000 ?
9FFFFFFFFFFFF9B0 ?
C000000000000004 ?
C00000000002FA60 ?
--------------------- Binary Stack Dump ---------------------
观察节点2,发现类此错误不同之处有
Wed Mar 7 08:19:11 2012
Errors in file /u01/app/oracle/admin/hddms/udump/hddms2_ora_16729.trc:
ORA-07445: exception encountered: core dump [__milli_memcpy()+2496] [SIGSEGV] [Address not mapped to object] [0x9FFFFFFFBF580000] [] []
Wed Mar 7 08:19:12 2012
Errors in file /u01/app/oracle/admin/hddms/udump/hddms2_ora_16729.trc:
ORA-00600: internal error code, arguments: [kghfrempty:ds], [0x9FFFFFFFBF55A550], [], [], [], [], [], []
ORA-07445: exception encountered: core dump [__milli_memcpy()+2496] [SIGSEGV] [Address not mapped to object] [0x9FFFFFFFBF580000] [] []
Wed Mar 7 08:19:13 2012
Errors in file /u01/app/oracle/admin/hddms/udump/hddms2_ora_16729.trc:
ORA-07445: exception encountered: core dump [kgidmp()+1568] [SIGSEGV] [Address not mapped to object] [0xB38F00000000D9] [] []
ORA-00600: internal error code, arguments: [kghfrempty:ds], [0x9FFFFFFFBF55A550], [], [], [], [], [], []
ORA-07445: exception encountered: core dump [__milli_memcpy()+2496] [SIGSEGV] [Address not mapped to object] [0x9FFFFFFFBF580000] [] []
Wed Mar 7 08:19:14 2012
Errors in file /u01/app/oracle/admin/hddms/udump/hddms2_ora_16729.trc:
ORA-07445: exception encountered: core dump [kghalf()+993] [SIGSEGV] [Address not mapped to object] [0xC0000000A8B01DA0] [] []
ORA-07445: exception encountered: core dump [kgidmp()+1568] [SIGSEGV] [Address not mapped to object] [0xB38F00000000D9] [] []
ORA-00600: internal error code, arguments: [kghfrempty:ds], [0x9FFFFFFFBF55A550], [], [], [], [], [], []
ORA-07445: exception encountered: core dump [__milli_memcpy()+2496] [SIGSEGV] [Address not mapped to object] [0x9FFFFFFFBF580000] [] []
对应trace文件不同之处
*** 2012-03-07 08:19:11.582
ksedmp: internal or fatal error
ORA-07445: exception encountered: core dump [__milli_memcpy()+2496] [SIGSEGV] [Address not mapped to object] [0x9FFFFFFFBF580000] [] []
No current SQL statement being executed.
----- PL/SQL Call Stack -----
object line object
handle number name
c000000090c4a818 1356 procedure PMIS.PM_DC_SJJGGC_QDX
c0000000a25512c0 43 procedure PMIS.PM_DC_SJJGGC_EXECINF
c00000009a8306e8 2 anonymous block
通过这两个节点的错误记录,都包含PMIS.PM_DC_SJJGGC_EXECINF和anonymous,所以初步怀疑是某个程序在这个时间定运行某个任务(包含PMIS.PM_DC_SJJGGC_EXECINF过程)导致。
继续观察trace日志发现两个节点都是类此情况相同的程序,相同的主机
SO: c00000007d7b3708, type: 4, owner: c00000007d6d0190, flag: INIT/-/-/0x00
(session) trans: c000000082974470, creator: c00000007d6d0190, flag: (100041) USR/- BSY/-/-/-/-/-
DID: 0001-002A-0000693D, short-term DID: 0000-0000-00000000
txn branch: 0000000000000000
oct: 6, prv: 0, sql: c0000000a229ab38, psql: c000000084e6b510, user: 517/PMIS
O/S info: user: Administrator, term: DMIS_SWITCH2, ospid: 5600:3020, machine: WORKGROUP\DMIS_SWITCH2
program: PMTDCtrl.exe
application name: PMTDCtrl.exe, hash value=0
last wait for 'enqueue' blocking sess=0x0 seq=32849 wait_time=127
name|mode=54540004, id1=1, id2=10
temporary object counter: 0
现在更加怀疑是PMTDCtrl.exe运行某过程导致数据库出现这些错误
错误总结
查看MOS发现该问题是一个bug导致(Bug 6166690:ORA-7445 [MILLI_MEMCPY] WHEN REF CURSOR AS OUT PARAMETER)
错误原因:It is caused by a Dynamic SQL used to obtain a REF CURSOR from a procedure (that opens the cursor).
处理建议:通过个bug,我们可以分析PMIS.PM_DC_SJJGGC_EXECINF过程,看看是否符合bug描述,如果符合建议改写过程或者打上补丁(Patch 2709343)
补丁:p2709343_92080_HPUX-IA64