This is a database failure from a few months ago that I re-analyzed today on Windows. The database fails on startup with ORA-600 [6711].
C:\Users\XFF>SQLPLUS / AS SYSDBA
SQL*Plus: Release 12.1.0.2.0 Production on Sun Jul 14 16:17:32 2024
Copyright (c) 1982, 2014, Oracle. All rights reserved.
Connected to an idle instance.
SQL> startup mount pfile='d:/pfile.txt'
ORACLE instance started.
Total System Global Area 6442450944 bytes
Fixed Size 6205768 bytes
Variable Size 1493175992 bytes
Database Buffers 4932501504 bytes
Redo Buffers 10567680 bytes
Database mounted.
SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [6711], [4436379], [1], [4436389],
[0], [], [], [], [], [], [], []
Process ID: 44144
Session ID: 67 Serial number: 39084
From experience, this error is ORA-600 [6711] "Cluster Key Chain corruption", which means the problem is most likely caused by a damaged cluster-related object.
I traced the startup process:
PARSING IN CURSOR #17695456 len=189 dep=4 tim=233428646426 hv=186852205 ad='7ffda1eea168' sqlid='2tkw12w5k68vd'
select user#,password,datats#,tempts#,type#,defrole,resource$, ptime,
decode(defschclass,NULL,'DEFAULT_CONSUMER_GROUP',defschclass),
spare1,spare4,ext_username,spare2 from user$ where name=:1
END OF STMT
PARSE #17695456:c=0,e=168,p=0,cr=0,cu=0,mis=1,r=0,dep=4,og=4,plh=0,tim=233428646426
BINDS #17695456:
Bind#0
oacdty=01 mxl=32(03) mxlc=00 mal=00 scl=00 pre=00
oacflg=18 fl2=0001 frm=01 csi=871 siz=32 off=0
kxsbbbfp=010b2df0 bln=32 avl=03 flg=05
value="SYS"
EXEC #17695456:c=0,e=418,p=0,cr=0,cu=0,mis=1,r=0,dep=4,og=4,plh=1457651150,tim=233428646901
WAIT #17695456: nam='db file sequential read' ela= 126 file#=1 block#=417 blocks=1 obj#=46 tim=233428647046
FETCH #17695456:c=0,e=153,p=1,cr=2,cu=0,mis=0,r=1,dep=4,og=4,plh=1457651150,tim=233428647069
STAT #17695456 id=1 cnt=1 pid=0 pos=1 obj=22 op='TABLE ACCESS BY INDEX ROWID USER$
(cr=2 pr=1 pw=0 time=151 us cost=1 size=139 card=1)'
STAT #17695456 id=2 cnt=1 pid=1 pos=1 obj=46 op='INDEX UNIQUE SCAN I_USER1 (cr=1 pr=1 pw=0 time=149 us)'
CLOSE #17695456:c=0,e=2,dep=4,type=0,tim=233428647111
Incident 2601 created, dump file: C:\APP\XFF\diag\rdbms\ecp\ecp\incident\incdir_2601\ecp_ora_40516_i2601.trc
ORA-00600: internal error code, arguments: [6711], [4436379], [1], [4436389], [0], [], [], [], [], [], [], []
FETCH #15289752:c=2062500,e=2544215,p=13,cr=65626,cu=28,mis=0,r=0,dep=3,og=3,plh=3312420081,tim=233431176536
=====================
PARSE ERROR #387363008:len=50 dep=1 uid=0 oct=3 lid=0 tim=233431176680 err=600
select cost from resource_cost$ where resource#=:1
ORA-00600: internal error code, arguments: [6711], [4436379], [1], [4436389], [0], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [6711], [4436379], [1], [4436389], [0], [], [], [], [], [], [], []
This operation triggered a recursive query:
PARSING IN CURSOR #387319440 len=151 dep=5 lid=0 tim=233428641503 hv=2507062328 ad='7ffd9ffa23a8' sqlid='7u49y06aqxg1s'
select /*+ rule */ bucket, endpoint, col#, epvalue, epvalue_raw, ep_repeat_count from histgrm$
where obj#=:1 and intcol#=:2 and row#=:3 order by bucket
END OF STMT
PARSE #387319440:c=0,e=11,p=0,cr=0,cu=0,mis=0,r=0,dep=5,og=3,plh=3312420081,tim=233428641503
BINDS #387319440:
Bind#0
oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
oacflg=00 fl2=1000001 frm=00 csi=00 siz=72 off=0
kxsbbbfp=00eb2be0 bln=22 avl=02 flg=05
value=22
Bind#1
oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
oacflg=00 fl2=1000001 frm=00 csi=00 siz=0 off=24
kxsbbbfp=00eb2bf8 bln=22 avl=02 flg=01
value=2
Bind#2
oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
oacflg=00 fl2=1000001 frm=00 csi=00 siz=0 off=48
kxsbbbfp=00eb2c10 bln=22 avl=01 flg=01
value=0
EXEC #387319440:c=0,e=105,p=0,cr=0,cu=0,mis=0,r=0,dep=5,og=3,plh=3312420081,tim=233428641652
WAIT #387319440: nam='db file sequential read' ela= 124 file#=1 block#=45660 blocks=1 obj#=66 tim=233428641792
FETCH #387319440:c=0,e=173,p=1,cr=3,cu=0,mis=0,r=20,dep=5,og=3,plh=3312420081,tim=233428641834
STAT #387319440 id=1 cnt=20 pid=0 pos=1 obj=0 op='SORT ORDER BY (cr=3 pr=1 pw=0 time=169 us cost=0 size=0 card=0)'
STAT #387319440 id=2 cnt=20 pid=1 pos=1 obj=66 op='TABLE ACCESS CLUSTER HISTGRM$ (cr=3 pr=1 pw=0 time=148 us)'
STAT #387319440 id=3 cnt=1 pid=2 pos=1 obj=65 op='INDEX UNIQUE SCAN I_OBJ#_INTCOL# (cr=2 pr=0 pw=0 time=2 us)'
CLOSE #387319440:c=0,e=36,dep=5,type=3,tim=233428641886
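When a trace like the one above runs to thousands of lines, it can help to pull the file#, block# and obj# out of the single-block read waits mechanically. The following is only a minimal sketch: the helper name `parse_wait` is made up for illustration, and the regular expression assumes the WAIT line layout exactly as it appears in this trace.

```python
import re

# Matches 'db file sequential read' WAIT lines in the layout shown in the trace above.
WAIT_RE = re.compile(
    r"WAIT #(?P<cursor>\d+): nam='db file sequential read' ela= *(?P<ela>\d+) "
    r"file#=(?P<file>\d+) block#=(?P<block>\d+) blocks=(?P<blocks>\d+) obj#=(?P<obj>-?\d+)"
)

def parse_wait(line):
    """Return the numeric fields of a single-block read WAIT line, or None if it does not match."""
    m = WAIT_RE.search(line)
    if m is None:
        return None
    return {name: int(value) for name, value in m.groupdict().items()}

# Sample line copied from the trace above.
sample = ("WAIT #387319440: nam='db file sequential read' ela= 124 "
          "file#=1 block#=45660 blocks=1 obj#=66 tim=233428641792")
info = parse_wait(sample)
print(info["file"], info["block"], info["obj"])
```

For the sample line this yields file 1, block 45660 and obj# 66, which agrees with the `obj=66` HISTGRM$ row source in the STAT lines of the same cursor.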
Next, I looked at the corresponding trace file:
[TOC00000]
Jump to table of contents
Dump continued from file: C:\APP\XFF\diag\rdbms\ecp\ecp\trace\ecp_ora_40516.trc
[TOC00001]
ORA-00600: internal error code, arguments: [6711], [4436379], [1], [4436389], [0], [], [], [], [], [], [], []
[TOC00001-END]
[TOC00002]
========= Dump for incident 2601 (ORA 600 [6711]) ========
[TOC00003]
----- Beginning of Customized Incident Dump(s) -----
kdsDumpState: cdb: 0 dspdb: 0 type: 3
*** ENTER: kds state dump ***
row 0x0043b1a5.28 continuation at: 0x0043b1a5.0 file# 1 block# 242085 slot 0 (dscnt: 0)
KDSTABN_GET: 1 ..... ntab: 2
curSlot: 0 ..... nrows: 40
Dumping kcb descriptor:
kcbds 0x0000000017100DF0 : tsn 0, rdba 0x0043b1a5, afn 1, objd 64, cls 1, tidflg 0x0 0x0 0x0
dsflg 0x00100000, dsflg2 0x00004000, lobid 00000000:00000000, cnt 0, addr 0x00007FFD55D1C014 dx 0x0000000000000000
env [0x0000000017178C7C]: (scn: 0x0000.54290647 xid: 0x0000.000.00000000 uba: 0x00000000.0000.00
statement num=0 parent xid: 0x0000.000.00000000 st-scn: 0x0000.00000000
hi-scn: 0x0000.00000000 ma-scn: 0x0000.00000000 flg: 0x00000660)
kcb_dw_scan_dumpctx: not in DW scan
kdsgrp1_dump database not fully open
*** EXIT: kds state dump ***
----- End of Customized Incident Dump(s) -----
[TOC00003-END]
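The kds state dump above reports rdba 0x0043b1a5 together with file# 1 block# 242085, and the third ORA-600 argument, 4436389, is the same address in decimal. The relationship can be sketched as below, assuming the usual smallfile layout in which the top 10 bits of the 32-bit address hold the relative file number and the low 22 bits hold the block number. Treating the first argument (4436379) as a block address as well is my assumption, not something the trace states.

```python
def decode_rdba(rdba):
    """Split a 32-bit relative data block address into (relative file#, block#).

    Assumes the smallfile layout: top 10 bits = file number, low 22 bits = block number.
    """
    return rdba >> 22, rdba & 0x3FFFFF

def encode_rdba(file_no, block_no):
    """Inverse of decode_rdba under the same layout assumption."""
    return (file_no << 22) | block_no

# First and third arguments of the ORA-600 [6711] error above.
for arg in (4436379, 4436389):
    file_no, block_no = decode_rdba(arg)
    print(f"{arg} = 0x{arg:08x} -> file# {file_no}, block# {block_no}")
```

Under that assumption the two arguments land in file 1 at blocks 242075 and 242085, so both addresses point into the same region of the SYSTEM datafile.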
Dumping the relevant rdba confirms that the block belongs to object id 64, which matches the objd reported in the trace:
DUL> rdba 0x0043b1a5
rdba : 0x0043b1a5=4436389
rfile# : 1
block# : 242085
DUL> dump datafile 1 block 242085 header
Block Header:
block type=0x06 (table/index/cluster segment data block)
block format=0xa2 (oracle 10)
block rdba=0x0043b1a5 (file#=1, block#=242085)
scn=0x0000.438d4a86, seq=1, tail=0x4a860601
block checksum value=0xd591=54673, flag=6
Data Block Header Dump:
Object id on Block? Y
seg/obj: 0x40=64 csc: 0x00.438d4a80 itc: 2 flg: - typ: 1 (data)
fsl: 0 fnx: 0x0 ver: 0x01
Itl Xid Uba Flag Lck Scn/Fsc
0x01 0x0002.01f.00014b92 0x00c01897.6e20.07 C--- 0 scn 0x0000.438c5fca
0x02 0x000a.01a.0011bb8e 0x00c0292c.0317.42 --U- 22 fsc 0x0000.438d4a86
Data Block Dump:
================
flag=0x0 --------
ntab=2
nrow=41
frre=23
fsbo=0x68
ffeo=0xb90
avsp=0x1ce1
tosp=0x1ce1
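As a side check on the header dump above, the tail value of a block is normally built from the low two bytes of the SCN base, the block type and the sequence number. The sketch below recomputes it from the dumped values. The function name `expected_tail` is hypothetical, and the check only compares header and tail; it does not validate the checksum or the row data.

```python
def expected_tail(scn_base, block_type, seq):
    """Build the 4-byte tail value: low 16 bits of SCN base, then block type, then sequence."""
    return ((scn_base & 0xFFFF) << 16) | ((block_type & 0xFF) << 8) | (seq & 0xFF)

# Values copied from the block header dump above.
scn_base = 0x438D4A86
block_type = 0x06
seq = 1
tail = 0x4A860601

computed = expected_tail(scn_base, block_type, seq)
print(hex(computed), computed == tail)
```

Header and tail agree here, which fits a block that was written out completely and whose problem lies in the cluster key chain rather than in a torn write. That reading is my own inference from the dump.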
To find out which object this id refers to, I used DUL to unload obj$.

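This confirms that the object is the cluster C_OBJ#_INTCOL#, whose table is HISTGRM$ (the table that stores histogram data for optimizer statistics). With that understood, the fix is fairly straightforward: bypass access to this object while opening the database, then deal with the table afterwards.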