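Contact: Mobile/WeChat (+86 17813235971) QQ (107644445)
Title: ORA-600 kcrfr_update_nab_2 failure recovery
Author: 惜分飞 © All rights reserved [Reproduction in any form without my consent is prohibited; I reserve the right to pursue legal liability.]
A controller dropped offline, after which the database failed at startup with ORA-600 kcrfr_update_nab_2 and could not be opened normally.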
Database version information
ORACLE V10.2.0.4.0 - 64bit Production vsnsta=0
vsnsql=14 vsnxtr=3
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Windows Server 2003 Version V5.2 Service Pack 2
CPU                 : 12 - type 8664, 2 Physical Cores
Process Affinity    : 0x0000000000000000
Memory (Avail/Total): Ph:22579M/32754M, Ph+PgF:24594M/33845M
ORA-600 kcrfr_update_nab_2 error in the alert log
Mon Oct 24 17:42:57 2016
Database mounted in Exclusive Mode
Completed: ALTER DATABASE MOUNT
Mon Oct 24 17:42:58 2016
ALTER DATABASE OPEN
Mon Oct 24 17:43:14 2016
Beginning crash recovery of 1 threads
 parallel recovery started with 11 processes
Mon Oct 24 17:43:14 2016
Started redo scan
Mon Oct 24 17:43:16 2016
Errors in file d:\oracle\product\10.2.0\admin\spcsjkdb\udump\spcsjkdb_ora_10108.trc:
ORA-00600: internal error code, arguments: [kcrfr_update_nab_2], [0x7FFC22A2150], [2], [], [], [], [], []
Mon Oct 24 17:43:18 2016
Aborting crash recovery due to error 600
Mon Oct 24 17:43:18 2016
Errors in file d:\oracle\product\10.2.0\admin\spcsjkdb\udump\spcsjkdb_ora_10108.trc:
ORA-00600: internal error code, arguments: [kcrfr_update_nab_2], [0x7FFC22A2150], [2], [], [], [], [], []
ORA-600 signalled during: ALTER DATABASE OPEN...
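When triaging an alert log like the one above, it can help to pull the ORA-00600 arguments out programmatically so they can be compared against the trace file and the support notes. The following is a minimal Python sketch, not part of the original recovery; the function name parse_ora600_args is hypothetical and the sample string is copied from the alert log above.

```python
import re

# Matches the argument list of an ORA-00600 line as it appears in the alert log.
ORA600_RE = re.compile(r"ORA-00600: internal error code, arguments: (.*)")

def parse_ora600_args(line):
    """Return the non-empty bracketed arguments of an ORA-00600 line, or None."""
    m = ORA600_RE.search(line)
    if m is None:
        return None
    # Each argument is wrapped in [...]; unused slots appear as [].
    return [a for a in re.findall(r"\[([^\]]*)\]", m.group(1)) if a]

# Sample line taken from the alert log in this post.
sample = ("ORA-00600: internal error code, arguments: [kcrfr_update_nab_2], "
          "[0x7FFC22A2150], [2], [], [], [], [], []")
args = parse_ora600_args(sample)
print(args)
```

The first element is the internal function tag (kcrfr_update_nab_2); the remaining elements are the error's own arguments.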
Trace file information
*** 2016-10-24 17:43:14.515
*** ACTION NAME:() 2016-10-24 17:43:14.515
*** MODULE NAME:(sqlplus.exe) 2016-10-24 17:43:14.515
*** SERVICE NAME:() 2016-10-24 17:43:14.515
*** SESSION ID:(356.3) 2016-10-24 17:43:14.515
Successfully allocated 11 recovery slaves
Using 101 overflow buffers per recovery slave
Thread 1 checkpoint: logseq 33251, block 2, scn 14624215134369
  cache-low rba: logseq 33251, block 2463324
    on-disk rba: logseq 33251, block 2803965, scn 14624216078841
  start recovery at logseq 33251, block 2463324, scn 0
*** 2016-10-24 17:43:16.406
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [kcrfr_update_nab_2], [0x7FFC22A2150], [2], [], [], [], [], []
Current SQL statement for this session:
ALTER DATABASE OPEN
----- Call Stack Trace -----
calling              call     entry                argument values in hex
location             type     point                (? means dubious value)
-------------------- -------- -------------------- ----------------------------
ksedmp+663           CALL???  ksedst+55            003C878B8 000000000 012B863E8 000000000
ksfdmp+19            CALL???  ksedmp+663           000000003 015572A70 007222698 003CACC80
kgerinv+158          CALL???  ksfdmp+19            015572430 000000000 0FFFFFFFF 000000000
kgeasnmierr+62       CALL???  kgerinv+158          000000000 000000000 000000000 004FD788F
kcrfr_update_nab+18  CALL???  kgeasnmierr+62       00BDA1170 000000000 000000000 6 000000002
kcrfr_read+1078      CALL???  kcrfr_update_nab+18  007222698 00001E650 015572430 6 0072229B8
kcrfrgv+8134         CALL???  kcrfr_read+1078      000000000 0051525D7 000000000 0051525D7
kcratr1+488          CALL???  kcrfrgv+8134         007222698 000000000 000000000 000000000
kcratr+412           CALL???  kcratr1+488          012B891C8 012B890A4 00727FFB8 00BEA7FF0
kctrec+1910          CALL???  kcratr+412           012B891C8 012B91E18 000000000 012B91E48
kcvcrv+3585          CALL???  kctrec+1910          012B92C58 000000000 00726DF00 00726BDB0
kcfopd+1007          CALL???  kcvcrv+3585          012B93350 000000000 000000000 000000000
adbdrv+55820         CALL???  kcfopd+1007          000000000 000000000 000000000 000000000
opiexe+13897         CALL???  adbdrv+55820         000000023 000000003 000000102 000000000
opiosq0+3558         CALL???  opiexe+13897         000000004 000000000 012B9B238 4155474E414C5F45
kpooprx+339          CALL???  opiosq0+3558         000000003 00000000E 012B9B3C8 0000000A4
kpoal8+894           CALL???  kpooprx+339          015587550 000000018 0041AE700 000000001
opiodr+1136          CALL???  kpoal8+894           00000005E 000000017 012B9E868 0072F5100
ttcpip+5146          CALL???  opiodr+1136          00000005E 000000017 012B9E868 2D8C00000000
opitsk+1818          CALL???  ttcpip+5146          015587550 000000000 000000000 000000000
opiino+1129          CALL???  opitsk+1818          00000001E 000000000 000000000 000000000
opiodr+1136          CALL???  opiino+1129          00000003C 000000004 012B9FB20 000000000
opidrv+815           CALL???  opiodr+1136          00000003C 000000004 012B9FB20 000000000
sou2o+52             CALL???  opidrv+815           00000003C 000000004 012B9FB20 7FF7FC48580
opimai_real+131      CALL???  sou2o+52             000000000 012B9FC40 7FFFFF7F258 077EF4D1C
opimai+96            CALL???  opimai_real+131      7FF7FC48580 7FFFFF7E000 0001F0003 000000000
OracleThreadStart+6  CALL???  opimai+96            012B9FEF0 01289FF3C 012B9FCC0 40 7FF7FC48580
0000000077D6B6DA     CALL???  OracleThreadStart+6  01289FF3C 000000000 000000000 40 012B9FFA8
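The recovery header in the trace shows where crash recovery starts (cache-low rba) and how far redo is recorded as written to disk (on-disk rba). As a rough illustration, the Python sketch below extracts both values and computes how many redo blocks lie between them. The helper name parse_rba is hypothetical, and the sample text is copied from the trace above.

```python
import re

def parse_rba(text, label):
    """Extract (logseq, block) from a line such as 'cache-low rba: logseq N, block M'."""
    m = re.search(re.escape(label) + r":\s*logseq (\d+), block (\d+)", text)
    if m is None:
        raise ValueError(f"{label!r} not found in trace text")
    return int(m.group(1)), int(m.group(2))

# Recovery header lines taken from the trace file in this post.
trace = """
Thread 1 checkpoint: logseq 33251, block 2, scn 14624215134369
  cache-low rba: logseq 33251, block 2463324
    on-disk rba: logseq 33251, block 2803965, scn 14624216078841
  start recovery at logseq 33251, block 2463324, scn 0
"""

low_seq, low_blk = parse_rba(trace, "cache-low rba")
disk_seq, disk_blk = parse_rba(trace, "on-disk rba")

# A simple block difference is only meaningful when both RBAs are in the same log.
same_log = low_seq == disk_seq
blocks_to_scan = disk_blk - low_blk if same_log else None
print(same_log, blocks_to_scan)
```

Here both RBAs fall in log sequence 33251, so the whole range to be scanned sits in a single online log, the same log in which the error was raised during the redo scan.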
Official description
The assert ORA-600: [kcrfr_update_nab_2] is a direct result of a lost write in the current on line log that we are attempting to resolve. So, this confirms the theory that this is a OS/hardware lost write issue not an internal oracle bug. In fact the assert ORA-600: [kcrfr_update_nab_2] is how we detect a lost log write.
Bug 5692594
Hdr: 5692594 10.2.0.1 RDBMS 10.2.0.1 RECOVERY PRODID-5 PORTID-226 ORA-600
Abstract: AFTER DATABASE CRASHED DOESN’T OPEN ORA-600 [KCRFR_UPDATE_NAB_2]
Status: 95,Closed, Vendor OS Problem
Bug 6655116
Hdr: 6655116 10.2.0.3 RDBMS 10.2.0.3 RECOVERY PRODID-5 PORTID-23
Abstract: INSTANCES CRASH WITH ORA-600 [KCRFR_UPDATE_NAB_2] AFTER DISK FAILURE
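Based on the official description and the circumstances of this failure, it is almost certain that a hardware fault caused Oracle to lose a redo write, which then triggered the related Oracle bug and prevented the database from starting normally.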
ORA-600 [kcrfr_update_nab_2] [a] [b]
VERSIONS: versions 10.2 to 11.1
DESCRIPTION: Failure of upgrade of recovery node (RN) enqueue to SSX mode
ARGUMENTS:
  Arg [a] State Object for redo nab enqueue for resilvering
  Arg [b] Redo nab enqueue mode
FUNCTIONALITY: Kernel Cache Redo File Read
IMPACT: INSTANCE FAILURE
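Resolution options
1. If a backup exists, restore it and perform an incomplete recovery that stops before the damaged redo at the end, then open the database with resetlogs.
2. If there is no backup, try an incomplete recovery using an older copy of the control file, or bypass the database consistency checks and force the database open.
3. Someone online reported resolving this by removing the second member of the redo log group, after which the database opened successfully (http://blog.itpub.net/16976507/viewspace-1266952/)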
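For options 1 and 2, the usual SQL*Plus sequence for a cancel-based incomplete recovery can be sketched as follows. This is an illustrative outline rather than the exact commands used in this case. The Python helper incomplete_recovery_steps is hypothetical and only assembles the statements in order. It assumes any required datafiles have already been restored, and the CANCEL response is normally typed interactively when SQL*Plus prompts for the log that cannot be applied.

```python
def incomplete_recovery_steps(using_backup_controlfile=False):
    """Return, in order, the SQL*Plus inputs for a cancel-based incomplete recovery.

    This only builds the list of statements; it does not connect to a database.
    """
    recover = "RECOVER DATABASE UNTIL CANCEL"
    if using_backup_controlfile:
        # Needed when recovering with an older (restored) control file.
        recover += " USING BACKUP CONTROLFILE"
    return [
        "STARTUP MOUNT",
        recover,
        "CANCEL",  # response given at the recovery prompt, before the damaged redo
        "ALTER DATABASE OPEN RESETLOGS;",
    ]

for step in incomplete_recovery_steps(using_backup_controlfile=True):
    print(step)
```

Opening with RESETLOGS discards whatever redo was not applied, so a fresh full backup is advisable immediately afterwards.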