ORA-01172 ORA-01151 故障恢复

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:ORA-01172 ORA-01151 故障恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

有客户存储异常断电,导致数据库启动报ORA-01172错,导致数据库无法open
数据库启动报ORA-01172错误

Wed Mar 23 14:16:23 2016
ALTER DATABASE OPEN
Wed Mar 23 14:16:24 2016
Beginning crash recovery of 1 threads
 parallel recovery started with 15 processes
Wed Mar 23 14:16:24 2016
Started redo scan
Wed Mar 23 14:16:25 2016
Completed redo scan
 62588 redo blocks read, 15 data blocks need recovery
Wed Mar 23 14:16:25 2016
Started redo application at
 Thread 1: logseq 15050, block 2, scn 2439828667
Wed Mar 23 14:16:25 2016
Recovery of Online Redo Log: Thread 1 Group 1 Seq 15050 Reading mem 0
  Mem# 0 errs 0: /oracle/oradata/orcl/redo01.log
Wed Mar 23 14:16:25 2016
Completed redo application
Wed Mar 23 14:16:25 2016
RECOVERY OF THREAD 1 STUCK AT BLOCK 26185 OF FILE 3
Wed Mar 23 14:16:25 2016
RECOVERY OF THREAD 1 STUCK AT BLOCK 69385 OF FILE 3
Wed Mar 23 14:16:25 2016
RECOVERY OF THREAD 1 STUCK AT BLOCK 566 OF FILE 2
Wed Mar 23 14:16:25 2016
RECOVERY OF THREAD 1 STUCK AT BLOCK 89 OF FILE 2
Wed Mar 23 14:16:25 2016
RECOVERY OF THREAD 1 STUCK AT BLOCK 53769 OF FILE 3
Wed Mar 23 14:16:26 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p012_6540.trc:
ORA-01172: recovery of thread 1 stuck at block 566 of file 2
ORA-01151: use media recovery to recover block, restore backup if needed
Wed Mar 23 14:16:26 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p008_6532.trc:
ORA-01172: recovery of thread 1 stuck at block 53769 of file 3
ORA-01151: use media recovery to recover block, restore backup if needed
Wed Mar 23 14:16:26 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p011_6538.trc:
ORA-01172: recovery of thread 1 stuck at block 69385 of file 3
ORA-01151: use media recovery to recover block, restore backup if needed
Wed Mar 23 14:16:26 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p005_6526.trc:
ORA-01172: recovery of thread 1 stuck at block 26185 of file 3
ORA-01151: use media recovery to recover block, restore backup if needed
Wed Mar 23 14:16:27 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p014_6544.trc:
ORA-01172: recovery of thread 1 stuck at block 89 of file 2
ORA-01151: use media recovery to recover block, restore backup if needed
Wed Mar 23 14:16:27 2016
Aborting crash recovery due to slave death, attempting serial crash recovery
Wed Mar 23 14:16:27 2016
Beginning crash recovery of 1 threads
Wed Mar 23 14:16:27 2016
Started redo scan
Wed Mar 23 14:16:27 2016
Completed redo scan
 62588 redo blocks read, 15 data blocks need recovery
Wed Mar 23 14:16:27 2016
Started redo application at
 Thread 1: logseq 15050, block 2, scn 2439828667
Wed Mar 23 14:16:27 2016
Recovery of Online Redo Log: Thread 1 Group 1 Seq 15050 Reading mem 0
  Mem# 0 errs 0: /oracle/oradata/orcl/redo01.log
RECOVERY OF THREAD 1 STUCK AT BLOCK 566 OF FILE 2
Wed Mar 23 14:16:27 2016
Aborting crash recovery due to error 1172
Wed Mar 23 14:16:27 2016
Errors in file /oracle/admin/orcl/udump/orcl_ora_6514.trc:
ORA-01172: recovery of thread 1 stuck at block 566 of file 2
ORA-01151: use media recovery to recover block, restore backup if needed
ORA-1172 signalled during: ALTER DATABASE OPEN...

ALTER DATABASE RECOVER datafile 1 报错
尝试recover datafile 1之后报ORA-600 kcbrapply_4,ORA-600 kcfrbd_3,ORA-600 kcbrapply_12等错误,从报错信息看,出现这些错误的原因,是由于断电导致坏块引起.

Thu Mar 24 21:50:18 2016
ALTER DATABASE RECOVER  datafile 1
Thu Mar 24 21:50:18 2016
Media Recovery Start
 parallel recovery started with 15 processes
Thu Mar 24 21:50:18 2016
Recovery of Online Redo Log: Thread 1 Group 1 Seq 15050 Reading mem 0
  Mem# 0 errs 0: /oracle/oradata/orcl/redo01.log
Thu Mar 24 21:50:19 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p004_13391.trc:
ORA-00600: internal error code, arguments: [kcbrapply_4], [2], [], [], [], [], [], []
Thu Mar 24 21:50:19 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p010_13403.trc:
ORA-00600: internal error code, arguments: [kcbrapply_4], [0], [], [], [], [], [], []
Thu Mar 24 21:50:19 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p000_13383.trc:
ORA-00600: internal error code, arguments: [kcbrapply_4], [0], [], [], [], [], [], []
Thu Mar 24 21:50:19 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p009_13401.trc:
ORA-00600: internal error code, arguments: [kcbrapply_4], [3], [], [], [], [], [], []
Thu Mar 24 21:50:19 2016
Hex dump of (file 1, block 61562) in trace file /oracle/admin/orcl/bdump/orcl_p001_13385.trc
Corrupt block relative dba: 0x0040f07a (file 1, block 61562)
Bad header found during media recovery
Data in bad block:
 type: 0 format: 0 rdba: 0xf07a0000
 last change scn: 0x916c.dc4b0040 seq: 0x0 flg: 0x00
 spare1: 0x6 spare2: 0xa2 spare3: 0xb088
 consistency value in tail: 0x06010fc1
 check value in block header: 0x601
 block checksum disabled
Thu Mar 24 21:50:19 2016
Hex dump of (file 1, block 55706) in trace file /oracle/admin/orcl/bdump/orcl_p014_13411.trc
Corrupt block relative dba: 0x0040d99a (file 1, block 55706)
Bad header found during media recovery
Data in bad block:
 type: 0 format: 0 rdba: 0xd99a0000
 last change scn: 0x916c.e1ad0040 seq: 0x0 flg: 0x00
 spare1: 0x6 spare2: 0xa2 spare3: 0xa520
 consistency value in tail: 0x06012222
 check value in block header: 0x601
 block checksum disabled
Thu Mar 24 21:50:19 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p006_13395.trc:
ORA-00600: internal error code, arguments: [kcfrbd_3], [1], [3342335], [1], [0], [64000], [], []
Thu Mar 24 21:50:19 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p003_13389.trc:
ORA-00600: internal error code, arguments: [kcfrbd_3], [1], [3932159], [1], [0], [64000], [], []
Thu Mar 24 21:50:19 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p002_13387.trc:
ORA-00600: internal error code, arguments: [kcfrbd_3], [1], [2293759], [1], [0], [64000], [], []
Reread of rdba: 0x0040d99a (file 1, block 55706) found valid data
Thu Mar 24 21:50:19 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p014_13411.trc:
ORA-00600: internal error code, arguments: [kcbrapply_12], [], [], [], [], [], [], []
Thu Mar 24 21:50:19 2016
Reread of rdba: 0x0040f07a (file 1, block 61562) found valid data
Thu Mar 24 21:50:19 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p001_13385.trc:
ORA-00600: internal error code, arguments: [kcbrapply_12], [], [], [], [], [], [], []
Thu Mar 24 21:50:23 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p014_13411.trc:
ORA-07445: exception encountered: core dump [kcbs_dump_adv_state()+450] [SIGSEGV] [Address not mapped to object] [0xB9782BF4] [] []
ORA-00600: internal error code, arguments: [kcbrapply_12], [], [], [], [], [], [], []
Thu Mar 24 21:50:23 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p006_13395.trc:
ORA-07445: exception encountered: core dump [kcbs_dump_adv_state()+450] [SIGSEGV] [Address not mapped to object] [0xB9C82BF4] [] []
ORA-00600: internal error code, arguments: [kcfrbd_3], [1], [3342335], [1], [0], [64000], [], []
Thu Mar 24 21:50:23 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p009_13401.trc:
ORA-07445: exception encountered: core dump [kcbs_dump_adv_state()+450] [SIGSEGV] [Address not mapped to object] [0xB9A02BF4] [] []
ORA-00600: internal error code, arguments: [kcbrapply_4], [3], [], [], [], [], [], []
Thu Mar 24 21:50:23 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p003_13389.trc:
ORA-07445: exception encountered: core dump [kcbs_dump_adv_state()+450] [SIGSEGV] [Address not mapped to object] [0xB9F02AF4] [] []
ORA-00600: internal error code, arguments: [kcfrbd_3], [1], [3932159], [1], [0], [64000], [], []
Thu Mar 24 21:50:23 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p004_13391.trc:
ORA-07445: exception encountered: core dump [kcbs_dump_adv_state()+450] [SIGSEGV] [Address not mapped to object] [0xBA182AF4] [] []
ORA-00600: internal error code, arguments: [kcbrapply_4], [2], [], [], [], [], [], []
Thu Mar 24 21:50:23 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p010_13403.trc:
ORA-07445: exception encountered: core dump [kcbs_dump_adv_state()+450] [SIGSEGV] [Address not mapped to object] [0xBA402AF4] [] []
ORA-00600: internal error code, arguments: [kcbrapply_4], [0], [], [], [], [], [], []
Thu Mar 24 21:50:23 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p000_13383.trc:
ORA-07445: exception encountered: core dump [kcbs_dump_adv_state()+450] [SIGSEGV] [Address not mapped to object] [0xB9282AF4] [] []
ORA-00600: internal error code, arguments: [kcbrapply_4], [0], [], [], [], [], [], []
Thu Mar 24 21:50:23 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p001_13385.trc:
ORA-07445: exception encountered: core dump [kcbs_dump_adv_state()+450] [SIGSEGV] [Address not mapped to object] [0xB9C82AF4] [] []
ORA-00600: internal error code, arguments: [kcbrapply_12], [], [], [], [], [], [], []
Thu Mar 24 21:50:23 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p002_13387.trc:
ORA-10562: Error occurred while applying redo to data block (file# 1, block# 11042)
ORA-10564: tablespace SYSTEM
ORA-01110: data file 1: '/oracle/oradata/orcl/system01.dbf'
ORA-00607: Internal error occurred while making a change to a data block
ORA-00600: internal error code, arguments: [kcfrbd_3], [1], [2293759], [1], [0], [64000], [], []

ALTER DATABASE RECOVER datafile 3 报错
该文件恢复主要报ORA-600 kcbrsearchflist_2,ORA-600 kdxlin:psno out of range,ORA-600 kcbs_dump_adv_state等错误

Thu Mar 24 21:52:04 2016
ALTER DATABASE RECOVER  datafile 3
Thu Mar 24 21:52:04 2016
Media Recovery Start
 parallel recovery started with 15 processes
Thu Mar 24 21:52:04 2016
Recovery of Online Redo Log: Thread 1 Group 1 Seq 15050 Reading mem 0
  Mem# 0 errs 0: /oracle/oradata/orcl/redo01.log
Thu Mar 24 21:52:05 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p007_13462.trc:
ORA-00600: internal error code, arguments: [kdxlin:psno out of range], [], [], [], [], [], [], []
Thu Mar 24 21:52:05 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p001_13450.trc:
ORA-00600: internal error code, arguments: [kcbrsearchflist_2], [], [], [], [], [], [], []
Thu Mar 24 21:52:05 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p007_13462.trc:
ORA-07445: exception encountered: core dump [kcbs_dump_adv_state()+464] [SIGSEGV] [Address not mapped to object] [0xB9F076F4] [] []
ORA-00600: internal error code, arguments: [kdxlin:psno out of range], [], [], [], [], [], [], []
Thu Mar 24 21:52:05 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p001_13450.trc:
ORA-07445: exception encountered: core dump [kcbs_dump_adv_state()+464] [SIGSEGV] [Address not mapped to object] [0xB9C874F4] [] []
ORA-00600: internal error code, arguments: [kcbrsearchflist_2], [], [], [], [], [], [], []
Thu Mar 24 21:52:06 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p007_13462.trc:
ORA-07445: exception encountered: core dump [kcbs_dump_adv_state()+464] [SIGSEGV] [Address not mapped to object] [0xB9F066F4] [] []
ORA-07445: exception encountered: core dump [kcbs_dump_adv_state()+464] [SIGSEGV] [Address not mapped to object] [0xB9F076F4] [] []
ORA-00600: internal error code, arguments: [kdxlin:psno out of range], [], [], [], [], [], [], []
Thu Mar 24 21:52:06 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p001_13450.trc:
ORA-07445: exception encountered: core dump [kcbs_dump_adv_state()+464] [SIGSEGV] [Address not mapped to object] [0xB9C864F4] [] []
ORA-07445: exception encountered: core dump [kcbs_dump_adv_state()+464] [SIGSEGV] [Address not mapped to object] [0xB9C874F4] [] []
ORA-00600: internal error code, arguments: [kcbrsearchflist_2], [], [], [], [], [], [], []

恢复过程

SQL> startup mount
ORACLE instance started.
Total System Global Area 2147483648 bytes
Fixed Size          1220432 bytes
Variable Size             369098928 bytes
Database Buffers         1761607680 bytes
Redo Buffers               15556608 bytes
Database mounted.
SQL> select file# from v$datafile;
     FILE#
----------
         1
         2
         3
         4
         5
         6
6 rows selected.
SQL> recover datafile 1;
ORA-03113: end-of-file on communication channel
SQL> startup mount
ORACLE instance started.
Total System Global Area 2147483648 bytes
Fixed Size          1220432 bytes
Variable Size             369098928 bytes
Database Buffers         1761607680 bytes
Redo Buffers               15556608 bytes
Database mounted.
SQL> recover datafile 3;
ORA-03113: end-of-file on communication channel
SQL> startup mount
ORACLE instance started.
Total System Global Area 2147483648 bytes
Fixed Size          1220432 bytes
Variable Size             369098928 bytes
Database Buffers         1761607680 bytes
Redo Buffers               15556608 bytes
Database mounted.
SQL> recover datafile 5;
Media recovery complete.
SQL> recover datafile 6;
Media recovery complete.
SQL> recover datafile 4;
Media recovery complete.
SQL> recover datafile 2;
Media recovery complete.
SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-00607: Internal error occurred while making a change to a data block
ORA-00600: internal error code, arguments: [4194], [57], [11], [], [], [], [],
[]
SQL> select open_mode from v$database;
OPEN_MODE
----------
READ WRITE

这次运气不错,system坏的是mon_mods$,undo异常可以重建,基本上可以说没有数据丢失,数据库恢复完成.
重要的库,通过open过程报错信息,分析可能的坏块所属对象,然后确定处理方法,以免造成永久性数据块损坏.

Solaris rm datafile recovery—利用句柄误删除数据文件恢复

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:Solaris rm datafile recovery—利用句柄误删除数据文件恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

今天早上接到有客户恢复请求,他一不小心在solaris系统中使用rm -rf oradata命令把一个分区下面的所有数据文件全部删除了。现在不知道怎么办,请求我们给予支持.上去检查发现比较幸运,数据库没有crash,看来直接考虑使用句柄进行恢复

root@CNISORCLSVR # uname -a
SunOS CNISORCLSVR 5.9 Generic_112233-08 sun4u sparc SUNW,Sun-Fire-880
root@CNISORCLSVR # ps -ef|grep lgwr
  oracle   597     1  0   Mar 05 ?       17:14 ora_lgwr_xifenfei
    root 28069 28043  0 18:51:17 pts/2    0:00 grep lgwr
root@CNISORCLSVR # ls -ltr
total 189348454
-r--r--r--   1 oracle   dba       657920 Apr 26  2002 12
c---------   1 root     sys       13, 12 Mar 27  2004 8
c---------   1 root     sys       13, 12 Mar 27  2004 10
-rw-r-----   0 oracle   dba      34359730176 Nov 12  2013 291
-rw-r-----   0 oracle   dba      1073750016 Nov 13  2013 293
D---------   1 root     root           0 Mar  5 19:31 11
-rw-r-----   1 oracle   dba         1758 Mar  5 22:04 9
-rw-rw----   1 oracle   dba           24 Mar  5 22:04 13
s---------   0 root     root           0 Mar  8 00:45 14
-rw-r-----   1 oracle   dba      1887444992 Mar 12 03:27 289
-rw-r-----   1 oracle   dba      943726592 Mar 12 11:17 290
-rw-r-----   0 oracle   dba      4294975488 Mar 13 00:09 292
-rw-r-----   0 oracle   dba      268443648 Mar 13 01:33 288
-rw-r-----   0 oracle   dba      536879104 Mar 13 01:33 279
-rw-r-----   1 oracle   dba      134225920 Mar 13 01:33 278
-rw-r-----   0 oracle   dba      134225920 Mar 13 01:33 269
-rw-r-----   1 oracle   dba      268443648 Mar 13 01:33 267
-rw-r-----   1 oracle   dba      148119552 Mar 13 01:33 266
-rw-r-----   1 oracle   dba      10493952 Mar 13 01:33 265
-rw-r-----   1 oracle   dba      26222592 Mar 13 01:33 264
-rw-r-----   1 oracle   dba      62922752 Mar 13 01:33 263
-rw-r-----   1 oracle   dba      20979712 Mar 13 01:33 262
-rw-r-----   0 oracle   dba      134225920 Mar 13 01:33 287
-rw-r-----   1 oracle   dba      209723392 Mar 13 01:33 285
-rw-r-----   0 oracle   dba      536879104 Mar 13 01:33 283
-rw-r-----   1 oracle   dba      67117056 Mar 13 01:33 282
-rw-r-----   0 oracle   dba      536879104 Mar 13 01:33 281
-rw-r-----   0 oracle   dba      536879104 Mar 13 01:33 280
-rw-r-----   0 oracle   dba      536879104 Mar 13 01:33 276
-rw-r-----   0 oracle   dba      1073750016 Mar 13 01:33 275
-rw-r-----   0 oracle   dba      2214600704 Mar 13 01:33 274
-rw-r-----   0 oracle   dba      134225920 Mar 13 01:33 273
-rw-r-----   0 oracle   dba      536879104 Mar 13 01:33 272
c---------   1 root     sys       13,  2 Mar 13 02:00 5
c---------   1 root     sys       13,  2 Mar 13 02:00 4
c---------   1 root     sys       13,  2 Mar 13 02:00 3
c---------   1 root     sys       13,  2 Mar 13 02:00 2
c---------   1 root     sys       13,  2 Mar 13 02:00 1
c---------   1 root     sys       13,  2 Mar 13 02:00 0
--w-------   1 oracle   dba      4640842 Mar 13 04:43 7
--w-------   1 oracle   dba      4640842 Mar 13 04:43 6
-rw-r-----   0 oracle   dba      1207967744 Mar 13 18:21 271
-rw-r-----   0 oracle   dba      15929974784 Mar 13 18:39 284
-rw-r-----   0 oracle   dba      134225920 Mar 13 18:45 277
-rw-r-----   0 oracle   dba      2122326016 Mar 13 18:46 286
-rw-r-----   0 oracle   dba      9261031424 Mar 13 18:47 270
-rw-r-----   0 oracle   dba      18253619200 Mar 13 18:47 268
-rw-r-----   1 oracle   dba      134225920 Mar 13 18:51 261
-rw-r-----   1 oracle   dba      524296192 Mar 13 18:51 260
-rw-r-----   1 oracle   dba      104858112 Mar 13 18:52 259
-rw-r-----   1 oracle   dba      1941504 Mar 13 18:52 258
-rw-r-----   1 oracle   dba      1941504 Mar 13 18:52 257
-rw-r-----   1 oracle   dba      1941504 Mar 13 18:52 256
SQL> select file#,name from v$datafile wehre name like '/disk%';
     FILE# NAME
---------- --------------------------------------------------
         9 /disk/oradata/xifenfei/xifenfei.dbf
        10 /disk/oradata/xifenfei/CSSN.dbf
        11 /disk/oradata/xifenfei/NCSSN.dbf
        12 /disk/oradata/xifenfei/CSIC_RDS.dbf
        13 /disk/oradata/xifenfei/CSIC_CSSN.dbf
        14 /disk/oradata/xifenfei/CNIS_I.dbf
        15 /disk/oradata/xifenfei/CNIS.dbf
        16 /disk/oradata/xifenfei/TRSWCM6_CSSN.dbf
        17 /disk/oradata/xifenfei/TRSWCM6_CSSN_PLUGINS.dbf
        18 /disk/oradata/xifenfei/DIGIREF.dbf
        20 /disk/oradata/xifenfei/TRSWCM.dbf
        21 /disk/oradata/xifenfei/TRSWCM52_NSLC.dbf
        22 /disk/oradata/xifenfei/TRSWCM52_PLUGINS_NSLC.dbf
        24 /disk/oradata/xifenfei/TRSWCM_PLUGINS.dbf
        25 /disk/oradata/xifenfei/CNIS_ALL.dbf
        27 /disk/oradata/xifenfei/undotbs01.dbf
        28 /disk/oradata/xifenfei/TRS_IDS02.dbf
        29 /disk/oradata/xifenfei/xdb02.dbf

在solaris中比较郁闷,虽然进入了fd目录,但是无法知道哪些文件句柄是删除,哪些是正常的,因此没有办法,只能使用lsof进一步分析

root@CNISORCLSVR # pkgadd -d lsof-4.80-sol9-sparc-local
The following packages are available:
  1  IBMlsof     lsof
                 (sparc) 4.80
Select package(s) you wish to process (or 'all' to process
all packages). (default: all) [?,??,q]: all
Processing package instance <IBMlsof> from </tmp/lsof-4.80-sol9-sparc-local>
lsof
(sparc) 4.80
Vic Abell
Using </usr/local> as the package base directory.
## Processing package information.
## Processing system information.
## Verifying disk space requirements.
## Checking for conflicts with packages already installed.
The following files are already installed on the system and are being
used by another package:
* /usr/local/bin <attribute change only>
* - conflict with a file which does not belong to any package.
Do you want to install these conflicting files [y,n,?,q] y
## Checking for setuid/setgid programs.
The following files are being installed with setuid and/or setgid
permissions:
 /usr/local/bin/lsof <setgid sys>
 /usr/local/bin/sparcv7/lsof <setgid sys>
 /usr/local/bin/sparcv9/lsof <setgid sys>
Do you want to install these as setuid/setgid files [y,n,?,q] y
Installing lsof as <IBMlsof>
## Installing part 1 of 1.
/usr/local/bin/lsof
/usr/local/bin/sparcv7/lsof
/usr/local/bin/sparcv9/lsof
/usr/local/doc/lsof/00.README.FIRST
/usr/local/doc/lsof/00CREDITS
/usr/local/doc/lsof/00DCACHE
/usr/local/doc/lsof/00DIALECTS
/usr/local/doc/lsof/00DIST
/usr/local/doc/lsof/00FAQ
/usr/local/doc/lsof/00LSOF-L
/usr/local/doc/lsof/00MANIFEST
/usr/local/doc/lsof/00PORTING
/usr/local/doc/lsof/00QUICKSTART
/usr/local/doc/lsof/00README
/usr/local/doc/lsof/00TEST
/usr/local/doc/lsof/00XCONFIG
/usr/local/doc/lsof/lsof.man
/usr/local/man/man8/lsof.8
[ verifying class <none> ]
Installation of <IBMlsof> was successful.
root@CNISORCLSVR # ./lsof -p 597
COMMAND PID   USER   FD   TYPE        DEVICE    SIZE/OFF   NODE NAME
oracle  597 oracle  cwd   VDIR          85,5        2048 106299 /export/home/oracle/app/product/9.2.0/dbs
oracle  597 oracle  txt   VREG          85,5    61272672   2332 /export/home/oracle/app/product/9.2.0/bin/oracle
…………
oracle  597 oracle  260u  VREG          85,5   524296192 106517 /export/home/oracle/oradata/xifenfei/system01.dbf
oracle  597 oracle  261u  VREG          85,5   134225920 106518 /export/home/oracle/oradata/xifenfei/undotbs01.dbf
…………
oracle  597 oracle  268u  VREG        118,70 18253619200  109 /disk (/dev/dsk/c2t600A0B800029CEFA0000036C491B270Bd0s6)
oracle  597 oracle  269u  VREG        118,70   134225920  110 /disk (/dev/dsk/c2t600A0B800029CEFA0000036C491B270Bd0s6)
oracle  597 oracle  270u  VREG        118,70  9261031424  111 /disk (/dev/dsk/c2t600A0B800029CEFA0000036C491B270Bd0s6)
…………
oracle  597 oracle  293u  VREG        118,70  1073750016   14 /disk (/dev/dsk/c2t600A0B800029CEFA0000036C491B270Bd0s6)

到这一步,基本上定位/disk部分是我们需要恢复的数据,从而可以定位到句柄,然后结合数据文件信息,直接使用cp命令恢复出来文件.然后数据库层面recover并且online.

cd /proc/597/fd
cp 269 /disk/oradata/cnisora2/CSSN.dbf
chown oracle:dba /disk/oradata/xifenfei/CSSN.dbf
SQL> recover datafile 10;
ORA-00283: 恢复会话因错误而取消
ORA-01124: 无法恢复数据文件 10 - 文件在使用中或在恢复中
ORA-01110: 数据文件 10: '/disk/oradata/xifenfei/CSSN.dbf'
SQL> alter database datafile 10 offline;
数据库已更改。
SQL> recover datafile 10;
完成介质恢复。
SQL> alter database datafile 10 online;
数据库已更改。

至此基本上恢复完成,万幸是数据库没有crash,遇到此类问题,千万不要盲目关闭数据库.另外数据库备份重于一切

ORA-10562 故障恢复—allow 1 corruption

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:ORA-10562 故障恢复—allow 1 corruption

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

朋友数据库由于存储变动,导致数据库瞬间hang住,然后直接crash,之后无法正常启动,请求技术支持.
数据库报ORA-00600[2131]错误
不能mount,可以通过重建控制文件解决

Mon Nov 30 20:35:38 2015
alter database mount
Mon Nov 30 20:35:38 2015
NOTE: Loaded library: System
Mon Nov 30 20:35:38 2015
SUCCESS: diskgroup DATADG was mounted
Mon Nov 30 20:35:38 2015
NOTE: dependency between database xifenfei and diskgroup resource ora.DATADG.dg is established
Errors in file /u01/app/oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_ora_26450.trc  (incident=3032256):
ORA-00600: internal error code, arguments: [2131], [33], [32], [], [], [], [], [], [], [], [], []
ORA-600 signalled during: alter database mount...

尝试recover数据库

Mon Nov 30 20:45:53 2015
ALTER DATABASE RECOVER  database
Media Recovery Start
 started logmerger process
Parallel Media Recovery started with 80 slaves
Mon Nov 30 20:45:56 2015
Recovery of Online Redo Log: Thread 2 Group 11 Seq 617 Reading mem 0
  Mem# 0: +DATADG/xifenfei/redo011.log
Recovery of Online Redo Log: Thread 1 Group 4 Seq 5410 Reading mem 0
  Mem# 0: +DATADG/xifenfei/redo04.log
Recovery of Online Redo Log: Thread 1 Group 5 Seq 5411 Reading mem 0
  Mem# 0: +DATADG/xifenfei/redo05.log
Mon Nov 30 20:46:07 2015
Recovery of Online Redo Log: Thread 1 Group 6 Seq 5412 Reading mem 0
  Mem# 0: +DATADG/xifenfei/redo06.log
Mon Nov 30 20:46:13 2015
Exception [type: SIGSEGV, Address not mapped to object] [ADDR:0xC] [PC:0x95FB502, kdxlin()+4088] [flags: 0x0, count: 1]
Errors in file /u01/app/oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_pr13_30480.trc  (incident=3032568):
ORA-07445: 出现异常错误: 核心转储 [kdxlin()+4088] [SIGSEGV] [ADDR:0xC] [PC:0x95FB502] [Address not mapped to object] []
Mon Nov 30 20:46:17 2015
Sweep [inc][3032568]: completed
Sweep [inc2][3032568]: completed
Mon Nov 30 20:46:31 2015
Slave exiting with ORA-10562 exception
Errors in file /u01/app/oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_pr13_30480.trc:
ORA-10562: Error occurred while applying redo to data block (file# 2, block# 165054)
ORA-10564: tablespace SYSAUX
ORA-01110: 数据文件 2: '+DATADG/xifenfei/datafile/sysaux.265.861925867'
ORA-10561: block type 'TRANSACTION MANAGED INDEX BLOCK', data object# 271
ORA-00607: 当更改数据块时出现内部错误
ORA-00602: 内部编程异常错误
ORA-07445: 出现异常错误: 核心转储 [kdxlin()+4088] [SIGSEGV] [ADDR:0xC] [PC:0x95FB502] [Address not mapped to object] []
Mon Nov 30 20:46:31 2015
Recovery Slave PR13 previously exited with exception 10562
Mon Nov 30 20:46:33 2015
Checker run found 28 new persistent data failures
Mon Nov 30 20:46:35 2015
Media Recovery failed with error 448
Errors in file /u01/app/oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_pr00_30400.trc:
ORA-00283: 恢复会话因错误而取消
ORA-00448: 后台进程正常结束
ORA-10562 signalled during: ALTER DATABASE RECOVER  database  ...

通过这里可以看到,由于在recover 操作之时,由于某种原因redo的数据无法apply到file 2 block 165054中,导致数据库recover database失败.

按照数据文件recover操作

SQL> recover datafile 1;
Media recovery complete.
SQL> recover datafile 3,4,5,6,7;
Media recovery complete.
SQL> recover datafile 8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28;
Media recovery complete.
SQL> recover datafile 2;
ORA-00283: recovery session canceled due to errors
ORA-10562: Error occurred while applying redo to data block (file# 2, block#
165054)
ORA-10564: tablespace SYSAUX
ORA-01110: data file 2: '+DATADG/xifenfei/datafile/sysaux.265.861925867'
ORA-10561: block type 'TRANSACTION MANAGED INDEX BLOCK', data object# 271
ORA-00607: Internal error occurred while making a change to a data block
ORA-00602: internal programming exception
ORA-07445: exception encountered: core dump [kdxlin()+4088] [SIGSEGV]
[ADDR:0xC] [PC:0x95FB502] [Address not mapped to object] []

错误提示和recover database一样,那我们只能让恢复跳过该block继续恢复,因为根据经验判断data object# 271不是系统核心对象,不会影响数据库的启动

跳过坏块继续恢复

SQL> recover  datafile 2 allow 1 corruption;
ORA-00283: recovery session canceled due to errors
ORA-00600: internal error code, arguments: [3020], [2], [69793], [8458401], [],
[], [], [], [], [], [], []
ORA-10567: Redo is inconsistent with data block (file# 2, block# 69793, file
offset is 571744256 bytes)
ORA-10564: tablespace SYSAUX
ORA-01110: data file 2: '+DATADG/xifenfei/datafile/sysaux.265.861925867'
ORA-10561: block type 'TRANSACTION MANAGED INDEX BLOCK', data object# 272
SQL> recover  datafile 2 allow 1 corruption;
Media recovery complete.
SQL> alter database open;
Database altered.

出现了ORA-600[3020] 继续跳过坏块,然后数据库顺利open,别忘记加tempfile

处理异常对象

SQL> select object_name,object_type from dba_objects where data_object_id in(272,271);
OBJECT_NAME
--------------------------------------------------------------------------------
OBJECT_TYPE
-------------------
SMON_SCN_TIME_TIM_IDX
INDEX
SMON_SCN_TIME_SCN_IDX
INDEX
SQL> select index_name from dba_indexes where table_name='SMON_SCN_TIME';
INDEX_NAME
------------------------------
SMON_SCN_TIME_TIM_IDX
SMON_SCN_TIME_SCN_IDX
SQL> set pages 1000
SQL> set long 1000
SQL> Select dbms_metadata.get_ddl('TABLE','SMON_SCN_TIME','SYS') FROM DUAL ;
DBMS_METADATA.GET_DDL('TABLE','SMON_SCN_TIME','SYS')
--------------------------------------------------------------------------------
  CREATE TABLE "SYS"."SMON_SCN_TIME"
   (    "THREAD" NUMBER,
        "TIME_MP" NUMBER,
        "TIME_DP" DATE,
        "SCN_WRP" NUMBER,
        "SCN_BAS" NUMBER,
        "NUM_MAPPINGS" NUMBER,
        "TIM_SCN_MAP" RAW(1200),
        "SCN" NUMBER DEFAULT 0,
        "ORIG_THREAD" NUMBER DEFAULT 0           /* for downgrade */
   ) CLUSTER "SYS"."SMON_SCN_TO_TIME_AUX" ("THREAD")
SQL> analyze table smon_scn_time validate structure cascade online;
analyze table smon_scn_time validate structure cascade online
*
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 2, block # 165054)
ORA-01110: data file 2: '+DATADG/xifenfei/datafile/sysaux.265.861925867'
SQL> truncate CLUSTER "SYS"."SMON_SCN_TO_TIME_AUX";
Cluster truncated.

关于SMON_SCN_TIME部分处理,可以参考:关于SMON_SCN_TIME若干问题说明.至此数据库基本上恢复完成,而且运气非常好,恢复的非常完美,数据实现0丢失.

ORA-01122 ORA-01210 故障恢复

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:ORA-01122 ORA-01210 故障恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

有朋友数据文件头出现错误ORA-01122和ORA-01210等错误,数据库无法正常open。
ORA-01210


因为平台是win,他们找我咨询win bbed,因为回老家电脑没有带,无法提供win的bbed.我通过dd部分文件头,然后在linux平台分析发现是该文件的文件头block大量坏块

bbed分析坏块情况

BBED> show all
        FILE#           0
        BLOCK#          1
        OFFSET          0
        DBA             0x00000000 (0 0,1)
        FILENAME        /tmp/30.dbf
        BIFILE          bifile.bbd
        LISTFILE
        BLOCKSIZE       8192
        MODE            Browse
        EDIT            Unrecoverable
        IBASE           Dec
        OBASE           Dec
        WIDTH           80
        COUNT           512
        LOGFILE         log.bbd
        SPOOL           No
BBED> set count 64
        COUNT           64
BBED> map
 File: /tmp/30.dbf (0)
 Block: 1                                     Dba:0x00000000
------------------------------------------------------------
BBED-00400: invalid blocktype (27)
BBED> d
 File: /tmp/30.dbf (0)
 Block: 1                Offsets:    0 to   63           Dba:0x00000000
------------------------------------------------------------------------
 1ba20000 03004400 bffd8a1d 0000000c acba0000 008f4500 00003455 fc020000
 02040000 00000000 00008001 04000000 00000000 00000000 949400b4 94514005
 <32 bytes per line>
BBED> set block +1
        BLOCK#          2
BBED> map
 File: /tmp/30.dbf (0)
 Block: 2                                     Dba:0x00000000
------------------------------------------------------------
BBED-00400: invalid blocktype (27)
BBED> d
 File: /tmp/30.dbf (0)
 Block: 2                Offsets:    0 to   63           Dba:0x00000000
------------------------------------------------------------------------
 1ba20000 04004400 bffd8a1d 0000000c a6e00000 008f4500 00003455 fc020000
 0204e81f 00000000 0000241e 05000000 00000000 00000000 11fc297f b426fe2b
 <32 bytes per line>
BBED> set block +1
        BLOCK#          3
BBED> d
 File: /tmp/30.dbf (0)
 Block: 3                Offsets:    0 to   63           Dba:0x00000000
------------------------------------------------------------------------
 1ba20000 05004400 bffd8a1d 0000000c 780a0000 008f4500 00003455 fc020000
 0204e81f 00000000 0000c001 06000000 00000000 00000000 2969a0d2 d30168a2
 <32 bytes per line>
BBED> set block +1
        BLOCK#          4
BBED> d
 File: /tmp/30.dbf (0)
 Block: 4                Offsets:    0 to   63           Dba:0x00000000
------------------------------------------------------------------------
 1ba20000 06004400 bffd8a1d 0000000c 6c5a0000 008f4500 00003455 fc020000
 0204e81f 00000000 0000f81d 07000000 00000000 00000000 7b51d409 6dc7ca4d
 <32 bytes per line>
BBED> set block +1
        BLOCK#          5
BBED> d
 File: /tmp/30.dbf (0)
 Block: 5                Offsets:    0 to   63           Dba:0x00000000
------------------------------------------------------------------------
 1ba20000 07004400 bffd8a1d 0000000c c5600000 008f4500 00003455 fc020000
 02040000 00000000 0000c001 08000000 00000000 00000000 14514005 25145200
 <32 bytes per line>
BBED> set block +1
        BLOCK#          6
BBED> d
 File: /tmp/30.dbf (0)
 Block: 6                Offsets:    0 to   63           Dba:0x00000000
------------------------------------------------------------------------
 1ba20000 08004400 bffd8a1d 0000000c 60480000 008f4500 00003455 fc020000
 0204e81f 00000000 0000c301 09000000 00000000 00000000 c2a1606a 7615130a
 <32 bytes per line>
BBED> set block +1
        BLOCK#          7
BBED> d
 File: /tmp/30.dbf (0)
 Block: 7                Offsets:    0 to   63           Dba:0x00000000
------------------------------------------------------------------------
 1ba20000 09004400 bffd8a1d 0000000c e3430000 008f4500 00003455 fc020000
 0204e81f 00000000 00000002 0a000000 00000000 00000000 00a28a28 00a28a28
 <32 bytes per line>
BBED> set block +1
        BLOCK#          8
BBED> d
 File: /tmp/30.dbf (0)
 Block: 8                Offsets:    0 to   63           Dba:0x00000000
------------------------------------------------------------------------
 1ba20000 0a004400 07fe8a1d 0000000c fc000000 008f4500 00003455 fc020000
 0205e81f 00000000 0000f41d 00000000 00000000 00000000 ffd8ffe0 00104a46
 <32 bytes per line>
BBED> set block +1
        BLOCK#          9
BBED> d
 File: /tmp/30.dbf (0)
 Block: 9                Offsets:    0 to   63           Dba:0x00000000
------------------------------------------------------------------------
 1ba20000 0b004400 07fe8a1d 0000000c 48da0000 008f4500 00003455 fc020000
 0205e81f 00000000 0000c601 01000000 00000000 00000000 b47d69d3 7fa96a6f
 <32 bytes per line>
BBED> set block +1
        BLOCK#          10
BBED> d
 File: /tmp/30.dbf (0)
 Block: 10               Offsets:    0 to   63           Dba:0x00000000
------------------------------------------------------------------------
 1ba20000 0c004400 07fe8a1d 0000000c be0f0000 008f4500 00003455 fc020000
 0205e81f 00000000 0000181d 02000000 00000000 00000000 9de3e868 4782d83a
 <32 bytes per line>
BBED> set block +1
        BLOCK#          11
BBED> d
 File: /tmp/30.dbf (0)
 Block: 11               Offsets:    0 to   63           Dba:0x00000000
------------------------------------------------------------------------
 1ba20000 0d004400 07fe8a1d 0000000c 9cd00000 008f4500 00003455 fc020000
 0205e81f 00000000 0000241e 03000000 00000000 00000000 dead1259 5919e385
 <32 bytes per line>
BBED> set block +1
        BLOCK#          12
BBED> d
 File: /tmp/30.dbf (0)
 Block: 12               Offsets:    0 to   63           Dba:0x00000000
------------------------------------------------------------------------
 1ba20000 0e004400 07fe8a1d 0000000c df450000 008f4500 00003455 fc020000
 0205e81f 00000000 00004001 04000000 00000000 00000000 31d9a292 9698828a
 <32 bytes per line>
BBED> set block +1
        BLOCK#          13
BBED> d
 File: /tmp/30.dbf (0)
 Block: 13               Offsets:    0 to   63           Dba:0x00000000
------------------------------------------------------------------------
 1ba20000 0f004400 07fe8a1d 0000000c 18790000 008f4500 00003455 fc020000
 02050000 00000000 00000002 05000000 00000000 00000000 b93f8235 5ea063b7
 <32 bytes per line>

拿block 1的rdba(04004400–倒序存储)分析[win文件拷贝到linux后使用bbed查看相差1 block]可以的出来block信息为file=1, block=262148,明显错误.

通过dul分析文件头损坏情况

Data UnLoader: 10.2.0.6.9 - Internal Only - on Tue Sep 29 22:15:22 2015
with 64-bit io functions
Copyright (c) 1994 2015 Bernard van Duijnen All rights reserved.
 Strictly Oracle Internal Use Only
DUL: Warning: Recreating file "dul.log"
Reading SCANNEDLOBPAGE.dat 1204 entries loaded and sorted 1204 entries
Reading SEG.dat 0 entries loaded
Reading EXT.dat 44 entries loaded and sorted 44 entries
Reading COMPATSEG.dat 0 entries loaded
DUL: Warning: Wrong DBA  0X00440004 (file=1, block=262148) (Ignored)
DUL: Error: While processing file# 30 block# 1
DUL: Warning: Found mismatch while checking file E:\TEMP\shebao\30.dbf
DUL: Warning: DUL osd_parameter or control.dul configuration error
DUL: Warning: Given file number(30) in control file does not match file# in dba(1)
DUL: Warning: Wrong DBA  0X00440004 (file=1, block=262148) (Ignored)
DUL: Error: While processing file# 30 block# 1
DUL>

通过bbed和dul证明文件头大量损坏,而且尚未有任何该文件的物理备份,因此恢复起来难道较大。

分析Oracle Database Recovery Check Result
通过对Oracle数据库异常恢复检查脚本(Oracle Database Recovery Check)的分析结果,我们意外的发现,人品不错,发现异常的文件创建时间为2015-09-26 19:39:33,进一步和客户沟通,这个文件存储为图片,少量丢失可以允许,优先恢复业务
Oracle-Database-Recovery-Check


有了这个结论,那处理起来就easy了,直接offline异常文件,然后分析丢失的表
从而确定时lob字典的少量extent数据分配到了file 30上
lob


为了避免查询对应lob之时出现错误,通过update 对应lob为空规避该问题

create table corrupt_lobs (corrupt_rowid rowid,table_name varchar2(100));
declare
  n number;
begin
  for cursor_lob in (select rowid r, xff_lob from xff.t_xifenfei) loop
  begin
    n:=dbms_lob.instr(cursor_lob.xff_lob,hextoraw('889911'));
  exception
    when others then
      insert into corrupt_lobs values (cursor_lob.r,'xff.t_xifenfei');
      commit;
    end;
  end loop;
end;
/
update xff.t_xifenfei
     set xff_lob = empty_blob()
     where rowid in (select corrupted_rowid from corrupt_lobs);

本次恢复是由于运气好,遇到异常文件刚好是最近加入,而且都是图片,客户允许少量丢失,如果是不允许丢失的数据文件,可能需要通过找历史的该文件的备份(Oracle 12C的第一次异常恢复—文件头坏块),在某些情况下,如果也没有此类备份,只能通过bbed重构block 1(如果有其他异常块一次处理,如果太多无法处理,最少也需要重构block 1),然后尝试open数据库或者使用dul之类工具处理(因为文件头损坏,工具可能不能识别文件无法恢复)

ORA-600 k2vcbk_2 故障恢复

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:ORA-600 k2vcbk_2 故障恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

有朋友找到我说他们数据库无法启动,数据库启动报ORA-600[k2vcbk_2]错误,数据库版本为11.2.0.2 RAC,操作系统是AIX 6.1

SQL> recover database;
Media recovery complete.
SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [k2vcbk_2], [], [], [], [], [], [],
[], [], [], [], []
Process ID: 7930020
Session ID: 49 Serial number: 14761

数据库节点1日志

Mon Sep 21 15:45:41 2015
Thread 1 advanced to log sequence 54076 (LGWR switch)
  Current log# 13 seq# 54076 mem# 0: +DG01/xifenfei/onlinelog/group_13.332.779459035
  Current log# 13 seq# 54076 mem# 1: +DG01/xifenfei/onlinelog/group_13.344.779582621
Mon Sep 21 15:45:44 2015
Archived Log entry 74655 added for thread 1 sequence 54075 ID 0x5a0bc0e1 dest 1:
Mon Sep 21 15:56:18 2015
Errors in file /oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_ora_18088342.trc  (incident=184348):
ORA-00600: 内部错误代码, 参数: [kturPOTS_0], [], [], [], [], [], [], [], [], [], [], []
Incident details in: /oracle/diag/rdbms/xifenfei/xifenfei1/incident/incdir_184348/xifenfei1_ora_18088342_i184348.trc
Mon Sep 21 15:56:34 2015
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Error 600 trapped in 2PC on transaction 7.16.120119. Cleaning up.
Error stack returned to user:
ORA-00600: 内部错误代码, 参数: [kturPOTS_0], [], [], [], [], [], [], [], [], [], [], []
Errors in file /oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_ora_18088342.trc  (incident=184349):
ORA-00603: ORACLE 服务器会话因致命错误而终止
ORA-00600: 内部错误代码, 参数: [kturPOTS_0], [], [], [], [], [], [], [], [], [], [], []
Mon Sep 21 15:56:34 2015
Dumping diagnostic data in directory=[cdmp_20150921155634], requested by (instance=1, osid=18088342), summary=[incident=184348].
Incident details in: /oracle/diag/rdbms/xifenfei/xifenfei1/incident/incdir_184349/xifenfei1_ora_18088342_i184349.trc
Mon Sep 21 15:56:35 2015
Sweep [inc][184349]: completed
Sweep [inc][184348]: completed
Sweep [inc2][184348]: completed
opiodr aborting process unknown ospid (18088342) as a result of ORA-603
Mon Sep 21 15:57:12 2015
Errors in file /oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_smon_7536810.trc  (incident=184274):
ORA-00600: internal error code, arguments: [k2vcbk_2], [], [], [], [], [], [], [], [], [], [], []
Incident details in: /oracle/diag/rdbms/xifenfei/xifenfei1/incident/incdir_184274/xifenfei1_smon_7536810_i184274.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Mon Sep 21 15:57:16 2015
Dumping diagnostic data in directory=[cdmp_20150921155716], requested by (instance=1, osid=7536810 (SMON)), summary=[incident=184274].
Fatal internal error happened while SMON was doing active transaction recovery.
Errors in file /oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_smon_7536810.trc:
ORA-00600: internal error code, arguments: [k2vcbk_2], [], [], [], [], [], [], [], [], [], [], []
SMON (ospid: 7536810): terminating the instance due to error 474
Mon Sep 21 15:57:18 2015
ORA-1092 : opitsk aborting process

数据库节点2日志

Mon Sep 21 15:21:50 2015
Archived Log entry 74653 added for thread 2 sequence 23559 ID 0x5a0bc0e1 dest 1:
Mon Sep 21 15:44:28 2015
Thread 2 advanced to log sequence 23561 (LGWR switch)
  Current log# 12 seq# 23561 mem# 0: +DG01/xifenfei/onlinelog/group_12.338.779457003
  Current log# 12 seq# 23561 mem# 1: +DG01/xifenfei/onlinelog/group_12.265.779582493
Mon Sep 21 15:44:31 2015
Archived Log entry 74654 added for thread 2 sequence 23560 ID 0x5a0bc0e1 dest 1:
Mon Sep 21 15:45:31 2015
DISTRIB TRAN xifenfei.1ebab0a5.20.3.1533822
  is local tran 20.3.1533822 (hex=14.03.17677e)
  insert pending committed tran, scn=14590688068086 (hex=d45.28c781f6)
Mon Sep 21 15:45:31 2015
DISTRIB TRAN xifenfei.1ebab0a5.20.3.1533822
  is local tran 20.3.1533822 (hex=14.03.17677e))
  delete pending committed tran, scn=14590688068086 (hex=d45.28c781f6)
Mon Sep 21 15:56:35 2015
Dumping diagnostic data in directory=[cdmp_20150921155634], requested by (instance=1, osid=18088342), summary=[incident=184348].
Mon Sep 21 15:57:10 2015
Error 3135 trapped in 2PC on transaction 20.11.1534704. Cleaning up.
Error stack returned to user:
ORA-03135: 连接失去联系
opidcl aborting process unknown ospid (9175532) as a result of ORA-604
Mon Sep 21 15:57:17 2015
Dumping diagnostic data in directory=[cdmp_20150921155716], requested by (instance=1, osid=7536810 (SMON)), summary=[incident=184274].
Mon Sep 21 15:57:23 2015
Reconfiguration started (old inc 18, new inc 20)
List of instances:
 2 (myinst: 2)
 Global Resource Directory frozen
 * dead instance detected - domain 0 invalid = TRUE
 Communication channels reestablished
 Master broadcasted resource hash value bitmaps
 Non-local Process blocks cleaned out
Mon Sep 21 15:57:23 2015
 LMS 2: 3 GCS shadows cancelled, 1 closed, 0 Xw survived
Mon Sep 21 15:57:23 2015
 LMS 0: 2 GCS shadows cancelled, 0 closed, 0 Xw survived
Mon Sep 21 15:57:23 2015
 LMS 1: 3 GCS shadows cancelled, 1 closed, 0 Xw survived
 Set master node info
 Submitted all remote-enqueue requests
 Dwn-cvts replayed, VALBLKs dubious
 All grantable enqueues granted
 Post SMON to start 1st pass IR
Mon Sep 21 15:57:23 2015
minact-scn: Inst 2 is now the master inc#:20 mmon proc-id:6816208 status:0x7
minact-scn status: grec-scn:0x0000.00000000 gmin-scn:0x0d45.28c2bb5c gcalc-scn:0x0d45.28c3bd2e
minact-scn: master found reconf/inst-rec before recscn scan old-inc#:20 new-inc#:20
Mon Sep 21 15:57:23 2015
Instance recovery: looking for dead threads
 Submitted all GCS remote-cache requests
 Post SMON to start 1st pass IR
 Fix write in gcs resources
Reconfiguration complete
Beginning instance recovery of 1 threads
 parallel recovery started with 31 processes
Started redo scan
Completed redo scan
 read 12626 KB redo, 1724 data blocks need recovery
Started redo application at
 Thread 1: logseq 54076, block 184416
Recovery of Online Redo Log: Thread 1 Group 13 Seq 54076 Reading mem 0
  Mem# 0: +DG01/xifenfei/onlinelog/group_13.332.779459035
  Mem# 1: +DG01/xifenfei/onlinelog/group_13.344.779582621
Completed redo application of 9.78MB
Completed instance recovery at
 Thread 1: logseq 54076, block 209669, scn 14590688357285
 1633 data blocks read, 1794 data blocks written, 12626 redo k-bytes read
Thread 1 advanced to log sequence 54077 (thread recovery)
Mon Sep 21 15:57:33 2015
Error 3113 trapped in 2PC on transaction 21.18.1965522. Cleaning up.
Redo thread 1 internally disabled at seq 54077 (SMON)
Error stack returned to user:
ORA-02050: 事务处理 21.18.1965522 已回退, 某些远程数据库可能有问题
ORA-03113: 通信通道的文件结尾
ORA-02063: 紧接着 line (起自 ZSK)
Mon Sep 21 15:57:34 2015
Archived Log entry 74656 added for thread 1 sequence 54076 ID 0x5a0bc0e1 dest 1:
Mon Sep 21 15:57:34 2015
ARC0: Archiving disabled thread 1 sequence 54077
Archived Log entry 74657 added for thread 1 sequence 54077 ID 0x5a0bc0e1 dest 1:
Mon Sep 21 15:57:35 2015
Thread 2 advanced to log sequence 23562 (LGWR switch)
  Current log# 8 seq# 23562 mem# 0: +DG01/xifenfei/onlinelog/group_8.334.779456945
  Current log# 8 seq# 23562 mem# 1: +DG01/xifenfei/onlinelog/group_8.267.779582453
Mon Sep 21 15:57:36 2015
Errors in file /oracle/diag/rdbms/xifenfei/xifenfei2/trace/xifenfei2_smon_6750672.trc  (incident=200218):
ORA-00600: internal error code, arguments: [k2vcbk_2], [], [], [], [], [], [], [], [], [], [], []
Incident details in: /oracle/diag/rdbms/xifenfei/xifenfei2/incident/incdir_200218/xifenfei2_smon_6750672_i200218.trc
Archived Log entry 74658 added for thread 2 sequence 23561 ID 0x5a0bc0e1 dest 1:
Mon Sep 21 15:57:38 2015
minact-scn: master continuing after IR
Mon Sep 21 15:57:41 2015
Dumping diagnostic data in directory=[cdmp_20150921155741], requested by (instance=2, osid=6750672 (SMON)), summary=[incident=200218].
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Fatal internal error happened while SMON was doing instance transaction recovery.
Errors in file /oracle/diag/rdbms/xifenfei/xifenfei2/trace/xifenfei2_smon_6750672.trc:
ORA-00600: internal error code, arguments: [k2vcbk_2], [], [], [], [], [], [], [], [], [], [], []
SMON (ospid: 6750672): terminating the instance due to error 474
Mon Sep 21 15:57:41 2015
ORA-1092 : opitsk aborting process
Mon Sep 21 15:57:42 2015
ORA-1092 : opitsk aborting process
Mon Sep 21 15:57:42 2015
License high water mark = 291
Instance terminated by SMON, pid = 6750672
USER (ospid: 18874814): terminating the instance

通过数据库日志大概可以看出来,由于节点2的分布式事事务异常,而在11.2.0.2中,分布式事务跨节点,引起节点2的pmon清理异常事务,但是由于bug,使得异常事务无法被清理掉,从而引起节点1 crash,节点1 crash之后节点2进行恢复,也因为分布式事务bug,导致smon回滚失败,实例也crash。无法进行回滚导致数据库无法正常启动,通过查询mos发现定位到是Bug 10222544 ORA-600 [k2vpci_2] from multi-branch distributed transaction
ORA-600-k2vpci_2


对于这类问题,由于分布事务无法清理,处理方法就是找出来事务人工提交或者直接屏蔽掉这个事务解决该问题

system01.dbf文件被offline,导致数据库报ORA-01245 ORA-01110故障恢复

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:system01.dbf文件被offline,导致数据库报ORA-01245 ORA-01110故障恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

有朋友找到我,说数据库做recover报ORA-01245和ORA-01110错误,无法继续恢复,请求支持

SQL> recover database using backup controlfile until cancel;
…………
第 1 行出现错误:
ORA-01245: RESETLOGS 完成时脱机文件 1 将丢失
ORA-01110: 数据文件 1: 'E:\APP\ADMINISTRATOR\ORADATA\HXV10\SYSTEM01.DBF'

通过Oracle Database Recovery Check检查数据库情况,发现datafile 1处于offline状态
oracle_recovery_check


Wed Aug 26 23:11:00 2015
alter database datafile 1 offline drop
Completed: alter database datafile 1 offline drop

从这里基本上可以知道为什么出现ORA-01245错误了,由于system表空间中文件被offline导致.

redo信息
oracle_recovery_check_redo

Mon Aug 24 22:38:35 2015
alter database clear unarchived logfile group 2
Clearing online log 2 of thread 1 sequence number 5705
Completed: alter database clear unarchived logfile group 2
Wed Aug 26 23:13:23 2015
alter database clear logfile group 3
Clearing online log 3 of thread 1 sequence number 5706
Completed: alter database clear logfile group 3

除当前redo之外,其他redo被clear

尝试恢复

SQL> alter database datafile 1 online;
数据库已更改。
SQL> recover database;
ORA-00283: 恢复会话因错误而取消
ORA-01610: 使用 BACKUP CONTROLFILE 选项的恢复必须已完成
SQL> recover database using backup controlfile;
ORA-00279: 更改 63960710 (在 08/23/2015 17:01:25 生成) 对于线程 1 是必需的
ORA-00289: 建议:
E:\APP\ADMINISTRATOR\FLASH_RECOVERY_AREA\HXV10\ARCHIVELOG\2015_08_27\O1_MF_1_570
5_%U_.ARC
ORA-00280: 更改 63960710 (用于线程 1) 在序列 #5705 中
指定日志: {<RET>=suggested | filename | AUTO | CANCEL}
E:\APP\ADMINISTRATOR\ORADATA\HXV10\REDO03.LOG
ORA-00310: 归档日志包含序列 5706; 要求序列 5705
ORA-00334: 归档日志: 'E:\APP\ADMINISTRATOR\ORADATA\HXV10\REDO03.LOG'
SQL> recover database using backup controlfile;
ORA-00279: 更改 63960710 (在 08/23/2015 17:01:25 生成) 对于线程 1 是必需的
ORA-00289: 建议:
E:\APP\ADMINISTRATOR\FLASH_RECOVERY_AREA\HXV10\ARCHIVELOG\2015_08_27\O1_MF_1_570
5_%U_.ARC
ORA-00280: 更改 63960710 (用于线程 1) 在序列 #5705 中
指定日志: {<RET>=suggested | filename | AUTO | CANCEL}
E:\APP\ADMINISTRATOR\ORADATA\HXV10\REDO02.LOG
ORA-00339: 归档日志未包含任何重做
ORA-00334: 归档日志: 'E:\APP\ADMINISTRATOR\ORADATA\HXV10\REDO02.LOG'
SQL> recover database using backup controlfile;
ORA-00279: 更改 63960710 (在 08/23/2015 17:01:25 生成) 对于线程 1 是必需的
ORA-00289: 建议:
E:\APP\ADMINISTRATOR\FLASH_RECOVERY_AREA\HXV10\ARCHIVELOG\2015_08_27\O1_MF_1_570
5_%U_.ARC
ORA-00280: 更改 63960710 (用于线程 1) 在序列 #5705 中
指定日志: {<RET>=suggested | filename | AUTO | CANCEL}
E:\APP\ADMINISTRATOR\ORADATA\HXV10\REDO01.LOG
ORA-00310: 归档日志包含序列 5707; 要求序列 5705
ORA-00334: 归档日志: 'E:\APP\ADMINISTRATOR\ORADATA\HXV10\REDO01.LOG'

数据库做恢复需要seq 5705的redo,但是redo已经被clear,导致现在数据库常规手段无法恢复,只用使用隐含参数屏蔽数据库前滚(一致性检查)

再次尝试打开数据库

ORACLE 例程已经启动。
Total System Global Area  778387456 bytes
Fixed Size                  1374808 bytes
Variable Size             486540712 bytes
Database Buffers          285212672 bytes
Redo Buffers                5259264 bytes
数据库装载完毕。
SQL> recover database using backup controlfile;
ORA-00279: 更改 63960710 (在 08/23/2015 17:01:25 生成) 对于线程 1 是必需的
ORA-00289: 建议:
E:\APP\ADMINISTRATOR\FLASH_RECOVERY_AREA\HXV10\ARCHIVELOG\2015_08_27\O1_MF_1_570
5_%U_.ARC
ORA-00280: 更改 63960710 (用于线程 1) 在序列 #5705 中
指定日志: {<RET>=suggested | filename | AUTO | CANCEL}
cancel
介质恢复已取消。
SQL> alter database open resetlogs;
数据库已更改。

在数据库恢复中,请不要对system表空间数据文件进行offline操作,如果对此类文件进行offline操作,讲在数据库恢复过程中出现ORA-01245和ORA-01110错误,而且文件还会出现SYSOFF状态

aix平台 ORA-01115 ORA-01110 ORA-27067 故障恢复

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:aix平台 ORA-01115 ORA-01110 ORA-27067 故障恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

接到朋友恢复请求,aix 5.3,Oracle 10.2.0.1平台,数据库启动报ORA-01115 ORA-01110 ORA-27067错误,数据库无法正常打开,通过分析,是由于10201在aix上面的bug导致,通过技巧规避,完美解决给问题,数据0丢失
数据库报错alert日志

Mon Aug 10 13:25:22 2015
ALTER DATABASE   MOUNT
Mon Aug 10 13:25:29 2015
Setting recovery target incarnation to 1
Mon Aug 10 13:25:29 2015
Successful mount of redo thread 1, with mount id 432339141
Mon Aug 10 13:25:29 2015
Database mounted in Exclusive Mode
Completed: ALTER DATABASE   MOUNT
Mon Aug 10 13:25:36 2015
alter database open
Mon Aug 10 13:25:36 2015
Beginning crash recovery of 1 threads
 parallel recovery started with 15 processes
Mon Aug 10 13:25:37 2015
Started redo scan
Mon Aug 10 13:25:52 2015
Completed redo scan
 7889582 redo blocks read, 75305 data blocks need recovery
Mon Aug 10 13:25:53 2015
Errors in file /dc/admin/datacent/bdump/datacent_p002_144124.trc:
ORA-01115: IO error reading block from file 2 (block # 40704)
ORA-01110: data file 2: '/dc/oradata/datacent/undotbs01.dbf'
ORA-27067: size of I/O buffer is invalid
Additional information: 2
Additional information: 1572864
Mon Aug 10 13:25:53 2015
Aborting crash recovery due to slave death, attempting serial crash recovery
Mon Aug 10 13:25:53 2015
Beginning crash recovery of 1 threads
Mon Aug 10 13:25:53 2015
Started redo scan
Mon Aug 10 13:26:09 2015
Completed redo scan
 7889582 redo blocks read, 75305 data blocks need recovery
Mon Aug 10 13:26:12 2015
Aborting crash recovery due to error 1115
Mon Aug 10 13:26:12 2015
Errors in file /dc/admin/datacent/udump/datacent_ora_123384.trc:
ORA-01115: IO error reading block from file 2 (block # 39077)
ORA-01110: data file 2: '/dc/oradata/datacent/undotbs01.dbf'
ORA-27067: size of I/O buffer is invalid
Additional information: 2
Additional information: 1310720
ORA-1115 signalled during: alter database open...

这里报的前面两个错误ORA-01115 ORA-01110我们都非常熟悉,类似数据库启动遇到坏块或者io错误之时可能就会报如此错误。但是ORA-27067确实不多见,从mos上看,很多是由于rman备份之时的bug可能导致该错误。

dbv检测undo坏块文件

DBVERIFY: Release 10.2.0.1.0 - Production on Mon Aug 10 23:18:15 2015
Copyright (c) 1982, 2003, Oracle and/or its affiliates.  All rights reserved.
DBVERIFY - Verification starting : FILE = /dc/oradata/datacent/undotbs01.dbf
DBVERIFY - Verification complete
Total Pages Examined         : 329600
Total Pages Processed (Data) : 0
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 0
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 327504
Total Pages Processed (Seg)  : 17
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 2096
Total Pages Marked Corrupt   : 0
Total Pages Influx           : 0
Total Pages Encrypted        : 0
Highest block SCN            : 1887888 (0.1887888)

这里可以看到,undo文件本身并没有逻辑和物理的坏块,证明因为数据库异常的原因,可能是由于ORA-27067: size of I/O buffer is invalid导致。根据官方文档ORA-01115 ORA-27067 DURING PARALLEL INSTANCE RECOVERY AFTER INSTANCE CRASH中的解释,我们基本上可以确定很可能是由于10.2.0.1在aix平台的jfs2系统中,由于大量事务操作,突然abort掉数据库(也可能断电),从而数据库在启动的时候进行实例恢复,而由于内部的bug,导致实例恢复无法成功。通过我们处理后的,数据库完美启动,数据0丢失

数据库启动日志

Mon Aug 10 16:34:14 2015
alter database open
Mon Aug 10 16:34:14 2015
Beginning crash recovery of 1 threads
 parallel recovery started with 15 processes
Mon Aug 10 16:34:14 2015
Started redo scan
Mon Aug 10 16:34:27 2015
Completed redo scan
 7889582 redo blocks read, 0 data blocks need recovery
Mon Aug 10 16:34:27 2015
Started redo application at
 Thread 1: logseq 664704, block 1286922
Mon Aug 10 16:34:27 2015
Recovery of Online Redo Log: Thread 1 Group 4 Seq 664704 Reading mem 0
  Mem# 0 errs 0: /dev/rredo04
Mon Aug 10 16:34:32 2015
Recovery of Online Redo Log: Thread 1 Group 5 Seq 664705 Reading mem 0
  Mem# 0 errs 0: /dev/rredo05
Mon Aug 10 16:34:38 2015
Recovery of Online Redo Log: Thread 1 Group 6 Seq 664706 Reading mem 0
  Mem# 0 errs 0: /dev/rredo06
Mon Aug 10 16:34:40 2015
Completed redo application
Mon Aug 10 16:34:40 2015
Completed crash recovery at
 Thread 1: logseq 664706, block 1017805, scn 8554793334
 0 data blocks read, 0 data blocks written, 7889582 redo blocks read
Mon Aug 10 16:34:40 2015
Thread 1 advanced to log sequence 664707
Thread 1 opened at log sequence 664707
  Current log# 1 seq# 664707 mem# 0: /dev/rredo01
Successful open of redo thread 1
Mon Aug 10 16:34:40 2015
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Mon Aug 10 16:34:40 2015
SMON: enabling cache recovery
Mon Aug 10 16:34:40 2015
Successfully onlined Undo Tablespace 1.
Mon Aug 10 16:34:40 2015
SMON: enabling tx recovery
Mon Aug 10 16:34:41 2015
Database Characterset is ZHS32GB18030
replication_dependency_tracking turned off (no async multimaster replication found)
WARNING: AQ_TM_PROCESSES is set to 0. System operation might be adversely affected.
Mon Aug 10 16:34:41 2015
SMON: Parallel transaction recovery tried
Mon Aug 10 16:34:42 2015
db_recovery_file_dest_size of 2048 MB is 0.00% used. This is a
user-specified limit on the amount of space that will be used by this
database for recovery-related files, and does not reflect the amount of
space available in the underlying filesystem or ASM diskgroup.
Mon Aug 10 16:34:42 2015
Completed: alter database open

ORA-600 kccpb_sanity_check_2故障恢复

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:ORA-600 kccpb_sanity_check_2故障恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

今天是有人在淘宝旺旺上找我,需要oracle数据库恢复支持
wangwang


远程登录上去一看发现数据库mount的时候报ORA-600[kccpb_sanity_check_2]错误

C:\Documents and Settings\Administrator>sqlplus / as sysdba
SQL*Plus: Release 10.2.0.3.0 - Production on Wed Jul 29 16:23:18 2015
Copyright (c) 1982, 2006, Oracle.  All Rights Reserved.
Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - Production
With the Partitioning, OLAP and Data Mining options
SQL> alter database mount;
alter database mount
*
ERROR at line 1:
ORA-00600: internal error code, arguments: [kccpb_sanity_check_2], [14169],
[14160], [0x0], [], [], [], []

尝试重建控制文件

SQL> shutdown immediate;
ORA-01507: database not mounted
ORACLE instance shut down.
SQL> startup pfile='D:\database\m104\pfile\init.ora' nomount
ORACLE instance started.
Total System Global Area  444596224 bytes
Fixed Size                  1291072 bytes
Variable Size             155192512 bytes
Database Buffers          281018368 bytes
Redo Buffers                7094272 bytes
SQL> SHOW PARAMETER CONT;
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
control_file_record_keep_time        integer     7
control_files                        string      D:\DATABASE\M104\CTRL\CONTROL0
                                                 2.CTL
global_context_pool_size             string
SQL> ALTER DATABASE MOUNT;
ALTER DATABASE MOUNT
*
ERROR at line 1:
ORA-00600: internal error code, arguments: [kccpb_sanity_check_2], [14169],
[14160], [0x0], [], [], [], []
SQL>
SQL> CREATE CONTROLFILE REUSE DATABASE "m104_db" NORESETLOGS  FORCE LOGGING NOAR
CHIVELOG
  2      MAXLOGFILES 16
  3      MAXLOGMEMBERS 3
  4      MAXDATAFILES 100
  5      MAXINSTANCES 8
  6      MAXLOGHISTORY 2921
  7  LOGFILE
  8    GROUP 1 'D:\database\m104\log\redo01.log'  SIZE 51200K,
  9    GROUP 2 'D:\database\m104\log\redo02.log'  SIZE 51200K,
 10    GROUP 3 'D:\database\m104\log\redo03.log'  SIZE 51200K
 11  DATAFILE
 12    'd:\database\m104\data\system01.dbf',
 13    'd:\database\m104\data\sysaux01.dbf',
 14    'd:\database\m104\data\USERS01.DBF',
 15    'd:\database\m104\data\UNDOTBS01.DBF',
 16    'd:\database\m104\data\INDX01.DBF'
 17  CHARACTER SET WE8ISO8859P1
 18  ;
CREATE CONTROLFILE REUSE DATABASE "m104_db" NORESETLOGS  FORCE LOGGING NOARCHIVE
LOG
*
ERROR at line 1:
ORA-01503: CREATE CONTROLFILE failed
ORA-00600: internal error code, arguments: [kccsga_update_ckpt_4], [1], [8],
[], [], [], [], []
SQL>
SQL> CREATE CONTROLFILE REUSE DATABASE "m104_db" RESETLOGS  FORCE LOGGING NOARCH
IVELOG
  2      MAXLOGFILES 16
  3      MAXLOGMEMBERS 3
  4      MAXDATAFILES 100
  5      MAXINSTANCES 8
  6      MAXLOGHISTORY 2921
  7  LOGFILE
  8    GROUP 1 'D:\database\m104\log\redo01.log'  SIZE 51200K,
  9    GROUP 2 'D:\database\m104\log\redo02.log'  SIZE 51200K,
 10    GROUP 3 'D:\database\m104\log\redo03.log'  SIZE 51200K
 11  DATAFILE
 12    'd:\database\m104\data\system01.dbf',
 13    'd:\database\m104\data\sysaux01.dbf',
 14    'd:\database\m104\data\USERS01.DBF',
 15    'd:\database\m104\data\UNDOTBS01.DBF',
 16    'd:\database\m104\data\INDX01.DBF'
 17  CHARACTER SET WE8ISO8859P1
 18  ;
CREATE CONTROLFILE REUSE DATABASE "m104_db" RESETLOGS  FORCE LOGGING NOARCHIVELO
G
*
ERROR at line 1:
ORA-01503: CREATE CONTROLFILE failed
ORA-00600: internal error code, arguments: [kccsga_update_ckpt_4], [1], [8],
[], [], [], [], []

无论是使用noresetlogs还是resetlogs,重建控制文件都报ORA-600[kccsga_update_ckpt_4]错误.比较奇怪,无解指定控制文件新名称重建试试看

修改控制文件路径

SQL> SHUTDOWN ABORT
ORACLE instance shut down.
SQL> startup pfile='D:\database\m104\pfile\init.ora' nomount
ORACLE instance started.
Total System Global Area  444596224 bytes
Fixed Size                  1291072 bytes
Variable Size             155192512 bytes
Database Buffers          281018368 bytes
Redo Buffers                7094272 bytes
SQL> SHOW PARAMETER CONT;
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
control_file_record_keep_time        integer     7
control_files                        string      D:\DATABASE\M104\CTRL\CONTROL0
                                                 4.CTL
global_context_pool_size             string
SQL> CREATE CONTROLFILE REUSE DATABASE "m104_db" RESETLOGS  FORCE LOGGING NOARCH
IVELOG
  2      MAXLOGFILES 16
  3      MAXLOGMEMBERS 3
  4      MAXDATAFILES 100
  5      MAXINSTANCES 8
  6      MAXLOGHISTORY 2921
  7  LOGFILE
  8    GROUP 1 'D:\database\m104\log\redo01.log'  SIZE 51200K,
  9    GROUP 2 'D:\database\m104\log\redo02.log'  SIZE 51200K,
 10    GROUP 3 'D:\database\m104\log\redo03.log'  SIZE 51200K
 11  DATAFILE
 12    'd:\database\m104\data\system01.dbf',
 13    'd:\database\m104\data\sysaux01.dbf',
 14    'd:\database\m104\data\USERS01.DBF',
 15    'd:\database\m104\data\UNDOTBS01.DBF',
 16    'd:\database\m104\data\INDX01.DBF'
 17  CHARACTER SET WE8ISO8859P1
 18  ;
Control file created.

使用新的控制文件位置,这次终于数据库重建控制文件成功
尝试指定redo进行恢复,数据库正常打开

SQL> RECOVER DATABASE USING BACKUP CONTROLFILE UNTIL CANCEL;
ORA-00279: change 3643108240801 generated at 07/26/2015 20:15:22 needed for
thread 1
ORA-00289: suggestion :
D:\ORACLE\PRODUCT\10.2.0\DB_1\RDBMS\ARC00567_0866390669.001
ORA-00280: change 3643108240801 for thread 1 is in sequence #567
Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
D:\database\m104\log\redo01.log
ORA-00310: archived log contains sequence 566; sequence 567 required
ORA-00334: archived log: 'D:\DATABASE\M104\LOG\REDO01.LOG'
ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1: 'D:\DATABASE\M104\DATA\SYSTEM01.DBF'
SQL> RECOVER DATABASE USING BACKUP CONTROLFILE UNTIL CANCEL;
ORA-00279: change 3643108240801 generated at 07/26/2015 20:15:22 needed for
thread 1
ORA-00289: suggestion :
D:\ORACLE\PRODUCT\10.2.0\DB_1\RDBMS\ARC00567_0866390669.001
ORA-00280: change 3643108240801 for thread 1 is in sequence #567
Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
D:\database\m104\log\redo02.log
Log applied.
Media recovery complete.
SQL> ALTER DATABASE OPEN RESETLOGS;
Database altered.

数据库恢复完成。整个数据库恢复比较简单,但是注意这里的ORA-600[kccsga_update_ckpt_4]通过修改控制文件路径规避,具体原因待查。

知识点补充:ORA-600 [kccpb_sanity_check_2] [a] [b] {c}

VERSIONS:
  Versions 10.2 to 11.2
DESCRIPTION:
  This internal error is raised when the sequence number (seq#) of the
  current block of the controlfile is greater than the seq# in the controlfile header.
  The header value should always be equal to, or greater than the value
  held in the control file block(s).
  This extra check was introduced in Oracle 10gR2 to detect lost writes
  or stale reads to the header.
ARGUMENTS:
  Arg [a] seq# in control block header.
  Arg [b] seq# in the control file header.
  Arg {c}

Oracle异常恢复前备份保护现场建议—ASM环境

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:Oracle异常恢复前备份保护现场建议—ASM环境

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

在上一篇中写道了文件系统的库,在进行异常恢复前的备份方法(Oracle异常恢复前备份保护现场建议—FileSystem环境),对于asm库,因为asm 里面的数据文件无法直接dd文件头,因此备份方式也有所改变.对于asm是mount,但是数据库不能打开,使用rman或者asm的cp命令全部备份数据文件也来不及或者空间不足,这样的情况下,你可以考虑使用rman或者cp命令备份控制文件和system表空间文件,cp命令备份redo,dd命令备份文件头,来完成asm情况下数据库异常恢复前备份

控制文件备份
11.2及其以后版本使用asmcmd cp命令处理

select 'asmcmd cp '||name||' &&backup_dir/' from v$datafile where ts#=0
union all
select 'asmcmd cp '||name||' &&backup_dir/crontrofile_'||rownum||'.ctl' from v$controlfile
union all
select 'asmcmd cp '||member||' &&backup_dir/'||thread#||'_'||a.group#||'_'||sequence#||'_'||substr(member,
instr(member,'/',-1)+1)  FROM v$log a, v$logfile b WHERE a.group# = B.GROUP#;

其他版本使用rman命令处理

--rman备份控制文件(/tmp目录自己修改)
copy current controlfile to '/tmp/ctl.ctl';
--rman备份system表空间
select 'copy datafile '||file#||' to ''&backup_dir/system_'||file#||'.dbf'';'
from v$datafile where ts#=0;
--redo无法直接备份

备份文件头

[grid@xifenfei ~]$ ss
SQL*Plus: Release 11.2.0.4.0 Production on Fri May 1 04:15:18 2015
Copyright (c) 1982, 2013, Oracle.  All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Automatic Storage Management option
SQL> set lines 150
SQL> select 'dd if='||c.PATH_KFDSK||' of=&&backup_path/'||a.GROUP_KFFXP||'_'||a.disk_kffxp||'_'||
  2  b.NUMBER_KFFIL||'.asm count=1 bs='|| d.AUSIZE_KFGRP||' skip='||a.au_kffxp backup_dd_cmd
  3   FROM x$kffxp a, X$KFFIL  b,X$KFDSK c,X$KFGRP d  WHERE
  4  a.GROUP_KFFXP=b.GROUP_KFFIL
  5  and a.NUMBER_KFFXP=b.NUMBER_KFFIL
  6  and b.FTYPE_KFFIL in(2,12)
  7  and b.NUMBER_KFFIL>255
  8  and a.xnum_kffxp=0
  9  and a.GROUP_KFFXP=c.GRPNUM_KFDSK
 10  and a.disk_kffxp=c.NUMBER_KFDSK
 11  and a.GROUP_KFFXP=d.NUMBER_KFGRP;
Enter value for backup_path: /tmp
old   1: select 'dd if='||c.PATH_KFDSK||' of=&&backup_path/'||a.GROUP_KFFXP||'_'||a.disk_kffxp||'_'||
new   1: select 'dd if='||c.PATH_KFDSK||' of=/tmp/'||a.GROUP_KFFXP||'_'||a.disk_kffxp||'_'||
BACKUP_DD_CMD
------------------------------------------------------------------------------------------------------------------
dd if=/dev/asm-disk1 of=/tmp/1_0_256.asm count=1 bs=1048576 skip=29
dd if=/dev/asm-disk2 of=/tmp/1_1_257.asm count=1 bs=1048576 skip=404
dd if=/dev/asm-disk2 of=/tmp/1_1_258.asm count=1 bs=1048576 skip=641
dd if=/dev/asm-disk1 of=/tmp/1_0_259.asm count=1 bs=1048576 skip=648
dd if=/dev/asm-disk3 of=/tmp/2_0_256.asm count=1 bs=1048576 skip=51

还原文件头

SQL> set lines 150
SQL> select 'dd of='||c.PATH_KFDSK||' if=&&backup_path/'||a.GROUP_KFFXP||'_'||a.disk_kffxp||
  2  '_'||b.NUMBER_KFFIL||'.asm count=1 conv=notrunc bs='|| d.AUSIZE_KFGRP||' seek='||a.au_kffxp restore_dd_cmd
  3   FROM x$kffxp a, X$KFFIL  b,X$KFDSK c,X$KFGRP d  WHERE
  4  a.GROUP_KFFXP=b.GROUP_KFFIL
  5  and a.NUMBER_KFFXP=b.NUMBER_KFFIL
  6  and b.FTYPE_KFFIL in(2,12)
  7  and b.NUMBER_KFFIL>255
  8  and a.xnum_kffxp=0
  9  and a.GROUP_KFFXP=c.GRPNUM_KFDSK
 10  and a.disk_kffxp=c.NUMBER_KFDSK
 11  and a.GROUP_KFFXP=d.NUMBER_KFGRP;
old   1: select 'dd of='||c.PATH_KFDSK||' if=&&backup_path/'||a.GROUP_KFFXP||'_'||a.disk_kffxp||
new   1: select 'dd of='||c.PATH_KFDSK||' if=/tmp/'||a.GROUP_KFFXP||'_'||a.disk_kffxp||
RESTORE_DD_CMD
-----------------------------------------------------------------------------------------------------------------
dd of=/dev/asm-disk1 if=/tmp/1_0_256.asm count=1 conv=notrunc bs=1048576 seek=29
dd of=/dev/asm-disk2 if=/tmp/1_1_257.asm count=1 conv=notrunc bs=1048576 seek=404
dd of=/dev/asm-disk2 if=/tmp/1_1_258.asm count=1 conv=notrunc bs=1048576 seek=641
dd of=/dev/asm-disk1 if=/tmp/1_0_259.asm count=1 conv=notrunc bs=1048576 seek=648
dd of=/dev/asm-disk3 if=/tmp/2_0_256.asm count=1 conv=notrunc bs=1048576 seek=51
SQL>

备份还原文件头测试–通过测试证明该方法备份文件头是ok的
关闭数据库,使用dd备份文件头

[oracle@xifenfei ~]$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.4.0 Production on Fri May 1 04:21:49 2015
Copyright (c) 1982, 2013, Oracle.  All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Automatic Storage Management, OLAP, Data Mining
and Real Application Testing options
SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.

dul查看当前dbname值为XIFENFEI

[oracle@xifenfei dul]$ ./dul
Data UnLoader: 10.2.0.6.5 - Internal Only - on Fri May  1 04:37:43 2015
with 64-bit io functions
Copyright (c) 1994 2015 Bernard van Duijnen All rights reserved.
 Strictly Oracle Internal Use Only
Disk group DATA, dul group_cid 0
Discovered disk /dev/asm-disk1 as diskgroup DATA, disk number 0 size 3922 Mb File1 starts at 2, dul_disk_cid 0
Discovered disk /dev/asm-disk2 as diskgroup DATA, disk number 1 size 3922 Mb without File1 meta data, dul_disk_cid 1
Disk group XIFENFEI, dul group_cid 1
Discovered disk /dev/asm-disk3 as diskgroup XIFENFEI, disk number 0 size 4439 Mb File1 starts at 2, dul_disk_cid 2
DUL: Warning: Dictionary cache DC_ASM_EXTENTS is empty
Probing for attributes in File9, the attribute directory, for disk group DATA
attribute name "_extent_sizes", value "1 4 16"
attribute name "_extent_counts", value "20000 20000 2147483647"
Oracle data file size 775954432 bytes, block size 8192
Found db_id = 1495013434
Found db_name = XIFENFEI   <-----db name
DUL: Error: Filedir block not allocated, file does not exist
DUL: Error: Could not load asm meta data for group XIFENFEI file 9
Probing for filenames in File6, the alias directory, for disk group XIFENFEI
+XIFENFEI/XIFENFEI/DATAFILE/XIFENFEI.256.878397315
Probing for database datafiles in File1, the file directory,  for disk group XIFENFEI
File 256 datafile size 104865792, block size 8192
Disk group XIFENFEI has one file of type datafile

使用dd备份1文件头

[oracle@xifenfei tmp]$ dd if=/dev/asm-disk1 of=/tmp/1_0_256.asm count=1 bs=1048576 skip=29
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.0168209 seconds, 62.3 MB/s

尝试把dbname从XIFENFEI修改为ORCL

SQL> select dump('XIFENFEI',16) from dual;
DUMP('XIFENFEI',16)
-------------------------------------
Typ=96 Len=8: 58,49,46,45,4e,46,45,49
SQL> SELECT DUMP('ORCL',16) FROM DUAL;
DUMP('ORCL',16)
-------------------------
Typ=96 Len=4: 4f,52,43,4c
SQL>

bbed修改XIFENFEI为ORCL

[oracle@xifenfei tmp]$ bbed filename='/tmp/1_0_256.asm' mode=edit
Password:
BBED: Release 2.0.0.0.0 - Limited Production on Fri May 1 04:24:06 2015
Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
************* !!! For Oracle Internal Use only !!! ***************
BBED> set blocksize 8192
        BLOCKSIZE       8192
BBED> set block 1
        BLOCK#          1
BBED> map
 File: /tmp/1_0_256.asm (0)
 Block: 1                                     Dba:0x00000000
------------------------------------------------------------
 Data File Header
 struct kcvfh, 860 bytes                    @0
 ub4 tailchk                                @8188
BBED> p kcvfhhdr
struct kcvfhhdr, 76 bytes                   @20
   ub4 kccfhswv                             @20       0x00000000
   ub4 kccfhcvn                             @24       0x0b200400
   ub4 kccfhdbi                             @28       0x591c183a
   text kccfhdbn[0]                         @32      X
   text kccfhdbn[1]                         @33      I
   text kccfhdbn[2]                         @34      F
   text kccfhdbn[3]                         @35      E
   text kccfhdbn[4]                         @36      N
   text kccfhdbn[5]                         @37      F
   text kccfhdbn[6]                         @38      E
   text kccfhdbn[7]                         @39      I
BBED> d seek 32
 File: /tmp/1_0_256.asm (0)
 Block: 1                seeks:   32 to   63           Dba:0x00000000
------------------------------------------------------------------------
 58494645 4e464549 12040000 00720100 00200000 01000300 00000000 00000000
 <32 bytes per line>

dd把修改的block还原到asm中

[oracle@xifenfei dul]$ dd of=/dev/asm-disk1 if=/tmp/1_0_256.asm count=1 conv=notrunc bs=1048576 seek=29
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.00253244 seconds, 414 MB/s

dul验证dbname 修改为ORCL成功

[oracle@xifenfei dul]$ ./dul
Data UnLoader: 10.2.0.6.5 - Internal Only - on Fri May  1 04:41:33 2015
with 64-bit io functions
Copyright (c) 1994 2015 Bernard van Duijnen All rights reserved.
 Strictly Oracle Internal Use Only
Disk group DATA, dul group_cid 0
Discovered disk /dev/asm-disk1 as diskgroup DATA, disk number 0 size 3922 Mb File1 starts at 2, dul_disk_cid 0
Discovered disk /dev/asm-disk2 as diskgroup DATA, disk number 1 size 3922 Mb without File1 meta data, dul_disk_cid 1
Disk group XIFENFEI, dul group_cid 1
Discovered disk /dev/asm-disk3 as diskgroup XIFENFEI, disk number 0 size 4439 Mb File1 starts at 2, dul_disk_cid 2
DUL: Warning: Dictionary cache DC_ASM_EXTENTS is empty
Probing for attributes in File9, the attribute directory, for disk group DATA
attribute name "_extent_sizes", value "1 4 16"
attribute name "_extent_counts", value "20000 20000 2147483647"
Oracle data file size 775954432 bytes, block size 8192
Found db_id = 1495013434
Found db_name = ORCL   <----修改后的dbname
DUL: Error: Filedir block not allocated, file does not exist
DUL: Error: Could not load asm meta data for group XIFENFEI file 9
Probing for filenames in File6, the alias directory, for disk group XIFENFEI
+XIFENFEI/XIFENFEI/DATAFILE/XIFENFEI.256.878397315
Probing for database datafiles in File1, the file directory,  for disk group XIFENFEI
File 256 datafile size 104865792, block size 8192
Disk group XIFENFEI has one file of type datafile

对于asm无法mount情况下备份asm disk header
asm磁盘的备份主要是备份磁盘头100M空间,使用dd命令直接备份

set lines 150
set pages 1000
select 'dd if='||path||' of=&asmbackup_dir/'||group_number||'_'||disk_number||'.asm bs=1048576
count=100' from v$asm_disk;
set lines 150
set pages 1000
select 'dd of='||path||' if=&asmbackup_dir/'||group_number||'_'||disk_number||'.asm bs=1048576
count=100 conv=notrunc' from v$asm_disk;

asmlib需要注意把ORCL:替换为/dev/oracleasm/disks/对应目录.

Oracle异常恢复前备份保护现场建议—FileSystem环境

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:Oracle异常恢复前备份保护现场建议—FileSystem环境

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

无论是在各种会议上,还是在朋友/网友私下请教Oracle数据库恢复的问题之时,我都强调,如果你没有十足的把握,请你对您的现场进行备份,确保别对现场进行二次损坏。你不能恢复数据库,但绝对不能再次破坏数据库,给二次恢复增加难度.这里对恢复前备份提供一些指导思想和简单脚本,希望对大家有帮助.

哪些文件需要备份
熟悉数据库恢复的朋友可能都情况,Oracle在异常恢复的过程中主要修改的是system表空间里面数据,其他数据文件,redo数据,控制文件(当然由于redo,undo导致其他数据文件内部的block也可能发生改变)。在备份时间,备份空间允许的情况下,是对这些文件全部备份为好

完整备份文件

set lines 150
set pages 10000
select name from v$datafile
union all
select name from v$controlfile
union all
select member from v$logfile;

有些情况下:比如如果全部备份时间过长,备份空间不足等情况下,我们该如何备份,尽量减少因为异常恢复导致对原环境的损坏.备份最核心的system表空间,数据文件头,redo file,control file等数据,由于这个不是简单的拷贝操作,因此在生成备份语句同时,也生成还原语句,切不可生成了备份语句后,无恢复语句,导致后面还原故障现场难度增大.

无法全备情况下linux/unix数据库恢复前备份

set lines 150
set pages 10000
select 'dd if='||name||' of=&&back_dir/'||ts#||'_'||file#||'.dbf bs=1048576 count=10'
from v$datafile where ts#<>0
union all
select 'dd if='||name||' of=&&back_dir/'||ts#||'_'||file#||'.dbf' from v$datafile where ts#=0
union all
select 'dd if='||name||' of=&&back_dir/control0'||rownum||'.ctl' from v$controlfile
union all
select 'dd if='||member||' of=&&back_dir/'||thread#||'_'||a.group#||'_'||sequence#||'_'||substr(member,
instr(member,'/',-1)+1)  FROM v$log a, v$logfile b WHERE a.group# = B.GROUP#;

无法全备情况下linux/unix使用备份还原

set lines 150
set pages 1000
select 'dd of='||name||' if=&&back_dir/'||ts#||'_'||file#||'.dbf bs=1048576 count=10 conv=notrunc'
from v$datafile where ts#<>0
union all
select 'dd if='||name||' if=&&back_dir/'||ts#||'_'||file#||'.dbf' from v$datafile where ts#=0
union all
select 'dd of='||name||' if=&&back_dir/control0'||rownum||'.ctl' from v$controlfile
union all
select 'dd of='||member||' if=&&back_dir/'||thread#||'_'||a.group#||'_'||sequence#||'_'||substr(member,
instr(member,'/',-1)+1)    FROM v$log a, v$logfile b WHERE a.group# = B.GROUP#;

由于win路径斜杠不一样(/和\的区别),因此在无法全备情况下win备份语句

set lines 150
set pages 10000
select 'dd if='||name||' of=&&back_dir\'||ts#||'_'||file#||'.dbf bs=1048576 count=10'
from v$datafile where ts#<>0
union all
select 'dd if='||name||' of=&&back_dir\'||ts#||'_'||file#||'.dbf' from v$datafile where ts#=0
union all
select 'dd if='||name||' of=&&back_dir\control0'||rownum||'.ctl' from v$controlfile
union all
select 'dd if='||member||' of=&&back_dir\'||thread#||'_'||a.group#||'_'||sequence#||'_'||substr(member,
instr(member,'\',-1)+1)   FROM v$log a, v$logfile b WHERE a.group# = B.GROUP#;

在无法全备情况下win还原语句

set lines 150
set pages 1000
select 'dd of='||name||' if=&&back_dir\'||ts#||'_'||file#||'.dbf bs=1048576 count=10 conv=notrunc'
from v$datafile where ts#<>0
union all
select 'dd if='||name||' if=&&back_dir\'||ts#||'_'||file#||'.dbf' from v$datafile where ts#=0
union all
select 'dd of='||name||' if=&&back_dir\control0'||rownum||'.ctl' from v$controlfile
union all
select 'dd of='||member||' if=&&back_dir\'||thread#||'_'||a.group#||'_'||sequence#||'_'||substr(member,
instr(member,'\',-1)+1)    FROM v$log a, v$logfile b WHERE a.group# = B.GROUP#;

这里提供win环境下dd命令程序win环境dd命令工具

备注:对于asm情况异常情况恢复,备份情况请不要参考该文章,具体请见后续文章,具体见Oracle异常恢复前备份保护现场建议—ASM环境