YOUR FILES ARE ENCRYPTED!—ORACLE比特币勒索恢复支持

最近Oracle数据库被黑勒索比特币的事件一起又一起,相比是最近黑客们是不是都想着弄点钱回家好过春节.前段时间分析过关于由于绿色版plsql dev的afterconnect.sql被人注入了恶意的脚本的事件具体见:plsql dev引起的数据库被黑勒索比特币实现原理分析和解决方案,对该种故障进行了分析,也提供了预防和解决方案.今天对于另一起比较常见比较常见的黑文件系统,勒索比特币的案例进行了分析,并且成功恢复出来数据.
被黑后现象
How to restore files


数据文件被加密
20161126191331
bbed分析文件

C:\Users\XIFENFEI>bbed password=blockedit
BBED: Release 2.0.0.0.0 - Limited Production on Sat Nov 26 19:15:00 2016
Copyright (c) 1982, 2002, Oracle Corporation.  All rights reserved.
************* !!! For Oracle Internal Use only !!! ***************
BBED> set filename 'D:\TEMP\BAK\GCwqPFVL2zzGGBCo.orgasm'
        FILENAME        D:\TEMP\BAK\GCwqPFVL2zzGGBCo.orgasm
BBED> set blocksize 8192
        BLOCKSIZE       8192
BBED> d
 File: D:\TEMP\BAK\GCwqPFVL2zzGGBCo.orgasm (0)
 Block: 1                Offsets:    0 to  511           Dba:0x00000000
------------------------------------------------------------------------
 f993499a db22faf6 c4ee1bde efbedec0 d7c0e2a0 5408896f e3f88e2f c3179373
 0de737db 595477d5 a12c64ae 08341d67 ff93dd1f ef04568b 62f46c03 315605db
 25affd5c 9be11af8 4eff4028 26081235 8fa1f832 6e02ab4a 6295a24a 4021a418
 a4a5c934 d4ecea4d f614f413 04869513 dec22a31 43fd7a5b 51a8b29b 656e1963
 955f540a 892dff97 6a681024 c5a57176 c6168175 28e2fea6 2439fd0c cf3c8c99
 487e909d 0951aded 8ba53a61 542b1f9a cb3c5d8b 21d25c82 fbb31554 20977f8b
 f46ba6d0 5e47328a d8c4d634 97bfeefa 4b632c38 a35a2f19 06fa9be2 b1984eab
 d9a30fe2 67513bc1 657897b2 f12a2e90 acee0bd2 8bf8af82 cfc79417 f8afd2c6
 6bab98cd 16e9ddc5 40e1705f b5763002 505b8964 805c6361 a7a84980 b7826019
 a4dafb89 0b0f27c0 fa19924e c3956a43 7815f01c e9eae471 c23e52bd 4b76556c
 40895e77 c1fd70c0 7dff4132 4189788e 895590b5 57469886 90ebc360 796e426c
 913026ed b4ea7026 41f18de1 e174a0c8 7a325eb9 edb1b296 dbfc1948 25f7af82
 df999d8b c8106e84 134531ea 5f5c6461 8fd4fc3e b04be820 4c838a2f 045d818d
 8eeace89 ac1ee884 b8d56032 9fc580c4 e288fd13 6f80f6d3 f176df8b 47645955
 738a9bbd 1c0ce841 395d3d84 a071937a 74991167 41a5bdf3 970a283b 4c6ecd25
 c1f0b4da 9115574b 7cdf44c6 4f799113 a6edca86 f62a20fb 354e433b 4f2250f5
 <32 bytes per line>
BBED> set block 2
        BLOCK#          2
BBED> d
 File: D:\TEMP\BAK\GCwqPFVL2zzGGBCo.orgasm (0)
 Block: 2                Offsets:    0 to  511           Dba:0x00000000
------------------------------------------------------------------------
 f19a5ede 35f9e2ed d51d9b35 ef28cfe0 9deb6021 e018bf6f 2d84be28 779ee63a
 cf3172ff 49f3b959 ef92728b c0a7129a 7335afb8 a1e0fe5c 7fc5b213 8a2afc78
 78ddd22e 9ff63c2e 6432a073 adad2138 b18ee56d fab16ba3 688968f5 b53d03ca
 1f1d80d3 a87bb038 38c84b6a 74f253bc 55efc8f9 e2c1d194 26803ccc 575300b2
 025eff5c 824bca6b 440e3cc3 2f48a704 b3db6db8 bf48903e 04bf7efa 019f0986
 264806ab a8a93048 1f2d7b4e b29a923b 61a701d0 d6783f69 027f06db f4d16fba
 4b9bbd68 3a32fa66 e9e18a4c 7332a908 7e9fab7e cc8810dc 438d7157 397467b1
 d8d0e972 4b892411 bfc1bab2 6e247b4c ad2b05a8 be799d07 d1226408 0ff00bc1
 3943c5aa 63182479 6c84e5db ab213221 736af62b bb9dc047 d4a28e40 8451119c
 6db794cc 5df39d30 592d7656 0a76048d 9b5d3b3d 7d85ccb8 796c5809 ae1122e1
 73006061 d22d0dfe 9a7b839a a5c32d6a cd8ad956 5e2a8013 280fc444 9807b477
 3eda5bca 0f6a2958 e0334dfc 52c23a95 fa2cfaff a86b1456 c74a0cd9 eec77fe6
 96261513 0044e3ef ef843e13 004a9ef2 ef01d670 6c988cb7 df0dea99 58ff78ac
 aa5783e8 b6e2b89d da953d7b 3b4ea7fa 8352d388 eb6a8d76 9b9525a0 f34d444c
 83d60651 6ff7f287 bd9f8bcf 5dae9592 32db539d d0c14939 0ab4e403 ff537cfa
 9657a1be 3e5aa43d 6fdf56fd 90dbd567 b7fe4aeb d3226a29 075da375 7c3d7581
 <32 bytes per line>

通过我们的一系列分析,对于这种数据文件里面的数据已经被加密,直接使用他们肯定是无法恢复出来其中的数据,我们通过一系列的分析,已经成功找出来他们其中的规律,对这种故障实现完美恢复,如果有这个方面的恢复请求,可以随时联系我们
Phone:17813235971    Q Q:107644445QQ咨询惜分飞    E-Mail:dba@xifenfei.com

通过拷贝block实现system文件大量坏块恢复

有朋友找到我,他有客户库大量坏块,需要我们提供支持,因为这个库里面含有大量的存储过程,包等,要求数据要直接导出,不能使用工具挖.
dbv检查system大量坏块

DBVERIFY: Release 11.2.0.4.0 - Production on 星期二 11月 22 17:17:51 2016
Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
DBVERIFY - 开始验证: FILE = h:\oracle\system01.dbf
页 4543 流入 - 很可能是介质损坏
Corrupt block relative dba: 0x004011bf (file 1, block 4543)
Fractured block found during dbv:
Data in bad block:
 type: 0 format: 2 rdba: 0x004011bf
 last change scn: 0x0000.00000000 seq: 0x1 flg: 0x05
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x7641344a
 check value in block header: 0xb6ff
 computed block checksum: 0x797
页 4544 标记为损坏
Corrupt block relative dba: 0x004011c0 (file 1, block 4544)
Bad header found during dbv:
Data in bad block:
 type: 71 format: 3 rdba: 0x754e362f
 last change scn: 0x7a37.6d424862 seq: 0x39 flg: 0x32
 spare1: 0x35 spare2: 0x32 spare3: 0x3931
 consistency value in tail: 0x7638356c
 check value in block header: 0x4856
 block checksum disabled
 …………
页 4613 标记为损坏
Corrupt block relative dba: 0x00401205 (file 1, block 4613)
Bad header found during dbv:
Data in bad block:
 type: 97 format: 7 rdba: 0x79634449
 last change scn: 0x4364.77426a4c seq: 0x41 flg: 0x35
 spare1: 0x34 spare2: 0x36 spare3: 0x7734
 consistency value in tail: 0x505a4550
 check value in block header: 0x434d
 computed block checksum: 0x6f3f
页 4614 标记为损坏
Corrupt block relative dba: 0x00401206 (file 1, block 4614)
Completely zero block found during dbv:
…………
页 5125 标记为损坏
Corrupt block relative dba: 0x00401405 (file 1, block 5125)
Completely zero block found during dbv:
DBVERIFY - 验证完成
检查的页总数: 124160
处理的页总数 (数据): 90745
失败的页总数 (数据): 0
处理的页总数 (索引): 14417
失败的页总数 (索引): 0
处理的页总数 (其他): 3323
处理的总页数 (段)  : 1
失败的总页数 (段)  : 0
空的页总数: 15092
标记为损坏的总页数: 583
流入的页总数: 5
加密的总页数        : 0
最高块 SCN            : 1417256245 (2.1417256245)

这里比较明显,一共583个坏块,而且是连续坏块(5125-4543+1)

尝试启动数据库

--直接尝试打开数据库
SQL> RECOVER DATABASE;
完成介质恢复。
SQL> alter database open;
alter database open
*
第 1 行出现错误:
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 2
ORA-01578: ORACLE data block corrupted (file # 1, block # 4575)
ORA-01110: data file 1: 'H:\ORACLE\SYSTEM01.DBF'
进程 ID: 2572
会话 ID: 85 序列号: 1
--跳过坏块event打开库
SQL> startup mount pfile='h:/oracle/pfile.txt'
ORACLE 例程已经启动。
Total System Global Area 2137886720 bytes
Fixed Size                  2282944 bytes
Variable Size             520096320 bytes
Database Buffers         1610612736 bytes
Redo Buffers                4894720 bytes
数据库装载完毕。
SQL> show parameter event;
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
event                                string      43810 trace name context forev
                                                 er, level 10, 10231 trace name
                                                  context forever, level 10, 10
                                                 232 trace name context forever
                                                 , level 10, 10233 trace name c
                                                 ontext forever, level 10, 1004
                                                 1 trace name context forever,
                                                 level 10
xml_db_events                        string      enable
SQL> alter database open;
alter database open
*
第 1 行出现错误:
ORA-01113: 文件 1 需要介质恢复
ORA-01110: 数据文件 1: 'H:\ORACLE\SYSTEM01.DBF'
SQL> recover database;
完成介质恢复。
SQL> alter database open;
alter database open
*
第 1 行出现错误:
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [kokiasg1], [], [], [], [], [], [],
[], [], [], [], []
进程 ID: 9316
会话 ID: 4 序列号: 3
--upgrade方式打开数据库
SQL> startup mount pfile='h:/oracle/pfile.txt'
ORACLE 例程已经启动。
Total System Global Area 2137886720 bytes
Fixed Size                  2282944 bytes
Variable Size             520096320 bytes
Database Buffers         1610612736 bytes
Redo Buffers                4894720 bytes
数据库装载完毕。
SQL> alter database open upgrade
  2  ;
alter database open upgrade
*
第 1 行出现错误:
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [kokiasg1], [], [], [], [], [], [],
[], [], [], [], []
进程 ID: 7976
会话 ID: 4 序列号: 3

至此数据库在这种坏块情况下,正常打开相当渺茫,因为报错的这些block 都是非常靠前的,也就是说这个里面很多块在数据库创建好的时候就已经在了(特别是通过模板创建的数据库,这些部分很可能都是固定的),考虑使用其他库的block来替代这些坏块,然后尝试打开库

修复坏块

[oracle@app101-20 ~]$ dd if=/Data/oracle/oradata/txlhdb/system01.dbf of=/tmp/1.dbf skip=4543 bs=8192 count=583
583+0 records in
583+0 records out
4775936 bytes (4.8 MB) copied, 0.0533578 s, 89.5 MB/s
H:\oracle>dd if=d:/temp/1.dbf of=h:\oracle\system01.dbf seek=4543 bs=8192 count=583 conv=notrun
rawwrite dd for windows version 0.6beta3.
Written by John Newbigin <jn@it.swin.edu.au>
This program is covered by terms of the GPL Version 2.
notrun
583+0 records in
583+0 records out
H:\oracle>dbv file=system01.dbf
DBVERIFY: Release 11.2.0.4.0 - Production on 星期二 11月 22 20:17:51 2016
Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
DBVERIFY - 开始验证: FILE = H:\ORACLE\SYSTEM01.DBF
DBVERIFY - 验证完成
检查的页总数: 124160
处理的页总数 (数据): 90761
失败的页总数 (数据): 0
处理的页总数 (索引): 14479
失败的页总数 (索引): 0
处理的页总数 (其他): 3393
处理的总页数 (段)  : 1
失败的总页数 (段)  : 0
空的页总数: 15527
标记为损坏的总页数: 0
流入的页总数: 0
加密的总页数        : 0
最高块 SCN            : 295310052 (11.295310052)

再次尝试打开数据库

C:\Users\XIFENFEI>sqlplus / as sysdba
SQL*Plus: Release 11.2.0.4.0 Production on 星期二 11月 22 20:18:19 2016
Copyright (c) 1982, 2013, Oracle.  All rights reserved.
已连接到空闲例程。
SQL> startup mount pfile='h:/oracle/pfile.txt';
ORACLE 例程已经启动。
Total System Global Area 2137886720 bytes
Fixed Size                  2282944 bytes
Variable Size             520096320 bytes
Database Buffers         1610612736 bytes
Redo Buffers                4894720 bytes
数据库装载完毕。
SQL> recover database;
完成介质恢复。
SQL> alter database open;
数据库已更改。

后续错误ORA-04023处理

C:\Users\XIFENFEI>exp "'/ as sysdba'" owner=XIFENFEI file=d:/full_xff.dmp lo
g=d:/full_xff.log   FEEDBACK=10000  COMPRESS=NO BUFFER=102400000 STATISTICS=none
Export: Release 11.2.0.4.0 - Production on 星期二 11月 22 20:20:27 2016
Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
连接到: Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Produc
tion
With the Partitioning, OLAP, Data Mining and Real Application Testing options
EXP-00056: 遇到 ORACLE 错误 4023
ORA-04023: 无法验证或授权对象 SELECT xdb_uid FROM SYS.EXU9XDBUID
EXP-00000: 导出终止失败

数据库所有视图无法查询,通过直接对基表user$,obj$,view$等表查询出来视图信息,然后直接编译,然后数据可以完美导出,完成本次恢复

SQL> select 'alter view '||b.name||'.'||c.name||'  compile;'
  2  from view$ a,user$ b,obj$ c
  3  where a.obj#=c.obj#
  4  and c.owner#=b.user#;

1


解决CON$ ORA-600 kdsgrp1错误

数据库报ORA 600 kdsgrp1错误
数据库报ORA-00600: internal error code, arguments: [kdsgrp1], [], [], [], [], [], [], [], [], [], [], []错

Thread 1 advanced to log sequence 23861 (LGWR switch)
  Current log# 7 seq# 23861 mem# 0: /oradata/easdb/redo07.log
Tue Nov 15 10:00:42 2016
Errors in file /u01/oracle/diag/rdbms/easdb/easdb/trace/easdb_dw00_3165.trc  (incident=908262):
ORA-00600: internal error code, arguments: [kdsgrp1], [], [], [], [], [], [], [], [], [], [], []
Incident details in: /u01/oracle/diag/rdbms/easdb/easdb/incident/incdir_908262/easdb_dw00_3165_i908262.trc
Tue Nov 15 10:00:55 2016
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Tue Nov 15 10:00:56 2016
Errors in file /u01/oracle/diag/rdbms/easdb/easdb/trace/easdb_dw00_3165.trc  (incident=908263):
ORA-00600: internal error code, arguments: [kdsgrp1], [], [], [], [], [], [], [], [], [], [], []
ORA-06512: at "SYS.KUPW$WORKER", line 1751
ORA-06512: at line 2
Incident details in: /u01/oracle/diag/rdbms/easdb/easdb/incident/incdir_908263/easdb_dw00_3165_i908263.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
DW00 terminating with fatal err=600, pid=40, wid=1, job SYSTEM.
Tue Nov 15 10:01:01 2016
Thread 1 advanced to log sequence 23862 (LGWR switch)
  Current log# 2 seq# 23862 mem# 0: /oradata/easdb/redo02.log
Tue Nov 15 10:01:23 2016
Errors in file /u01/oracle/diag/rdbms/easdb/easdb/trace/easdb_dm00_3163.trc  (incident=908254):
ORA-31671: Worker process DW00 had an unhandled exception.
ORA-00600: internal error code, arguments: [kdsgrp1], [], [], [], [], [], [], [], [], [], [], []
ORA-06512: at "SYS.KUPW$WORKER", line 1751
ORA-06512: at line 2
Incident details in: /u01/oracle/diag/rdbms/easdb/easdb/incident/incdir_908254/easdb_dm00_3163_i908254.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Tue Nov 15 10:01:26 2016
Tue Nov 15 10:01:28 2016
Thread 1 advanced to log sequence 23863 (LGWR switch)
  Current log# 4 seq# 23863 mem# 0: /oradata/easdb/redo04.log

trace文件中信息

*** 2016-11-15 10:00:35.977
* kdsgrp1-1: *************************************************
            row 0x004459e6.26 continuation at
            0x004459e6.26 file# 1 block# 285158 slot 38 not found
KDSTABN_GET: 0 ..... ntab: 1
curSlot: 38 ..... nrows: 208
kdsgrp - dump CR block dba=0x004459e6
Block header dump:  0x004459e6
 Object id on Block? Y
 seg/obj: 0x1c  csc: 0x01.c712f743  itc: 3  flg: -  typ: 1 - DATA
     fsl: 0  fnx: 0x0 ver: 0x01
 Itl           Xid                  Uba         Flag  Lck        Scn/Fsc
0x01   0x000b.015.0036d715  0x00c01bba.0fbd.02  C---    0  scn 0x0001.c6b4cb1a
0x02   0x000c.004.00044d36  0x04c0dd93.3eec.33  C---    0  scn 0x0001.c6d2c65b
0x03   0x000d.008.00008eb9  0x04c0777a.10e3.02  --U-    2  fsc 0x0056.c7346f21

确定报错对象和确认异常

SQL> select object_name from dba_objects where object_id=28;
OBJECT_NAME
---------------------------------------------------------
CON$
SQL> ANALYZE TABLE sys.CON$ VALIDATE STRUCTURE CASCADE online;
ANALYZE TABLE sys.CON$ VALIDATE STRUCTURE CASCADE online
*
ERROR at line 1:
ORA-01499: table/index cross reference failure - see trace file
SQL> SET LINES 122
SQL> COL INDEX_OWNER FOR A20
SQL> COL INDEX_NAME FOR A30
SQL> COL TABLE_OWNER FOR A20
SQL> COL COLUMN_NAME FOR A25
SQL> SELECT TABLE_OWNER,INDEX_NAME,COLUMN_NAME,COLUMN_POSITION
2  FROM Dba_Ind_Columns
3  WHERE table_name = upper('&TABLE_NAME') order by TABLE_OWNER,INDEX_OWNER,INDEX_NAME,COLUMN_POSITION;
Enter value for table_name: CON$
old   3:  WHERE table_name = upper('&TABLE_NAME') order by TABLE_OWNER,INDEX_OWNER,INDEX_NAME,COLUMN_POSITION
new   3:  WHERE table_name = upper('CON$') order by TABLE_OWNER,INDEX_OWNER,INDEX_NAME,COLUMN_POSITION
TABLE_OWNER	     INDEX_NAME 		    COLUMN_NAME 	      COLUMN_POSITION
-------------------- ------------------------------ ------------------------- ---------------
SYS		     I_CON1			    OWNER#				    1
SYS		     I_CON1			    NAME				    2
SYS		     I_CON2			    CON#				    1
SQL> select owner#,name from con$
2    minus
3   select /*+ full(t) */owner#,name from con$ t;
no rows selected
SQL> select /*+ full(t) */owner#,name from con$ t
2    minus
3   select owner#,name from con$  ;
no rows selected
SQL> select /*+ full(t) */ con# from con$ t
2    minus
3   select con# from con$ ;
no rows selected
SQL> select con# from con$
2    minus
3   select /*+ full(t) */ con# from con$ t   ;
      CON#
----------
   1037224
   1037225
   1037386
   1037387
   1037388
   ……
   1037846
62 rows selected.

通过上述分析,可以确定是由于CON$和I_CON2数据不一致,而且是index的数据比表中多了62条.针对这样情况,考虑通过重建index来解决.

尝试rebuild index

SQL> alter index I_CON2 rebuild online;
alter index I_CON2 rebuild online
*
ERROR at line 1:
ORA-00701: object necessary for warmstarting database cannot be altered
SQL>
SQL>
SQL>
SQL>
SQL> shutdown immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup upgrade
ORACLE instance started.
Total System Global Area 2421825536 bytes
Fixed Size                  2215744 bytes
Variable Size            1828716736 bytes
Database Buffers          570425344 bytes
Redo Buffers               20467712 bytes
Database mounted.
Database opened.
SQL> alter index I_CON2 rebuild;
alter index I_CON2 rebuild
*
ERROR at line 1:
ORA-00701: object necessary for warmstarting database cannot be altered

因为是数据库核心index,无法直接rebuild解决,只能通过bootstrap$核心index(I_OBJ1,I_USER1,I_FILE#_BLOCK#,I_IND1,I_TS#,I_CDEF1等)异常恢复—ORA-00701错误解决 方式解决

plsql dev引起的数据库被黑勒索比特币实现原理分析和解决方案

数据库启动alert报错

Mon Jul 10 19:51:24 2016
Errors in file e:\app\administrator\diag\rdbms\zhxh\zhxh\trace\zhxh_ora_3584.trc:
ORA-00604: 递归 SQL 级别 1 出现错误
ORA-20313: 你的数据库已被SQL RUSH Team锁死  发送5个比特币到这个地址 166xk1FXMB2g8JxBVF5T4Aw1Z5JaZ6vrSE (大小写一致)  之后把你的Oracle SID邮寄地址 sqlrush@mail.com 我们将让你知道如何解锁你的数据库  Hi buddy, your database was hacked by SQL RUSH Team, send 5 bitcoin to address 166xk1FXMB2g8JxBVF5T4Aw1Z5JaZ6vrSE (case sensitive),  after that send your Oracle SID to mail address sqlrush@mail.com, we will let you know how to unlock your database.
ORA-06512: 在 "ZHXH.DBMS_SYSTEM_INTERNAL         ", line 15
ORA-06512: 在 line 2
Mon Jul 10 19:51:30 2016
OER 7451 in Load Indicator : Error Code = OSD-04500: 指定了非法选项
O/S-Error: (OS 1) 函数不正确。 !
Mon Jul 10 19:51:34 2017
Errors in file e:\app\administrator\diag\rdbms\zhxh\zhxh\trace\zhxh_ora_824.trc:
ORA-00604: 递归 SQL 级别 1 出现错误
ORA-20313: 你的数据库已被SQL RUSH Team锁死  发送5个比特币到这个地址 166xk1FXMB2g8JxBVF5T4Aw1Z5JaZ6vrSE (大小写一致)  之后把你的Oracle SID邮寄地址 sqlrush@mail.com 我们将让你知道如何解锁你的数据库  Hi buddy, your database was hacked by SQL RUSH Team, send 5 bitcoin to address 166xk1FXMB2g8JxBVF5T4Aw1Z5JaZ6vrSE (case sensitive),  after that send your Oracle SID to mail address sqlrush@mail.com, we will let you know how to unlock your database.
ORA-06512: 在 "ZHXH.DBMS_SYSTEM_INTERNAL         ", line 15
ORA-06512: 在 line 2

afterconnect.sql
是plsql dev登录后自动执行脚本,非Oralce官方脚本
1


数据库启动后执行触发器DBMS_SUPPORT_INTERNAL
2
3


DBMS_SUPPORT_INTERNAL主要的意义是:
1. 当数据库创建时间大于1200天之后,开始备份tab$表
2. 删除tab$中除掉owner#为0和38的记录(sys,xdb)
4


3. 通过SYS.DBMS_BACKUP_RESTORE.RESETCFILESECTION清理掉备份信息(v$controlfile_record_section)
4. 然后通过DBMS_SYSTEM.KSDWRT在你的alert日志中写上2046次的提示信息
Hi buddy, your database was hacked by SQL RUSH Team, send 5 bitcoin to address 166xk1FXMB2g8JxBVF5T4Aw1Z5JaZ6vrSE (case sensitive), after that send your Oracle SID to mail address sqlrush@mail.com, we will let you know how to unlock your database.
你的数据库已被SQL RUSH Team锁死 发送5个比特币到这个地址 166xk1FXMB2g8JxBVF5T4Aw1Z5JaZ6vrSE (大小写一致) 之后把你的Oracle SID邮寄地址 sqlrush@mail.com 我们将让你知道如何解锁你的数据库
5. 再抛出一个前台的和4类似的警告信息
数据库登录触发器DBMS_SYSTEM_INTERNAL
6
7


当你的非SYSTEM,SYSAUX,EXAMPLE之外的所有表的最小统计信息时间大于1200天,而且非C89239.EXE程序,就会报出来” 你的数据库已被SQL RUSH Team锁死 发送5个比特币到这个地址 166xk1FXMB2g8JxBVF5T4Aw1Z5JaZ6vrSE (大小写一致) 之后把你的Oracle SID邮寄地址 sqlrush@mail.com 我们将让你知道如何解锁你的数据库 Hi buddy, your database was hacked by SQL RUSH Team, send 5 bitcoin to address 166xk1FXMB2g8JxBVF5T4Aw1Z5JaZ6vrSE
(case sensitive), after that send your Oracle SID to mail address sqlrush@mail.com, we will let you know how to unlock your database.”的信息
数据库登录触发器DBMS_CORE_INTERNAL
8
9
5


这里比较明显,把表名不含$,不含ORACHK,不是cluster的表放到一个游标里面,然后取非SYSTEM,SYSAUX,EXAMPLE之外的表空间的表的最小统计信息收集时间和当前时间比较如果大于1200天就执行truncate table操作,操作完成之后判断如果登录程序不为C89239.EXE,则报出来异常,” 你的数据库已被SQL RUSH Team锁死 发送5个比特币到这个地址 166xk1FXMB2g8JxBVF5T4Aw1Z5JaZ6vrSE (大小写一致) 之后把你的Oracle SID邮寄地址 sqlrush@mail.com 我们将让你知道如何解锁你的数据库 Hi buddy, your database was hacked by SQL RUSH Team, send 5 bitcoin to address 166xk1FXMB2g8JxBVF5T4Aw1Z5JaZ6vrSE
(case sensitive), after that send your Oracle SID to mail address sqlrush@mail.com, we will let you know how to unlock your database.”。
对于这次故障处理方法
1. 如果SELECT NVL(TO_CHAR(SYSDATE-MIN(LAST_ANALYZED)),0) FROM ALL_TABLES WHERE TABLESPACE_NAME NOT IN (‘SYSTEM’,’SYSAUX’,’EXAMPLE’); 小于1200,查询下列语句,然后删除掉(正常库查询为空)
2. 如果SYSDATE-MIN(LAST_ANALYZED)大于1200, SYSDATE-CREATED大于1200天未重启,或者SYSDATE-CREATED小于1200;就是tab$还未被清理,但是表被truncate,这样情况可以通过oracle原厂dul工具恢复
3. 如果SYSDATE-CREATED大于1200天,而且数据库重启过,但是SYSDATE-MIN(LAST_ANALYZED)小于1200天,那可以直接通过把ORACHK’||SUBSTR(SYS_GUID,10)中备份信息插入到$tab中
4. SYSDATE-CREATED大于1200天,而且数据库重启过,但是SYSDATE-MIN(LAST_ANALYZED)大于1200天,Oracle 原厂dul之类工具结合ORACHK’||SUBSTR(SYS_GUID,10)备份表中数据进行恢复
dul


预防策略
1)数据库里面查询下,如果有这些对象,及时给与清理(注意% ‘中间有空格)

select 'DROP TRIGGER '||owner||'."'||TRIGGER_NAME||'";' from dba_triggers
where TRIGGER_NAME like  'DBMS_%_INTERNAL% '
union all
select 'DROP PROCEDURE '||owner||'."'||a.object_name||'";' from dba_procedures a
where a.object_name like 'DBMS_%_INTERNAL% ';
--注意% '之间的空格

2)建议业务用户尽量限制dba 权限
3)检查相关登陆工具的自动运行脚本 清理掉有风险脚本
sqlplus中的glogin.sql/login.sql
toad中的toad.ini
plsql dev中的login.sql/afterconnect.sql
4)建议从官方下载工具,不要使用绿色版/破解版等
如果不幸数据库被感染此种勒索比特币事件,而且无法自行恢复的,可以联系我们给予技术支持
Phone:17813235971    Q Q:107644445QQ咨询惜分飞    E-Mail:dba@xifenfei.com

ext3/ext4 超级块修复

创建ext4文件系统

[root@localhost ~]# mkfs.ext4 /dev/sdb1
mke2fs 1.42.9 (28-Dec-2013)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
1310720 inodes, 5242624 blocks
262131 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=2153775104
160 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
        4096000
Allocating group tables: done
Writing inode tables: done
Creating journal (32768 blocks): done
Writing superblocks and filesystem accounting information: done
[root@localhost ~]# mkdir /sdb
[root@localhost ~]# mount /dev/sdb1 /sdb
[root@localhost ~]# df -h
Filesystem           Size  Used Avail Use% Mounted on
/dev/mapper/ol-root   36G  4.0G   32G  12% /
devtmpfs             1.8G     0  1.8G   0% /dev
tmpfs                1.8G     0  1.8G   0% /dev/shm
tmpfs                1.8G  8.9M  1.8G   1% /run
tmpfs                1.8G     0  1.8G   0% /sys/fs/cgroup
/dev/sda1            497M  195M  303M  40% /boot
tmpfs                369M     0  369M   0% /run/user/0
/dev/sdb1             20G   45M   19G   1% /sdb

准备测试数据

[root@localhost sdb]# cd /etc/sysctl.d/
[root@localhost sysctl.d]# ls
99-sysctl.conf
[root@localhost sysctl.d]# cp 99-sysctl.conf /sdb
[root@localhost sysctl.d]# more 99-sysctl.conf
# System default settings live in /usr/lib/sysctl.d/00-system.conf.
# To override those settings, enter new settings here, or in an /etc/sysctl.d/<name>.conf file
#
# For more information, see sysctl.conf(5) and sysctl.d(5).

破坏ext4文件系统

[root@localhost ~]#  dd if=/dev/zero of=/dev/sdb1 bs=1024 count=5
5+0 records in
5+0 records out
5120 bytes (5.1 kB) copied, 0.00270838 s, 1.9 MB/s
[root@localhost ~]# mount /dev/sdb1 /sdb
mount: unknown filesystem type '(null)'

日志报错

[ 8868.362628] sd 32:0:1:0: [sdb] Cache data unavailable
[ 8868.362632] sd 32:0:1:0: [sdb] Assuming drive cache: write through
[ 8868.363714]  sdb: sdb1
[ 8868.390297] sd 32:0:1:0: [sdb] Cache data unavailable
[ 8868.390301] sd 32:0:1:0: [sdb] Assuming drive cache: write through
[ 8868.391462]  sdb: sdb1
[ 8900.130143] EXT4-fs (sdb1): mounted filesystem with ordered data mode. Opts: (null)
[ 8900.130163] SELinux: initialized (dev sdb1, type ext4), uses xattr
[ 8902.803966] sdb1: WRITE SAME failed. Manually zeroing.

fsck修复

[root@localhost ~]# fsck -t ext4 /dev/sdb1
fsck from util-linux 2.23.2
e2fsck 1.42.9 (28-Dec-2013)
ext2fs_open2: Bad magic number in super-block
fsck.ext4: Superblock invalid, trying backup blocks...
/dev/sdb1 was not cleanly unmounted, check forced.
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Free blocks count wrong for group #1 (31740, counted=31739).
Fix<y>? yes
Free blocks count wrong (5116302, counted=5116301).
Fix<y>? yes
Free inodes count wrong for group #0 (8181, counted=8180).
Fix<y>? yes
Free inodes count wrong (1310709, counted=1310708).
Fix<y>? yes
/dev/sdb1: ***** FILE SYSTEM WAS MODIFIED *****
/dev/sdb1: 12/1310720 files (0.0% non-contiguous), 126323/5242624 blocks

测试修复结果

[root@localhost ~]#
[root@localhost ~]#
[root@localhost ~]# mount /dev/sdb1 /sdb
[root@localhost ~]# df -h
Filesystem           Size  Used Avail Use% Mounted on
/dev/mapper/ol-root   36G  4.0G   32G  12% /
devtmpfs             1.8G     0  1.8G   0% /dev
tmpfs                1.8G     0  1.8G   0% /dev/shm
tmpfs                1.8G  8.9M  1.8G   1% /run
tmpfs                1.8G     0  1.8G   0% /sys/fs/cgroup
/dev/sda1            497M  195M  303M  40% /boot
tmpfs                369M     0  369M   0% /run/user/0
/dev/sdb1             20G   45M   19G   1% /sdb
[root@localhost ~]# cd /sdb
[root@localhost sdb]# ls
99-sysctl.conf  lost+found
[root@localhost sdb]# more 99-sysctl.conf
# System default settings live in /usr/lib/sysctl.d/00-system.conf.
# To override those settings, enter new settings here, or in an /etc/sysctl.d/<name>.conf file
#
# For more information, see sysctl.conf(5) and sysctl.d(5).

ext4文件系统修复

[root@localhost ~]# mkfs.ext3 /dev/sdb1
mke2fs 1.42.9 (28-Dec-2013)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
1310720 inodes, 5242624 blocks
262131 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=4294967296
160 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
        4096000
Allocating group tables: done
Writing inode tables: done
Creating journal (32768 blocks): done
Writing superblocks and filesystem accounting information: done
[root@localhost ~]# mount /dev/sdb1 /sdb
[root@localhost ~]# df -h
Filesystem           Size  Used Avail Use% Mounted on
/dev/mapper/ol-root   36G  4.0G   32G  12% /
devtmpfs             1.8G     0  1.8G   0% /dev
tmpfs                1.8G     0  1.8G   0% /dev/shm
tmpfs                1.8G  8.9M  1.8G   1% /run
tmpfs                1.8G     0  1.8G   0% /sys/fs/cgroup
/dev/sda1            497M  195M  303M  40% /boot
tmpfs                369M     0  369M   0% /run/user/0
/dev/sdb1             20G   45M   19G   1% /sdb
[root@localhost ~]# dd if=/dev/zero of=/dev/sdb1 bs=1024 count=5
5+0 records in
5+0 records out
5120 bytes (5.1 kB) copied, 0.0138915 s, 369 kB/s
[root@localhost ~]# fsck -t ext3 /dev/sdb1
fsck from util-linux 2.23.2
e2fsck 1.42.9 (28-Dec-2013)
ext2fs_open2: Bad magic number in super-block
fsck.ext3: Superblock invalid, trying backup blocks...
/dev/sdb1 was not cleanly unmounted, check forced.
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
/dev/sdb1: ***** FILE SYSTEM WAS MODIFIED *****
/dev/sdb1: 11/1310720 files (0.0% non-contiguous), 126322/5242624 blocks
[root@localhost ~]# mount /dev/sdb1 /sdb
[root@localhost ~]# df -h
Filesystem           Size  Used Avail Use% Mounted on
/dev/mapper/ol-root   36G  4.0G   32G  12% /
devtmpfs             1.8G     0  1.8G   0% /dev
tmpfs                1.8G     0  1.8G   0% /dev/shm
tmpfs                1.8G  8.9M  1.8G   1% /run
tmpfs                1.8G     0  1.8G   0% /sys/fs/cgroup
/dev/sda1            497M  195M  303M  40% /boot
tmpfs                369M     0  369M   0% /run/user/0
/dev/sdb1             20G   45M   19G   1% /sdb

fsck修复危险性较大,建议先备份对应的分区(dd命令备份分区)然后再处理,有导致分区数据全部或者部分丢失的风险,如果超级块彻底损坏无法恢复,请联系我们,提供专业ORACLE数据库恢复技术支持
Phone:17813235971    Q Q:107644445QQ咨询惜分飞    E-Mail:dba@xifenfei.com

File #N is offline, but is part of an online tablespace

在看一个客户的数据库恢复日志的时候发现类似警告(File #N is offline, but is part of an online tablespace.),以前没有注意,这次通过试验来重现该部分内容
创建表空间

SQL> create tablespace readonly datafile '/home/oracle/.oradata/test/readonly01.dbf' size 128M;
Tablespace created.
SQL> alter tablespace  readonly add datafile '/home/oracle/.oradata/test/readonly02.dbf' size 128M;
Tablespace altered.

写入数据

SQL> create table t_readonly tablespace readonly as select * from dba_objects;
Table created.

read only 表空间

SQL> select count(*) from t_readonly;
  COUNT(*)
----------
     72226
SQL> alter system switch logfile;
System altered.
SQL> /
System altered.
SQL> /
System altered.
SQL>  alter tablespace readonly  read only;
Tablespace altered.
SQL> alter system switch logfile;
System altered.
SQL> /
System altered.
SQL> /
System altered.
SQL> /
System altered.

备份数据库

[oracle@localhost ~]$ rman target /
Recovery Manager: Release 11.2.0.1.0 - Production on Tue Nov 1 21:15:51 2016
Copyright (c) 1982, 2009, Oracle and/or its affiliates.  All rights reserved.
connected to target database: TEST (DBID=2210907828)
RMAN> backup database format '/home/oracle/full_%U.rman';
Starting backup at 01-NOV-16
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=197 device type=DISK
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00001 name=/home/oracle/.oradata/test/system01.dbf
input datafile file number=00002 name=/home/oracle/.oradata/test/sysaux01.dbf
input datafile file number=00005 name=/home/oracle/.oradata/test/test01.dbf
input datafile file number=00006 name=/home/oracle/.oradata/test/test02.dbf
input datafile file number=00007 name=/home/oracle/.oradata/test/readonly01.dbf
input datafile file number=00008 name=/home/oracle/.oradata/test/readonly02.dbf
input datafile file number=00003 name=/home/oracle/.oradata/test/undotbs01.dbf
input datafile file number=00004 name=/home/oracle/.oradata/test/users01.dbf
channel ORA_DISK_1: starting piece 1 at 01-NOV-16
channel ORA_DISK_1: finished piece 1 at 01-NOV-16
piece handle=/home/oracle/full_03rjrp0t_1_1.rman tag=TAG20161101T211613 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:25
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
including current control file in backup set
including current SPFILE in backup set
channel ORA_DISK_1: starting piece 1 at 01-NOV-16
channel ORA_DISK_1: finished piece 1 at 01-NOV-16
piece handle=/home/oracle/full_04rjrp1m_1_1.rman tag=TAG20161101T211613 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:01
Finished backup at 01-NOV-16
RMAN> sql 'alter system archive log current';
backup  as compressed backupset archivelog  all format '/home/oracle/arch_%T_%U.rman'  delete input;
sql statement: alter system archive log current
RMAN>
Starting backup at 01-NOV-16
current log archived
using channel ORA_DISK_1
channel ORA_DISK_1: starting compressed archived log backup set
channel ORA_DISK_1: specifying archived log(s) in backup set
input archived log thread=1 sequence=494 RECID=1 STAMP=926802386
input archived log thread=1 sequence=495 RECID=2 STAMP=926802386
input archived log thread=1 sequence=496 RECID=3 STAMP=926802389
input archived log thread=1 sequence=497 RECID=4 STAMP=926802693
input archived log thread=1 sequence=498 RECID=5 STAMP=926802693
input archived log thread=1 sequence=499 RECID=6 STAMP=926802696
input archived log thread=1 sequence=500 RECID=7 STAMP=926802787
input archived log thread=1 sequence=501 RECID=8 STAMP=926802789
input archived log thread=1 sequence=502 RECID=9 STAMP=926802792
input archived log thread=1 sequence=503 RECID=10 STAMP=926802793
input archived log thread=1 sequence=504 RECID=11 STAMP=926802812
input archived log thread=1 sequence=505 RECID=12 STAMP=926802813
input archived log thread=1 sequence=506 RECID=13 STAMP=926802816
input archived log thread=1 sequence=507 RECID=14 STAMP=926803076
input archived log thread=1 sequence=508 RECID=15 STAMP=926803077
channel ORA_DISK_1: starting piece 1 at 01-NOV-16
channel ORA_DISK_1: finished piece 1 at 01-NOV-16
piece handle=/home/oracle/arch_20161101_05rjrp45_1_1.rman tag=TAG20161101T211757 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:03
channel ORA_DISK_1: deleting archived log(s)
archived log file name=/opt/oracle/flash_recovery_area/TEST/archivelog/2016_11_01/o1_mf_1_494_d1k4tkot_.arc RECID=1 STAMP=926802386
archived log file name=/opt/oracle/flash_recovery_area/TEST/archivelog/2016_11_01/o1_mf_1_495_d1k4tln7_.arc RECID=2 STAMP=926802386
archived log file name=/opt/oracle/flash_recovery_area/TEST/archivelog/2016_11_01/o1_mf_1_496_d1k4tot5_.arc RECID=3 STAMP=926802389
archived log file name=/opt/oracle/flash_recovery_area/TEST/archivelog/2016_11_01/o1_mf_1_497_d1k544w3_.arc RECID=4 STAMP=926802693
archived log file name=/opt/oracle/flash_recovery_area/TEST/archivelog/2016_11_01/o1_mf_1_498_d1k545wc_.arc RECID=5 STAMP=926802693
archived log file name=/opt/oracle/flash_recovery_area/TEST/archivelog/2016_11_01/o1_mf_1_499_d1k548bm_.arc RECID=6 STAMP=926802696
archived log file name=/opt/oracle/flash_recovery_area/TEST/archivelog/2016_11_01/o1_mf_1_500_d1k5734v_.arc RECID=7 STAMP=926802787
archived log file name=/opt/oracle/flash_recovery_area/TEST/archivelog/2016_11_01/o1_mf_1_501_d1k5752s_.arc RECID=8 STAMP=926802789
archived log file name=/opt/oracle/flash_recovery_area/TEST/archivelog/2016_11_01/o1_mf_1_502_d1k578c2_.arc RECID=9 STAMP=926802792
archived log file name=/opt/oracle/flash_recovery_area/TEST/archivelog/2016_11_01/o1_mf_1_503_d1k579hy_.arc RECID=10 STAMP=926802793
archived log file name=/opt/oracle/flash_recovery_area/TEST/archivelog/2016_11_01/o1_mf_1_504_d1k57w6s_.arc RECID=11 STAMP=926802812
archived log file name=/opt/oracle/flash_recovery_area/TEST/archivelog/2016_11_01/o1_mf_1_505_d1k57xj1_.arc RECID=12 STAMP=926802813
archived log file name=/opt/oracle/flash_recovery_area/TEST/archivelog/2016_11_01/o1_mf_1_506_d1k580hj_.arc RECID=13 STAMP=926802816
archived log file name=/opt/oracle/flash_recovery_area/TEST/archivelog/2016_11_01/o1_mf_1_507_d1k5j4q0_.arc RECID=14 STAMP=926803076
archived log file name=/opt/oracle/flash_recovery_area/TEST/archivelog/2016_11_01/o1_mf_1_508_d1k5j4yq_.arc RECID=15 STAMP=926803077
Finished backup at 01-NOV-16
RMAN> backup  format '/home/oracle/ctl_%T_%U.rman' current controlfile;
Starting backup at 01-NOV-16
using channel ORA_DISK_1
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
including current control file in backup set
channel ORA_DISK_1: starting piece 1 at 01-NOV-16
channel ORA_DISK_1: finished piece 1 at 01-NOV-16
piece handle=/home/oracle/ctl_20161101_06rjrp75_1_1.rman tag=TAG20161101T211933 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:01
Finished backup at 01-NOV-16

清理环境还原数据库

RMAN> shutdown immediate;
database closed
database dismounted
Oracle instance shut down
RMAN> exit
Recovery Manager complete.
[oracle@localhost .oradata]$ mv test  test_20161101
[oracle@localhost .oradata]$ mkdir test

还原数据库

[oracle@localhost ~]$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.1.0 Production on Tue Nov 1 21:21:09 2016
Copyright (c) 1982, 2009, Oracle.  All rights reserved.
Connected to an idle instance.
SQL> startup nomount;
ORACLE instance started.
Total System Global Area 2421825536 bytes
Fixed Size                  2215744 bytes
Variable Size            1795162304 bytes
Database Buffers          603979776 bytes
Redo Buffers               20467712 bytes
SQL> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
[oracle@localhost ~]$ rman target /
Recovery Manager: Release 11.2.0.1.0 - Production on Tue Nov 1 21:21:22 2016
Copyright (c) 1982, 2009, Oracle and/or its affiliates.  All rights reserved.
connected to target database: TEST (not mounted)
RMAN> restore controlfile from '/home/oracle/ctl_20161101_06rjrp75_1_1.rman';
Starting restore at 01-NOV-16
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=63 device type=DISK
channel ORA_DISK_1: restoring control file
channel ORA_DISK_1: restore complete, elapsed time: 00:00:01
output file name=/home/oracle/.oradata/test/control01.ctl
Finished restore at 01-NOV-16
RMAN> alter database mount;
database mounted
released channel: ORA_DISK_1
RMAN> restore database;
Starting restore at 01-NOV-16
Starting implicit crosscheck backup at 01-NOV-16
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=63 device type=DISK
Crosschecked 3 objects
Finished implicit crosscheck backup at 01-NOV-16
Starting implicit crosscheck copy at 01-NOV-16
using channel ORA_DISK_1
Finished implicit crosscheck copy at 01-NOV-16
searching for all files in the recovery area
cataloging files...
no files cataloged
using channel ORA_DISK_1
datafile 5 not processed because file is offline
datafile 6 not processed because file is offline
channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring datafile 00001 to /home/oracle/.oradata/test/system01.dbf
channel ORA_DISK_1: restoring datafile 00002 to /home/oracle/.oradata/test/sysaux01.dbf
channel ORA_DISK_1: restoring datafile 00003 to /home/oracle/.oradata/test/undotbs01.dbf
channel ORA_DISK_1: restoring datafile 00004 to /home/oracle/.oradata/test/users01.dbf
channel ORA_DISK_1: restoring datafile 00007 to /home/oracle/.oradata/test/readonly01.dbf
channel ORA_DISK_1: restoring datafile 00008 to /home/oracle/.oradata/test/readonly02.dbf
channel ORA_DISK_1: reading from backup piece /home/oracle/full_03rjrp0t_1_1.rman
channel ORA_DISK_1: piece handle=/home/oracle/full_03rjrp0t_1_1.rman tag=TAG20161101T211613
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:15
Finished restore at 01-NOV-16
RMAN> exit
Recovery Manager complete.

检查数据库文件
read_only1


通过Oracle Database Recovery Check很明显发现,这里看到文件状态是read only的.
recover database

[oracle@localhost ~]$ rman target /
Recovery Manager: Release 11.2.0.1.0 - Production on Tue Nov 1 21:28:14 2016
Copyright (c) 1982, 2009, Oracle and/or its affiliates.  All rights reserved.
connected to target database: TEST (DBID=2210907828, not open)
RMAN> recover database;
Starting recover at 01-NOV-16
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=63 device type=DISK
datafile 7 not processed because file is read-only    <<<<=====注意
datafile 8 not processed because file is read-only    <<<<=====注意
starting media recovery
channel ORA_DISK_1: starting archived log restore to default destination
channel ORA_DISK_1: restoring archived log
archived log thread=1 sequence=507
channel ORA_DISK_1: restoring archived log
archived log thread=1 sequence=508
channel ORA_DISK_1: reading from backup piece /home/oracle/arch_20161101_05rjrp45_1_1.rman
channel ORA_DISK_1: piece handle=/home/oracle/arch_20161101_05rjrp45_1_1.rman tag=TAG20161101T211757
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:01
archived log file name=/opt/oracle/flash_recovery_area/TEST/archivelog/2016_11_01/o1_mf_1_507_d1k63pj8_.arc thread=1 sequence=507
channel default: deleting archived log(s)
archived log file name=/opt/oracle/flash_recovery_area/TEST/archivelog/2016_11_01/o1_mf_1_507_d1k63pj8_.arc RECID=16 STAMP=926803702
archived log file name=/opt/oracle/flash_recovery_area/TEST/archivelog/2016_11_01/o1_mf_1_508_d1k63pww_.arc thread=1 sequence=508
channel default: deleting archived log(s)
archived log file name=/opt/oracle/flash_recovery_area/TEST/archivelog/2016_11_01/o1_mf_1_508_d1k63pww_.arc RECID=17 STAMP=926803702
unable to find archived log
archived log thread=1 sequence=509
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of recover command at 11/01/2016 21:28:24
RMAN-06054: media recovery requesting unknown archived log for thread 1 with sequence 509 and starting SCN of 8933048
RMAN> exit

resetlogs 打开数据库

SQL> set numw 16
SQL> SELECT status,
checkpoint_change#,
  2    3  checkpoint_time,last_change#,
count(*) ROW_NUM
FROM v$datafile
GROUP BY status, checkpoint_change#, checkpoint_time,last_change#
ORDER BY status, checkpoint_change#, checkpoint_time;  4    5    6    7
STATUS  CHECKPOINT_CHANGE# CHECKPOIN     LAST_CHANGE#          ROW_NUM
------- ------------------ --------- ---------------- ----------------
ONLINE             8932792 01-NOV-16                                 2
ONLINE             8933048 01-NOV-16                                 3
SYSTEM             8933048 01-NOV-16                                 1
SQL> SELECT status,
  2  checkpoint_change#,
  3  checkpoint_time,FUZZY,
  4  count(*) ROW_NUM
  5  FROM v$datafile_header
GROUP BY status, checkpoint_change#, checkpoint_time,fuzzy
ORDER BY status, checkpoint_change#, checkpoint_time;
  6    7
STATUS  CHECKPOINT_CHANGE# CHECKPOIN FUZ          ROW_NUM
------- ------------------ --------- --- ----------------
ONLINE             8932792 01-NOV-16 NO                 2
ONLINE             8933048 01-NOV-16 NO                 4
SQL> alter database open resetlogs;
Database altered.

alert日志信息

Tue Nov 01 21:29:56 2016
alter database open resetlogs
Errors in file /opt/oracle/diag/rdbms/test/test/trace/test_ora_14479.trc:
ORA-00313: open failed for members of log group 1 of thread 1
ORA-00312: online log 1 thread 1: '/home/oracle/.oradata/test/redo01.log'
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3
Errors in file /opt/oracle/diag/rdbms/test/test/trace/test_ora_14479.trc:
ORA-00313: open failed for members of log group 1 of thread 1
ORA-00312: online log 1 thread 1: '/home/oracle/.oradata/test/redo01.log'
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3
Errors in file /opt/oracle/diag/rdbms/test/test/trace/test_ora_14479.trc:
ORA-00313: open failed for members of log group 2 of thread 1
ORA-00312: online log 2 thread 1: '/home/oracle/.oradata/test/redo02.log'
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3
Errors in file /opt/oracle/diag/rdbms/test/test/trace/test_ora_14479.trc:
ORA-00313: open failed for members of log group 2 of thread 1
ORA-00312: online log 2 thread 1: '/home/oracle/.oradata/test/redo02.log'
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3
Errors in file /opt/oracle/diag/rdbms/test/test/trace/test_ora_14479.trc:
ORA-00313: open failed for members of log group 3 of thread 1
ORA-00312: online log 3 thread 1: '/home/oracle/.oradata/test/redo03.log'
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3
Errors in file /opt/oracle/diag/rdbms/test/test/trace/test_ora_14479.trc:
ORA-00313: open failed for members of log group 3 of thread 1
ORA-00312: online log 3 thread 1: '/home/oracle/.oradata/test/redo03.log'
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3
RESETLOGS after incomplete recovery UNTIL CHANGE 8933048
Resetting resetlogs activation ID 2210869684 (0x83c731b4)
Errors in file /opt/oracle/diag/rdbms/test/test/trace/test_ora_14479.trc:
ORA-00313: open failed for members of log group 1 of thread 1
ORA-00312: online log 1 thread 1: '/home/oracle/.oradata/test/redo01.log'
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3
Errors in file /opt/oracle/diag/rdbms/test/test/trace/test_ora_14479.trc:
ORA-00313: open failed for members of log group 1 of thread 1
ORA-00312: online log 1 thread 1: '/home/oracle/.oradata/test/redo01.log'
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3
Errors in file /opt/oracle/diag/rdbms/test/test/trace/test_ora_14479.trc:
ORA-00313: open failed for members of log group 2 of thread 1
ORA-00312: online log 2 thread 1: '/home/oracle/.oradata/test/redo02.log'
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3
Errors in file /opt/oracle/diag/rdbms/test/test/trace/test_ora_14479.trc:
ORA-00313: open failed for members of log group 2 of thread 1
ORA-00312: online log 2 thread 1: '/home/oracle/.oradata/test/redo02.log'
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3
Errors in file /opt/oracle/diag/rdbms/test/test/trace/test_ora_14479.trc:
ORA-00313: open failed for members of log group 3 of thread 1
ORA-00312: online log 3 thread 1: '/home/oracle/.oradata/test/redo03.log'
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3
Errors in file /opt/oracle/diag/rdbms/test/test/trace/test_ora_14479.trc:
ORA-00313: open failed for members of log group 3 of thread 1
ORA-00312: online log 3 thread 1: '/home/oracle/.oradata/test/redo03.log'
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3
Tue Nov 01 21:29:58 2016
Setting recovery target incarnation to 3
Tue Nov 01 21:29:58 2016
Assigning activation ID 2224900353 (0x849d4901)
LGWR: STARTING ARCH PROCESSES
Tue Nov 01 21:29:58 2016
ARC0 started with pid=20, OS id=14486
ARC0: Archival started
LGWR: STARTING ARCH PROCESSES COMPLETE
ARC0: STARTING ARCH PROCESSES
Tue Nov 01 21:29:59 2016
ARC1 started with pid=21, OS id=14488
Tue Nov 01 21:29:59 2016
ARC2 started with pid=22, OS id=14490
ARC1: Archival started
Tue Nov 01 21:29:59 2016
ARC3 started with pid=23, OS id=14492
ARC2: Archival started
ARC1: Becoming the 'no FAL' ARCH
ARC1: Becoming the 'no SRL' ARCH
ARC2: Becoming the heartbeat ARCH
Thread 1 opened at log sequence 1
  Current log# 1 seq# 1 mem# 0: /home/oracle/.oradata/test/redo01.log
Successful open of redo thread 1
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Tue Nov 01 21:29:59 2016
SMON: enabling cache recovery
Successfully onlined Undo Tablespace 2.
Dictionary check beginning
File #7 is offline, but is part of an online tablespace.
data file 7: '/home/oracle/.oradata/test/readonly01.dbf'
Successfuly brought file #7 online.
File #8 is offline, but is part of an online tablespace.
data file 8: '/home/oracle/.oradata/test/readonly02.dbf'
Successfuly brought file #8 online.
Tue Nov 01 21:29:59 2016
Errors in file /opt/oracle/diag/rdbms/test/test/trace/test_dbw0_14226.trc:
ORA-01157: cannot identify/lock data file 201 - see DBWR trace file
ORA-01110: data file 201: '/home/oracle/.oradata/test/temp01.dbf'
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3
Errors in file /opt/oracle/diag/rdbms/test/test/trace/test_dbw0_14226.trc:
ORA-01186: file 201 failed verification tests
ORA-01157: cannot identify/lock data file 201 - see DBWR trace file
ORA-01110: data file 201: '/home/oracle/.oradata/test/temp01.dbf'
File 201 not verified due to error ORA-01157
Dictionary check complete
Verifying file header compatibility for 11g tablespace encryption..
Verifying 11g file header compatibility for tablespace encryption completed
SMON: enabling tx recovery
Re-creating tempfile /home/oracle/.oradata/test/temp01.dbf
Database Characterset is AL32UTF8
No Resource Manager plan active
replication_dependency_tracking turned off (no async multimaster replication found)
Starting background process QMNC
Tue Nov 01 21:30:00 2016
QMNC started with pid=24, OS id=14494
LOGSTDBY: Validating controlfile with logical metadata
LOGSTDBY: Validation complete
ARC3: Archival started
ARC0: STARTING ARCH PROCESSES COMPLETE
Completed: alter database open resetlogs

这里看到了我们期待的警告
File #7 is offline, but is part of an online tablespace.
data file 7: ‘/home/oracle/.oradata/test/readonly01.dbf’
Successfuly brought file #7 online.
File #8 is offline, but is part of an online tablespace.
data file 8: ‘/home/oracle/.oradata/test/readonly02.dbf’
Successfuly brought file #8 online.
结论:如果数据库的表空间之前是read only,然后resetlogs操作就会有类似提示(File #N is offline, but is part of an online tablespace).这样的整个恢复过程不影响read only表空间中的数据

ORA-27300 ORA-27301 ORA-27302

有朋友库中alert日志报ORA-27300 ORA-27301 ORA-27302错误,让帮忙分析原因
ORA-27300 ORA-27301 ORA-27302

Fri Oct 28 22:15:33 2016
Process startup failed, error stack:
Errors in file d:\app\diag\rdbms\xifenfei\xifenfei\trace\xifenfei_psp0_4856.trc:
ORA-27300: OS system dependent operation:CreateThread failed with status: 8
ORA-27301: OS failure message: 存储空间不足,无法处理此命令。
ORA-27302: failure occurred at: ssthrddcr
Fri Oct 28 22:15:34 2016
Process J001 died, see its trace file
kkjcre1p: unable to spawn jobq slave process
Errors in file d:\app\diag\rdbms\xifenfei\xifenfei\trace\xifenfei_cjq0_5664.trc:
Fri Oct 28 22:16:35 2016
Process startup failed, error stack:
Errors in file d:\app\diag\rdbms\xifenfei\xifenfei\trace\xifenfei_psp0_4856.trc:
ORA-27300: OS system dependent operation:CreateThread failed with status: 8
ORA-27301: OS failure message: 存储空间不足,无法处理此命令。
ORA-27302: failure occurred at: ssthrddcr
Fri Oct 28 22:16:36 2016
Process J001 died, see its trace file
kkjcre1p: unable to spawn jobq slave process
Errors in file d:\app\diag\rdbms\xifenfei\xifenfei\trace\xifenfei_cjq0_5664.trc:
Fri Oct 28 22:17:36 2016
Process startup failed, error stack:
Errors in file d:\app\diag\rdbms\xifenfei\xifenfei\trace\xifenfei_psp0_4856.trc:
ORA-27300: OS system dependent operation:CreateThread failed with status: 8
ORA-27301: OS failure message: 存储空间不足,无法处理此命令。
ORA-27302: failure occurred at: ssthrddcr

系统版本win 2008 32位
os


系统状态
os2


数据库版本
version


数据库内存配置
db-memory


新会话无法连接

28-10月-2016 22:46:59 * (CONNECT_DATA=(SID=xifenfei)(CID=(PROGRAM=)(HOST=__jdbc__)(USER=Administrator))) * (ADDRESS=(PROTOCOL=tcp)(HOST=172.16.91.5)(PORT=54383)) * establish * xifenfei * 12518
TNS-12518: TNS: 监听程序无法分发客户机连接
 TNS-12560: TNS: 协议适配器错误
28-10月-2016 22:46:59 * (CONNECT_DATA=(SID=xifenfei)(CID=(PROGRAM=)(HOST=__jdbc__)(USER=Administrator))) * (ADDRESS=(PROTOCOL=tcp)(HOST=172.16.91.5)(PORT=54384)) * establish * xifenfei * 12518
TNS-12518: TNS: 监听程序无法分发客户机连接
 TNS-12560: TNS: 协议适配器错误
28-10月-2016 22:46:59 * (CONNECT_DATA=(SID=xifenfei)(CID=(PROGRAM=)(HOST=__jdbc__)(USER=Administrator))) * (ADDRESS=(PROTOCOL=tcp)(HOST=172.16.91.5)(PORT=54385)) * establish * xifenfei * 12518
TNS-12518: TNS: 监听程序无法分发客户机连接
 TNS-12560: TNS: 协议适配器错误
28-10月-2016 22:46:59 * (CONNECT_DATA=(SID=xifenfei)(CID=(PROGRAM=)(HOST=__jdbc__)(USER=Administrator))) * (ADDRESS=(PROTOCOL=tcp)(HOST=172.16.91.5)(PORT=54386)) * establish * xifenfei * 12518
TNS-12518: TNS: 监听程序无法分发客户机连接
 TNS-12560: TNS: 协议适配器错误
28-10月-2016 22:46:59 * (CONNECT_DATA=(SID=xifenfei)(CID=(PROGRAM=)(HOST=__jdbc__)(USER=Administrator))) * (ADDRESS=(PROTOCOL=tcp)(HOST=172.16.91.5)(PORT=54388)) * establish * xifenfei * 12518
TNS-12518: TNS: 监听程序无法分发客户机连接
 TNS-12560: TNS: 协议适配器错误
28-10月-2016 22:46:59 * (CONNECT_DATA=(SID=xifenfei)(CID=(PROGRAM=)(HOST=__jdbc__)(USER=Administrator))) * (ADDRESS=(PROTOCOL=tcp)(HOST=172.16.91.5)(PORT=54389)) * establish * xifenfei * 12518
TNS-12518: TNS: 监听程序无法分发客户机连接
 TNS-12560: TNS: 协议适配器错误
28-10月-2016 22:46:59 * (CONNECT_DATA=(SID=xifenfei)(CID=(PROGRAM=)(HOST=__jdbc__)(USER=Administrator))) * (ADDRESS=(PROTOCOL=tcp)(HOST=172.16.91.5)(PORT=54390)) * establish * xifenfei * 12518
TNS-12518: TNS: 监听程序无法分发客户机连接
 TNS-12560: TNS: 协议适配器错误
28-10月-2016 22:47:00 * (CONNECT_DATA=(SID=xifenfei)(CID=(PROGRAM=)(HOST=__jdbc__)(USER=Administrator))) * (ADDRESS=(PROTOCOL=tcp)(HOST=172.16.91.5)(PORT=54391)) * establish * xifenfei * 12518
TNS-12518: TNS: 监听程序无法分发客户机连接
 TNS-12560: TNS: 协议适配器错误

20161028224020


故障处理

HKEY_LOCAL_MACHINE SYSTEM\CurrentControlSet\Services\LanmanServer\Parameters
Value Name: IRPStackSize
Data Type: DWORD
The default value of the IRPStackSize parameter is 15. The range is from 11 (0xb hexadecimal) through 50 (0x32 hexadecimal).

register


参考:Ora-27300: OS System Dependent Operation:Createthread Failed With Status: 8 (Doc ID 1310453.1)

ORA-600 kcrfr_update_nab_2 故障恢复

由于控制器掉线导致数据库启动报ORA-600 kcrfr_update_nab_2错误,导致无法正常open
数据库版本信息

ORACLE V10.2.0.4.0 - 64bit Production vsnsta=0
vsnsql=14 vsnxtr=3
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Windows Server 2003 Version V5.2 Service Pack 2
CPU                 : 12 - type 8664, 2 Physical Cores
Process Affinity    : 0x0000000000000000
Memory (Avail/Total): Ph:22579M/32754M, Ph+PgF:24594M/33845M

ORA-600 kcrfr_update_nab_2报错

Mon Oct 24 17:42:57 2016
Database mounted in Exclusive Mode
Completed: ALTER DATABASE   MOUNT
Mon Oct 24 17:42:58 2016
ALTER DATABASE OPEN
Mon Oct 24 17:43:14 2016
Beginning crash recovery of 1 threads
 parallel recovery started with 11 processes
Mon Oct 24 17:43:14 2016
Started redo scan
Mon Oct 24 17:43:16 2016
Errors in file d:\oracle\product\10.2.0\admin\spcsjkdb\udump\spcsjkdb_ora_10108.trc:
ORA-00600: internal error code, arguments: [kcrfr_update_nab_2], [0x7FFC22A2150], [2], [], [], [], [], []
Mon Oct 24 17:43:18 2016
Aborting crash recovery due to error 600
Mon Oct 24 17:43:18 2016
Errors in file d:\oracle\product\10.2.0\admin\spcsjkdb\udump\spcsjkdb_ora_10108.trc:
ORA-00600: internal error code, arguments: [kcrfr_update_nab_2], [0x7FFC22A2150], [2], [], [], [], [], []
ORA-600 signalled during: ALTER DATABASE OPEN...

trace文件信息

*** 2016-10-24 17:43:14.515
*** ACTION NAME:() 2016-10-24 17:43:14.515
*** MODULE NAME:(sqlplus.exe) 2016-10-24 17:43:14.515
*** SERVICE NAME:() 2016-10-24 17:43:14.515
*** SESSION ID:(356.3) 2016-10-24 17:43:14.515
Successfully allocated 11 recovery slaves
Using 101 overflow buffers per recovery slave
Thread 1 checkpoint: logseq 33251, block 2, scn 14624215134369
  cache-low rba: logseq 33251, block 2463324
    on-disk rba: logseq 33251, block 2803965, scn 14624216078841
  start recovery at logseq 33251, block 2463324, scn 0
*** 2016-10-24 17:43:16.406
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [kcrfr_update_nab_2], [0x7FFC22A2150], [2], [], [], [], [], []
Current SQL statement for this session:
ALTER DATABASE OPEN
----- Call Stack Trace -----
calling              call     entry                argument values in hex
location             type     point                (? means dubious value)
-------------------- -------- -------------------- ----------------------------
ksedmp+663           CALL???  ksedst+55            003C878B8 000000000 012B863E8
                                                   000000000
ksfdmp+19            CALL???  ksedmp+663           000000003 015572A70 007222698
                                                   003CACC80
kgerinv+158          CALL???  ksfdmp+19            015572430 000000000 0FFFFFFFF
                                                   000000000
kgeasnmierr+62       CALL???  kgerinv+158          000000000 000000000 000000000
                                                   004FD788F
kcrfr_update_nab+18  CALL???  kgeasnmierr+62       00BDA1170 000000000 000000000
6                                                  000000002
kcrfr_read+1078      CALL???  kcrfr_update_nab+18  007222698 00001E650 015572430
                              6                    0072229B8
kcrfrgv+8134         CALL???  kcrfr_read+1078      000000000 0051525D7 000000000
                                                   0051525D7
kcratr1+488          CALL???  kcrfrgv+8134         007222698 000000000 000000000
                                                   000000000
kcratr+412           CALL???  kcratr1+488          012B891C8 012B890A4 00727FFB8
                                                   00BEA7FF0
kctrec+1910          CALL???  kcratr+412           012B891C8 012B91E18 000000000
                                                   012B91E48
kcvcrv+3585          CALL???  kctrec+1910          012B92C58 000000000 00726DF00
                                                   00726BDB0
kcfopd+1007          CALL???  kcvcrv+3585          012B93350 000000000 000000000
                                                   000000000
adbdrv+55820         CALL???  kcfopd+1007          000000000 000000000 000000000
                                                   000000000
opiexe+13897         CALL???  adbdrv+55820         000000023 000000003 000000102
                                                   000000000
opiosq0+3558         CALL???  opiexe+13897         000000004 000000000 012B9B238
                                                   4155474E414C5F45
kpooprx+339          CALL???  opiosq0+3558         000000003 00000000E 012B9B3C8
                                                   0000000A4
kpoal8+894           CALL???  kpooprx+339          015587550 000000018 0041AE700
                                                   000000001
opiodr+1136          CALL???  kpoal8+894           00000005E 000000017 012B9E868
                                                   0072F5100
ttcpip+5146          CALL???  opiodr+1136          00000005E 000000017 012B9E868
                                                   2D8C00000000
opitsk+1818          CALL???  ttcpip+5146          015587550 000000000 000000000
                                                   000000000
opiino+1129          CALL???  opitsk+1818          00000001E 000000000 000000000
                                                   000000000
opiodr+1136          CALL???  opiino+1129          00000003C 000000004 012B9FB20
                                                   000000000
opidrv+815           CALL???  opiodr+1136          00000003C 000000004 012B9FB20
                                                   000000000
sou2o+52             CALL???  opidrv+815           00000003C 000000004 012B9FB20
                                                   7FF7FC48580
opimai_real+131      CALL???  sou2o+52             000000000 012B9FC40
                                                   7FFFFF7F258 077EF4D1C
opimai+96            CALL???  opimai_real+131      7FF7FC48580 7FFFFF7E000
                                                   0001F0003 000000000
OracleThreadStart+6  CALL???  opimai+96            012B9FEF0 01289FF3C 012B9FCC0
40                                                 7FF7FC48580
0000000077D6B6DA     CALL???  OracleThreadStart+6  01289FF3C 000000000 000000000
                              40                   012B9FFA8

官方描述
The assert ORA-600: [kcrfr_update_nab_2] is a direct result of a lost write in the current on line log that we are attempting to resolve.So, this confirms the theory that this is a OS/hardware lost write issue not an internal oracle bug. In fact the assert ORA-600: [kcrfr_update_nab_2] is how we detect a lost log write.
Bug 5692594
Hdr: 5692594 10.2.0.1 RDBMS 10.2.0.1 RECOVERY PRODID-5 PORTID-226 ORA-600
Abstract: AFTER DATABASE CRASHED DOESN’T OPEN ORA-600 [KCRFR_UPDATE_NAB_2]
Status: 95,Closed, Vendor OS Problem
Bug 6655116
Hdr: 6655116 10.2.0.3 RDBMS 10.2.0.3 RECOVERY PRODID-5 PORTID-23
Abstract: INSTANCES CRASH WITH ORA-600 [KCRFR_UPDATE_NAB_2] AFTER DISK FAILURE
根据官方的描述,结合故障情况,基本上可以确定是由于硬件异常导致Oracle写丢失,从而除非oracle相关bug导致数据库无法正常启动

ORA-600 [kcrfr_update_nab_2] [a] [b]
VERSIONS:
versions 10.2 to 11.1
DESCRIPTION:
Failure of upgrade of recovery node (RN) enqueue to SSX mode
ARGUMENTS:
Arg [a] State Object for redo nab enqueue for resilvering
Arg [b] Redo nab enqueue mode
FUNCTIONALITY:
Kernel Cache Redo File Read
IMPACT:
INSTANCE FAILURE

处理方法
1.如果有备份,利用备份进行不完全恢复,跳过最后异常的redo,数据库resetlogs打开
2.如果没有备份,尝试使用历史的控制文件进行不完全恢复,或者直接跳过数据库一致性打开库.
3.互联网有人解决删除redo第二组成员数据库open成功(http://blog.itpub.net/16976507/viewspace-1266952/)
redo


存储精简卷导致asm磁盘组异常

有朋友在一个存储空间给asm使用,发生空间不足,然后使用另外一个存储中的lun给asm的data磁盘组增加asm disk,运行了大概1天之后,asm磁盘组直接dismount,数据库crash.然后就无法正常mount.包括这个存储上的几个其他磁盘组也无法正常mount.
数据库异常日志

Sun Oct 23 08:43:59 2016
SUCCESS: diskgroup DATA was dismounted
SUCCESS: diskgroup DATA was dismounted
Sun Oct 23 08:44:00 2016
Errors in file /oracle/app/oracle/diag/rdbms/orcl/orcl1/trace/orcl1_lmon_79128.trc:
ORA-00202: control file: '+DATA/orcl/controlfile/current.278.892363163'
ORA-15078: ASM diskgroup was forcibly dismounted
Sun Oct 23 08:44:00 2016
Errors in file /oracle/app/oracle/diag/rdbms/orcl/orcl1/trace/orcl1_lgwr_79174.trc:
ORA-00345: redo log write error block 15924 count 2
ORA-00312: online log 2 thread 1: '+DATA/orcl/onlinelog/group_2.274.892363167'
ORA-15078: ASM diskgroup was forcibly dismounted
ORA-15078: ASM diskgroup was forcibly dismounted
Errors in file /oracle/app/oracle/diag/rdbms/orcl/orcl1/trace/orcl1_lgwr_79174.trc:
ORA-00202: control file: '+DATA/orcl/controlfile/current.278.892363163'
ORA-15078: ASM diskgroup was forcibly dismounted
Errors in file /oracle/app/oracle/diag/rdbms/orcl/orcl1/trace/orcl1_lgwr_79174.trc:
ORA-00204: error in reading (block 1, # blocks 1) of control file
ORA-00202: control file: '+DATA/orcl/controlfile/current.278.892363163'
ORA-15078: ASM diskgroup was forcibly dismounted
Sun Oct 23 08:44:00 2016
LGWR (ospid: 79174): terminating the instance due to error 204
Sun Oct 23 08:44:00 2016
opiodr aborting process unknown ospid (79742) as a result of ORA-1092
Sun Oct 23 08:44:01 2016
ORA-1092 : opitsk aborting process
Sun Oct 23 08:44:01 2016
ORA-1092 : opitsk aborting process
System state dump requested by (instance=1, osid=79174 (LGWR)), summary=[abnormal instance termination].
System State dumped to trace file /oracle/app/oracle/diag/rdbms/orcl/orcl1/trace/orcl1_diag_79118.trc
Instance terminated by LGWR, pid = 79174

很明显,数据库异常是由于asm diskgroup dismount,因此分析asm 日志

asm 日志

Sun Oct 23 07:00:31 2016
Time drift detected. Please check VKTM trace file for more details.
Sun Oct 23 08:43:55 2016
Errors in file /oracle/app/oracle/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_8755.trc:
ORA-27061: waiting for async I/Os failed
Linux-x86_64 Error: 5: Input/output error
Additional information: -1
Additional information: 1048576
WARNING: Write Failed. group:1 disk:2 AU:1222738 offset:0 size:1048576
ERROR: failed to copy file +DATA.524, extent 15030
ERROR: ORA-15080 thrown in ARB0 for group number 1
Errors in file /oracle/app/oracle/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_8755.trc:
ORA-15080: synchronous I/O operation to a disk failed
Sun Oct 23 08:43:55 2016
NOTE: stopping process ARB0
NOTE: rebalance interrupted for group 1/0xec689cdd (DATA)
NOTE: ASM did background COD recovery for group 1/0xec689cdd (DATA)
NOTE: starting rebalance of group 1/0xec689cdd (DATA) at power 1
Starting background process ARB0
Sun Oct 23 08:43:56 2016
ARB0 started with pid=24, OS id=103554
NOTE: assigning ARB0 to group 1/0xec689cdd (DATA) with 1 parallel I/O
Errors in file /oracle/app/oracle/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_103554.trc:
ORA-27061: waiting for async I/Os failed
Linux-x86_64 Error: 5: Input/output error
Additional information: -1
Additional information: 1048576
WARNING: Write Failed. group:1 disk:2 AU:1222738 offset:0 size:1048576
ERROR: failed to copy file +DATA.256, extent 6570
ERROR: ORA-15080 thrown in ARB0 for group number 1
Errors in file /oracle/app/oracle/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_103554.trc:
ORA-15080: synchronous I/O operation to a disk failed
NOTE: stopping process ARB0
Sun Oct 23 08:43:58 2016
Errors in file /oracle/app/oracle/diag/asm/+asm/+ASM1/trace/+ASM1_dbw0_8521.trc:
ORA-27061: waiting for async I/Os failed
Linux-x86_64 Error: 5: Input/output error
Additional information: -1
Additional information: 4096
WARNING: Write Failed. group:1 disk:3 AU:6789 offset:24576 size:4096
NOTE: cache initiating offline of disk 3 group DATA
NOTE: process _dbw0_+asm1 (8521) initiating offline of disk 3.3915934787 (DATA_0003) with mask 0x7e in group 1
Sun Oct 23 08:43:58 2016
WARNING: Disk 3 (DATA_0003) in group 1 mode 0x7f is now being offlined
WARNING: Disk 3 (DATA_0003) in group 1 in mode 0x7f is now being taken offline on ASM inst 1
NOTE: initiating PST update: grp = 1, dsk = 3/0xe9686c43, mask = 0x6a, op = clear
GMON updating disk modes for group 1 at 14 for pid 14, osid 8521
ERROR: Disk 3 cannot be offlined, since diskgroup has external redundancy.
ERROR: too many offline disks in PST (grp 1)
Sun Oct 23 08:43:58 2016
NOTE: cache dismounting (not clean) group 1/0xEC689CDD (DATA)
NOTE: messaging CKPT to quiesce pins Unix process pid: 103577, image: oracle@node1 (B000)
WARNING: Disk 3 (DATA_0003) in group 1 mode 0x7f offline is being aborted
WARNING: Offline of disk 3 (DATA_0003) in group 1 and mode 0x7f failed on ASM inst 1
NOTE: halting all I/Os to diskgroup 1 (DATA)
Sun Oct 23 08:43:59 2016
NOTE: LGWR doing non-clean dismount of group 1 (DATA)
NOTE: LGWR sync ABA=160.10145 last written ABA 160.10145

错误信息很明显,由于Write Failed导致asm diskgroup dismount.

系统日志

Oct 23 08:43:55 node1 kernel: sd 6:0:12:1: [sdd]  Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Oct 23 08:43:55 node1 kernel: sd 6:0:12:1: [sdd]  Sense Key : Data Protect [current]
Oct 23 08:43:55 node1 kernel: sd 6:0:12:1: [sdd]  Add. Sense: Space allocation failed write protect
Oct 23 08:43:55 node1 kernel: sd 6:0:12:1: [sdd] CDB: Write(16): 8a 00 00 00 00 02 e7 18 37 f9 00 00 00 07 00 00
Oct 23 08:43:55 node1 kernel: end_request: critical space allocation error, dev sdd, sector 12467058681
Oct 23 08:43:55 node1 kernel: end_request: critical space allocation error, dev dm-3, sector 12467058681
Oct 23 08:43:55 node1 kernel: sd 8:0:6:1: [sdh]  Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Oct 23 08:43:55 node1 kernel: sd 8:0:6:1: [sdh]  Sense Key : Data Protect [current]
Oct 23 08:43:55 node1 kernel: sd 8:0:6:1: [sdh]  Add. Sense: Space allocation failed write protect
Oct 23 08:43:55 node1 kernel: sd 8:0:6:1: [sdh] CDB: Write(16): 8a 00 00 00 00 02 e7 18
Oct 23 08:43:55 node1 kernel: sd 6:0:4:1: [sdb]  Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Oct 23 08:43:55 node1 kernel: sd 6:0:4:1: [sdb]  Sense Key : Data Protect [current]
Oct 23 08:43:55 node1 kernel: sd 6:0:4:1: [sdb]   33Add. Sense: Space allocation failed write protect
Oct 23 08:43:55 node1 kernel: sd 6:0:4:1: [sdb] CDB: Write(16): 8a 00 00 00 00 02 e7 18 30 00 00 00 03 f9 00 00
Oct 23 08:43:55 node1 kernel: end_request: critical space allocation error, dev sdb, sector 12467056640
Oct 23 08:43:55 node1 kernel: f9 00 00 04 00
Oct 23 08:43:55 node1 kernel: end_request: critical space allocation error, dev dm-3, sector 12467056640
Oct 23 08:43:55 node1 kernel: 00 00
Oct 23 08:43:55 node1 kernel: end_request: critical space allocation error, dev sdh, sector 12467057657
Oct 23 08:43:55 node1 kernel: end_request: critical space allocation error, dev dm-3, sector 12467057657
Oct 23 08:43:57 node1 kernel: sd 8:0:6:1: [sdh]  Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Oct 23 08:43:57 node1 kernel: sd 8:0:6:1: [sdh]  Sense Key : Data Protect [current]
Oct 23 08:43:57 node1 kernel: sd 8:0:6:1: [sdh]  Add. Sense: Space allocation failed write protect
Oct 23 08:43:57 node1 kernel: sd 8:0:6:1: [sdh] CDB: Write(16): 8a 00 00 00 00 02 e7 18 37 f9 00 00 00 07 00 00
Oct 23 08:43:57 node1 kernel: end_request: critical space allocation error, dev sdh, sector 12467058681
Oct 23 08:43:57 node1 kernel: end_request: critical space allocation error, dev dm-3, sector 12467058681
Oct 23 08:43:57 node1 kernel: sd 8:0:12:1: [sdj]  Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Oct 23 08:43:57 node1 kernel: sd 8:0:12:1: [sdj]  Sense Key : Data Protect [current]
Oct 23 08:43:57 node1 kernel: sd 8:0:12:1: [sdj]  Add. Sense: Space allocation failed write protect
Oct 23 08:43:57 node1 kernel: sd 8:0:12:1: [sdj] CDB: Write(16): 8a 00 00 00 00 02 e7 18 30 00 00 00 03 f9 00 00
Oct 23 08:43:57 node1 kernel: end_request: critical space allocation error, dev sdj, sector 12467056640
Oct 23 08:43:57 node1 kernel: end_request: critical space allocation error, dev dm-3, sector 12467056640
Oct 23 08:43:57 node1 kernel: sd 6:0:4:1: [sdb]  Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Oct 23 08:43:57 node1 kernel: sd 6:0:4:1: [sdb]  Sense Key : Data Protect [current]
Oct 23 08:43:57 node1 kernel: sd 6:0:4:1: [sdb]  Add. Sense: Space allocation failed write protect
Oct 23 08:43:57 node1 kernel: sd 6:0:4:1: [sdb] CDB: Write(16): 8a 00 00 00 00 02 e7 18 33 f9 00 00 04 00 00 00
Oct 23 08:43:58 node1 kernel: sd 6:0:4:1: [sdb]  Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Oct 23 08:43:58 node1 kernel: sd 6:0:4:1: [sdb]  Sense Key : Data Protect [current]
Oct 23 08:43:58 node1 kernel: sd 6:0:4:1: [sdb]  Add. Sense: Space allocation failed write protect
Oct 23 08:43:58 node1 kernel: sd 6:0:4:1: [sdb] CDB: Write(16): 8a 00 00 00 00 03 3b 7e 78 30 00 00 00 08 00 00
Oct 23 10:50:59 node1 init: oracle-ohasd main process (6150) killed by TERM signal

错误信息为:critical space allocation error,严重空间分配错误.也就是linux在分配空间之时发生错误.在换而言之,由于分配空间错误导致asm 磁盘组dismount.

查看多路径信息

[root@node1 ~]# multipath -ll
36000d31003190c000000000000000003 dm-3 COMPELNT,Compellent Vol
size=80T features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
  |- 6:0:9:1  sdd 8:48  active ready running
  `- 8:0:9:1  sdi 8:128 active ready running
delldisk2 (36000d310031908000000000000000003) dm-4 COMPELNT,Compellent Vol
size=8.0T features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
  |- 6:0:12:1 sde 8:64  active ready running
  |- 8:0:6:1  sdh 8:112 active ready running
  |- 6:0:4:1  sdb 8:16  active ready running
  `- 8:0:12:1 sdj 8:144 active ready running
delldisk1 (36000d31003190a000000000000000007) dm-2 COMPELNT,Compellent Vol
size=12T features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
  |- 6:0:1:1  sda 8:0   active ready running
  |- 8:0:2:1  sdf 8:80  active ready running
  |- 6:0:7:1  sdc 8:32  active ready running
  `- 8:0:3:1  sdg 8:96  active ready running

很明显报错的都是同一个lun(delldisk2),也就是存储空间使用完的存储.也就是说,由于delldisk2存储的空间使用尽了导致系统出现分配空间错误,从而导致asm 写失败,进而导致数据库异常.这种问题的本质其实就是存储给系统分配了8T,但是实际存储可以使用的空间不足8T,而os按照8T来使用从而出现该问题.专业名字叫做”存储精简卷”.因此各位在存储配置之时需要注意该问题.因为这种情况的出现一般只是写io异常,读依旧正常,因此不会丢失数据.

ORA-20011 KUP-11024错误分析

数据库alert日志出现ORA-20011 KUP-11024等错误

Thu Sep 22 18:00:31 2016
DBMS_STATS: GATHER_STATS_JOB encountered errors.  Check the trace file.
Errors in file /u1/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_j002_2686.trc:
ORA-20011: Approximate NDV failed: ORA-29913: error in executing ODCIEXTTABLEOPEN callout
KUP-11024: This external table can only be accessed from within a Data Pump job.

从报错的信息看应该是数据库收集统计信息报错(GATHER_STATS_JOB),但是报错原因是由于访问外部表导致,而该外部表很可能和data pump有关系.

查看trace日志

[oracle@xifenfei]$ more /u1/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_j002_2686.trc
Trace file /u1/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_j002_2686.trc
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
ORACLE_HOME = /u1/oracle/pruduct/11.2.0.3
System name:    Linux
Node name:      xifenfei
Release:        2.6.32-220.el6.x86_64
Version:        #1 SMP Wed Nov 9 08:03:13 EST 2011
Machine:        x86_64
Instance name: xifenfei
Redo thread mounted by this instance: 1
Oracle process number: 356
Unix process pid: 2686, image: oracle@xifenfei (J002)
*** 2016-09-22 18:00:31.939
*** SESSION ID:(835.16363) 2016-09-22 18:00:31.939
*** CLIENT ID:() 2016-09-22 18:00:31.939
*** SERVICE NAME:(SYS$USERS) 2016-09-22 18:00:31.939
*** MODULE NAME:(DBMS_SCHEDULER) 2016-09-22 18:00:31.939
*** ACTION NAME:(ORA$AT_OS_OPT_SY_10669) 2016-09-22 18:00:31.939
ORA-20011: Approximate NDV failed: ORA-29913: error in executing ODCIEXTTABLEOPEN callout
KUP-11024: This external table can only be accessed from within a Data Pump job.
*** 2016-09-22 18:00:31.939
DBMS_STATS: GATHER_STATS_JOB: GATHER_TABLE_STATS('"DWDBA"','"ET$012D00070001"','""', ...)
DBMS_STATS: ORA-20011: Approximate NDV failed: ORA-29913: error in executing ODCIEXTTABLEOPEN callout
KUP-11024: This external table can only be accessed from within a Data Pump job.
*** 2016-09-22 18:00:31.960
DBMS_STATS: GATHER_STATS_JOB: GATHER_TABLE_STATS('"DWDBA"','"ET$01D10D4F0001"','""', ...)
DBMS_STATS: ORA-20011: Approximate NDV failed: ORA-29913: error in executing ODCIEXTTABLEOPEN callout
KUP-11024: This external table can only be accessed from within a Data Pump job.

通过trace文件,我们已经可以明确是由于数据库对DWDBA.ET$012D00070001和DWDBA.ET$01D10D4F0001这两个表收集统计信息时候报的上述alert日志中看到的错误.

查询数据库记录

SYS@xifenfei>select OWNER,OBJECT_NAME,OBJECT_TYPE, status,
  2  to_char(CREATED,'dd-mon-yyyy hh24:mi:ss') created
  3  ,to_char(LAST_DDL_TIME , 'dd-mon-yyyy hh24:mi:ss') last_ddl_time
  4  from dba_objects
  5  where object_name like 'ET$%'
  6  /
OWNER     OBJECT_NAME      OBJECT_TYPE  STATUS  CREATED               LAST_DDL_TIME
--------- ---------------- ------------ ------- ------------------------- ----------------
DWDBA     ET$012D00070001  TABLE        VALID   10-mar-2016 16:32:25  10-mar-2016 16:32:25
DWDBA     ET$01D10D4F0001  TABLE        VALID   10-mar-2016 17:29:29  10-mar-2016 17:29:29
SYS@xifenfei> select owner, TABLE_NAME, DEFAULT_DIRECTORY_NAME, ACCESS_TYPE
  2  from dba_external_tables
  3  order by 1,2
  4  /
OWNER       TABLE_NAME                     DEFAULT_DIRECTORY_NAME         ACCESS_
----------- ------------------------------ ------------------------------ -------
DWDBA       ET$012D00070001                EXP_FILE_DIR                   CLOB
DWDBA       ET$01D10D4F0001                EXP_FILE_DIR                   CLOB

到这一步,我们已经完全清楚,ET$012D00070001和ET$01D10D4F0001是两个外部表,由于他们的存在使得收集统计信息异常。

分析ET$012D00070001表

SYS@xifenfei>desc DWDBA.ET$012D00070001
 Name                                                  Null?    Type
 ----------------------------------------------------- -------- ------------------------------------
 STORE_NO                                                       NUMBER(3)
 ITEM_NO                                                        NUMBER(6)
 WORK_DATE                                                      DATE
 DIVISION_NO                                                    NUMBER(2)
 SECTION_NO                                                     NUMBER(3)
 SUP_NO                                                         NUMBER(6)
 GRP_NO                                                         NUMBER(3)
 SUBGRP_NO                                                      NUMBER(3)
 USR                                                            VARCHAR2(30)
 TYPE                                                           NUMBER(1)
 ACTIVE_SELL_PRICE                                              NUMBER(8,2)
 SELL_PRICE                                                     NUMBER(8,2)
 SELL_VAT                                                       NUMBER(1)
 BUY_PRICE                                                      NUMBER(10,4)
 BUY_VAT                                                        NUMBER(1)
 PROMOTION_NO                                                   NUMBER(10)
 PROM_CLASS                                                     VARCHAR2(1)
 PROM_LEVEL                                                     NUMBER(1)
 STOCK                                                          NUMBER(10,3)
 STOCK_ADJ                                                      NUMBER(10,3)
 RECPT                                                          NUMBER(10,3)
 SALES                                                          NUMBER(10,3)
 STOCK_ADJ_AMNT                                                 NUMBER(10,2)
 RECPT_AMNT                                                     NUMBER(10,2)
 SALES_AMNT                                                     NUMBER(10,2)
 PROF_AMNT                                                      NUMBER(10,2)
 COST_CHANGE                                                    NUMBER(10,2)
 DISC                                                           NUMBER(10,3)
 RTN_QTY                                                        NUMBER(9,3)
 DISC_AMNT                                                      NUMBER(10,2)
 RTN_AMNT                                                       NUMBER(10,2)
 LOSS_AMNT                                                      NUMBER(10,2)
 CREATED_DATE                                                   DATE
 COST                                                           NUMBER(10,4)
 NBR_PK                                                         NUMBER(5)
 NBR_VISIT                                                      NUMBER(5)
 NBR_PK_LINE                                                    NUMBER(5)
 N_N_PROF_AMNT                                                  NUMBER(9,2)
 CON_FORE                                                       NUMBER(10,2)
 CON_FORE_OTH                                                   NUMBER(10,2)
 SALES_B                                                        NUMBER(10,3)
 SALES_AMNT_B                                                   NUMBER(10,2)
SYS@xifenfei>select count(*) from DWDBA.ET$012D00070001;
select count(*) from DWDBA.ET$012D00070001
*
ERROR at line 1:
ORA-29913: error in executing ODCIEXTTABLEOPEN callout
KUP-11024: This external table can only be accessed from within a Data Pump job.

通过对ET$012D00070001表查询重新了alert日志一样的错误,可以明确定位问题就是由于该外部表异常导致.通过查询mos,确定Bug 10327346 DBMS_WORKLOAD_CAPTURE does not drop external tables (causing ORA-20011 from DBMS_STATS)可能导致DBMS_WORKLOAD_CAPTURE无法正常清理掉Data pump的外部表导致出现Datapump出现孤立的外部表对象,从而出现该问题.解决方案就是直接drop 相关外部表.也就是这里的(ET$012D00070001和ET$01D10D4F0001)