Host power loss sends the system clock back N years and the database reports ORA-600 [kcm_headroom_warn_1]

A power outage on the host rolled the system time back 14 years, and the database then failed to open with ORA-600 [kcm_headroom_warn_1].

Sat Jun 21 17:49:12 2014   ---normal system time
Instance shutdown complete
Mon Aug 07 06:13:28 2000   ---system time after the restart
Starting ORACLE instance (normal)
LICENSE_MAX_SESSION = 0
LICENSE_SESSIONS_WARNING = 0
Initial number of CPU is 64
Effective number of CPU for internal database sizing is 32
Number of processor cores in the system is 8
Number of processor sockets in the system is 1
CELL communication is configured to use 0 interface(s):
CELL IP affinity details:
    NUMA status: non-NUMA system
    cellaffinity.ora status: N/A
CELL communication will use 1 IP group(s):
    Grp 0:
Picked latch-free SCN scheme 3
Autotune of undo retention is turned on.
IMODE=BR
ILAT =264
LICENSE_MAX_USERS = 0
SYS auditing is disabled
Starting up:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options.
ORACLE_HOME = /ora1/prod/db/tech_st/11.2.0
System name:	SunOS
Node name:	erpdb1-boot
Release:	5.10
Version:	Generic_147147-26
Machine:	sun4v
Using parameter settings in server-side spfile /ora1/prod/db/tech_st/11.2.0/dbs/spfileprod.ora
System parameters with non-default values:
  processes                = 1200
  sessions                 = 2400
  timed_statistics         = TRUE
  event                    = ""
  shared_pool_size         = 448M
  shared_pool_reserved_size= 322122547
  nls_language             = "american"
  nls_territory            = "america"
  nls_sort                 = "binary"
  nls_date_format          = "DD-MON-RR"
  nls_numeric_characters   = ".,"
  nls_comp                 = "binary"
  nls_length_semantics     = "BYTE"
  sga_target               = 10G
  control_files            = "/data1/prod/db/apps_st/data/cntrl01.dbf"
  control_files            = "/data1/prod/db/apps_st/data/cntrl02.dbf"
  control_files            = "/data1/prod/db/apps_st/data/cntrl03.dbf"
  db_block_checksum        = "TRUE"
  db_block_size            = 8192
  compatible               = "11.1.0"
  log_archive_dest_1       = "location=/arch1/prod/arch"
  log_archive_format       = "prod_%t_%s_%r.arc"
  log_buffer               = 10485760
  log_checkpoint_interval  = 100000
  log_checkpoint_timeout   = 1200
  db_files                 = 512
  log_checkpoints_to_alert = TRUE
  dml_locks                = 10000
  undo_management          = "AUTO"
  undo_tablespace          = "APPS_UNDOTS1"
  db_block_checking        = "FALSE"
  _disable_fast_validate   = TRUE
  sec_case_sensitive_logon = FALSE
  session_cached_cursors   = 500
  utl_file_dir             = "/usr/tmp"
  plsql_code_type          = "INTERPRETED"
  plsql_optimize_level     = 2
  job_queue_processes      = 10
  _system_trig_enabled     = TRUE
  cursor_sharing           = "EXACT"
  parallel_min_servers     = 0
  parallel_max_servers     = 8
  audit_file_dest          = "/ora1/prod/db/tech_st/admin/prod/adump"
  db_name                  = "prod"
  open_cursors             = 3600
  _sort_elimination_cost_ratio= 5
  _b_tree_bitmap_plans     = FALSE
  _fast_full_scan_enabled  = FALSE
  query_rewrite_enabled    = "true"
  _like_with_bind_as_equality= TRUE
  pga_aggregate_target     = 2G
  workarea_size_policy     = "AUTO"
  _optimizer_autostats_job = FALSE
  optimizer_secure_view_merging= FALSE
  aq_tm_processes          = 4
  olap_page_pool_size      = 4M
  diagnostic_dest          = "/ora1/prod/db/tech_st/11.2.0/admin/prod_erpdb1"
  _trace_files_public      = TRUE
  max_dump_file_size       = "20480"
Mon Aug 07 06:13:30 2000
PMON started with pid=2, OS id=3556
Mon Aug 07 06:13:30 2000
PSP0 started with pid=3, OS id=3557
Mon Aug 07 06:13:31 2000
VKTM started with pid=4, OS id=3558 at elevated priority
VKTM running at (10)millisec precision with DBRM quantum (100)ms
Mon Aug 07 06:13:31 2000
GEN0 started with pid=5, OS id=3562
Mon Aug 07 06:13:32 2000
DIAG started with pid=6, OS id=3564
Mon Aug 07 06:13:32 2000
DBRM started with pid=7, OS id=3565
Mon Aug 07 06:13:32 2000
DIA0 started with pid=8, OS id=3566
Mon Aug 07 06:13:32 2000
MMAN started with pid=9, OS id=3567
Mon Aug 07 06:13:32 2000
DBW0 started with pid=10, OS id=3568
Mon Aug 07 06:13:32 2000
DBW1 started with pid=11, OS id=3569
Mon Aug 07 06:13:32 2000
DBW2 started with pid=12, OS id=3570
Mon Aug 07 06:13:32 2000
DBW3 started with pid=13, OS id=3571
Mon Aug 07 06:13:32 2000
LGWR started with pid=14, OS id=3572 at elevated priority
Mon Aug 07 06:13:32 2000
CKPT started with pid=15, OS id=3575
Mon Aug 07 06:13:32 2000
SMON started with pid=16, OS id=3576
Mon Aug 07 06:13:32 2000
RECO started with pid=17, OS id=3577
Mon Aug 07 06:13:32 2000
MMON started with pid=18, OS id=3578
Mon Aug 07 06:13:32 2000
MMNL started with pid=19, OS id=3579
ORACLE_BASE not set in environment. It is recommended
that ORACLE_BASE be set in the environment
Reusing ORACLE_BASE from an earlier startup = /ora1/prod/db/tech_st
Mon Aug 07 06:13:32 2000
ALTER DATABASE   MOUNT
Successful mount of redo thread 1, with mount id 4111810188
Database mounted in Exclusive Mode
Lost write protection disabled
Completed: ALTER DATABASE   MOUNT
Mon Aug 07 06:13:36 2000
ALTER DATABASE OPEN
************************************************************
Warning: The SCN headroom for this database is only -51464 hours!
************************************************************
Errors in file /ora1/prod/db/tech_st/11.2.0/admin/prod_erpdb1/diag/rdbms/prod/prod/trace/prod_ora_3583.trc  (incident=441878):
ORA-00600: internal error code, arguments: [kcm_headroom_warn_1], [], [], [], [], [], [], [], [], [], [], []
Incident details in: /ora1/prod/db/tech_st/11.2.0/admin/prod_erpdb1/diag/rdbms/prod/prod/incident/incdir_441878/prod_ora_3583_i441878.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file /ora1/prod/db/tech_st/11.2.0/admin/prod_erpdb1/diag/rdbms/prod/prod/trace/prod_ora_3583.trc:
ORA-00600: internal error code, arguments: [kcm_headroom_warn_1], [], [], [], [], [], [], [], [], [], [], []
ORA-600 signalled during: ALTER DATABASE OPEN...
Dumping diagnostic data in directory=[cdmp_20000807061339], requested by (instance=1, osid=3583), summary=[incident=441878].
Mon Aug 07 06:14:35 2000
Sweep [inc][441878]: completed
Sweep [inc2][441878]: completed
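
The negative headroom in the warning is what you would expect when the clock jumps back while the SCN stays where it was. The following is a rough sketch only. It assumes the widely circulated approximation of the SCN soft limit (16,384 SCNs per second since 1988-01-01, counted with 31-day months), which does not appear in the log above, and `max_reasonable_scn` and `headroom_hours` are hypothetical helper names.

```python
from datetime import datetime

SCN_PER_SECOND = 16 * 1024  # assumed soft-limit growth rate (not from the log)


def max_reasonable_scn(now: datetime) -> int:
    """Approximate SCN soft limit at `now`, using 31-day months since 1988."""
    seconds = (
        (now.year - 1988) * 12 * 31 * 86400
        + (now.month - 1) * 31 * 86400
        + (now.day - 1) * 86400
        + now.hour * 3600
        + now.minute * 60
        + now.second
    )
    return seconds * SCN_PER_SECOND


def headroom_hours(now: datetime, current_scn: int) -> float:
    """Hours of SCN headroom; negative when the SCN is ahead of the clock."""
    return (max_reasonable_scn(now) - current_scn) / (SCN_PER_SECOND * 3600)
```

Under this model, any database whose current SCN is larger than `max_reasonable_scn(now)` gets a negative result. A clock that reads the year 2000 while the SCN was generated up to 2014 can produce exactly that.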

When a database refuses to start for no obvious reason, check the host time. A related case is described in: Resolving an ORA-00600 [2252] failure
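
A quick way to spot this kind of problem is to scan the alert log for timestamps that move backwards. The sketch below assumes the 11g-style timestamp lines shown in the log above and an English locale for day and month names. `find_clock_rollbacks` is a hypothetical helper name.

```python
from datetime import datetime
from typing import Iterable, List, Tuple

ALERT_TS_FORMAT = "%a %b %d %H:%M:%S %Y"  # e.g. "Sat Jun 21 17:49:12 2014"


def find_clock_rollbacks(lines: Iterable[str]) -> List[Tuple[datetime, datetime]]:
    """Return (previous, current) pairs where the log time moves backwards."""
    rollbacks = []
    previous = None
    for line in lines:
        # A timestamp occupies the first 24 characters of the line; anything
        # after that (such as a trailing annotation) is ignored.
        try:
            current = datetime.strptime(line[:24].strip(), ALERT_TS_FORMAT)
        except ValueError:
            continue  # not a timestamp line
        if previous is not None and current < previous:
            rollbacks.append((previous, current))
        previous = current
    return rollbacks


sample = [
    "Sat Jun 21 17:49:12 2014",
    "Instance shutdown complete",
    "Mon Aug 07 06:13:28 2000",
    "Starting ORACLE instance (normal)",
]
for before, after in find_clock_rollbacks(sample):
    print(f"clock went back from {before} to {after}")
```

Run against the full alert log, any hit is a reason to fix the host clock before attempting to open the database.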

ORA-01157: cannot identify/lock data file prevents a tablespace from coming online

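A friend told me about a case. A tablespace originally had one datafile, which already held a large amount of data, and he then added 7 more datafiles. Realizing he had added too many, he wanted to remove them, but he was not very familiar with the database: he simply took the tablespace offline and removed the files with rm. After that the tablespace could not be brought online. I simulated the scenario and tested recovery with bbed and dul.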

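Simulating the scenario
The tablespace has one datafile that contains data. A new datafile is added, the tablespace is taken offline, and the new datafile is removed with rm.
After the tablespace is taken offline, its existing datafiles also go offline.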

SQL> create tablespace xifenfei datafile
  2  '/u01/oracle/ora9i/oradata/xifenfei01.dbf' size 10M autoextend on next 16M;
Tablespace created.
SQL> create table chf.t_xifenfei01 tablespace xifenfei
  2  as select * from dba_objects;
Table created.
SQL> create table chf.t_xifenfei02 tablespace xifenfei
  2  as select * from dba_tables;
Table created.
SQL> SELECT COUNT(*) FROM CHF.T_XIFENFEI01;
  COUNT(*)
----------
     30758
SQL> C/01/02
  1* SELECT COUNT(*) FROM CHF.T_XIFENFEI02
SQL> /
  COUNT(*)
----------
       865
SQL> select tablespace_name,segment_name from dba_segments
  2  where segment_name like 'T_XIFENFEI%';
TABLESPACE_NAME                SEGMENT_NAME
------------------------------ -------------------------
XIFENFEI                       T_XIFENFEI01
XIFENFEI                       T_XIFENFEI02
SQL> ALTER tablespace xifenfei add datafile
  2  '/u01/oracle/ora9i/oradata/xifenfei02.dbf' size 16m;
Tablespace altered.
SQL> alter tablespace xifenfei offline;
Tablespace altered.
SQL> select file#,status$ from file$;
     FILE#    STATUS$
---------- ----------
         1          2
         2          2
         3          2
         4          2
         5          2
         6          2
         7          2
         8          2
         9          2
        10          2
        11          2
        12          2
12 rows selected.
SQL> set pages 1000
SQL> /
     FILE#    STATUS$
---------- ----------
         1          2
         2          2
         3          2
         4          2
         5          2
         6          2
         7          2
         8          2
         9          2
        10          2
        11          2
        12          2
12 rows selected.
SQL> select file#,status from v$datafile;
     FILE# STATUS
---------- -------
         1 SYSTEM
         2 ONLINE
         3 ONLINE
         4 ONLINE
         5 ONLINE
         6 ONLINE
         7 ONLINE
         8 ONLINE
         9 ONLINE
        10 ONLINE
        11 OFFLINE
        12 OFFLINE
12 rows selected.
SQL> select file#,status from v$datafile_header;
     FILE# STATUS
---------- -------
         1 ONLINE
         2 ONLINE
         3 ONLINE
         4 ONLINE
         5 ONLINE
         6 ONLINE
         7 ONLINE
         8 ONLINE
         9 ONLINE
        10 ONLINE
        11 OFFLINE
        12 OFFLINE
12 rows selected.
SQL> select tablespace_name,status from dba_tablespaces;
TABLESPACE_NAME                STATUS
------------------------------ ---------
SYSTEM                         ONLINE
UNDOTBS1                       ONLINE
TEMP                           ONLINE
CWMLITE                        ONLINE
DRSYS                          ONLINE
EXAMPLE                        ONLINE
INDX                           ONLINE
ODM                            ONLINE
TOOLS                          ONLINE
USERS                          ONLINE
XDB                            ONLINE
XIFENFEI                       OFFLINE
12 rows selected.
SQL> select ts#,name,online$ from ts$;
       TS# NAME                              ONLINE$
---------- ------------------------------ ----------
         0 SYSTEM                                  1
         1 UNDOTBS1                                1
         2 TEMP                                    1
         3 CWMLITE                                 1
         4 DRSYS                                   1
         5 EXAMPLE                                 1
         6 INDX                                    1
         7 ODM                                     1
         8 TOOLS                                   1
         9 USERS                                   1
        10 XDB                                     1
        11 UNDOTBS2                                3
        12 XIFENFEI                                2
13 rows selected.
SQL> col name for a50
SQL> select file#,name from v$datafile;
     FILE# NAME
---------- --------------------------------------------------
         1 /u01/oracle/ora9i/oradata/ora9i/system01.dbf
         2 /u01/oracle/ora9i/oradata/ora9i/undotbs01.dbf
         3 /u01/oracle/ora9i/oradata/ora9i/cwmlite01.dbf
         4 /u01/oracle/ora9i/oradata/ora9i/drsys01.dbf
         5 /u01/oracle/ora9i/oradata/ora9i/example01.dbf
         6 /u01/oracle/ora9i/oradata/ora9i/indx01.dbf
         7 /u01/oracle/ora9i/oradata/ora9i/odm01.dbf
         8 /u01/oracle/ora9i/oradata/ora9i/tools01.dbf
         9 /u01/oracle/ora9i/oradata/ora9i/users01.dbf
        10 /u01/oracle/ora9i/oradata/ora9i/xdb01.dbf
        11 /u01/oracle/ora9i/oradata/xifenfei01.dbf
        12 /u01/oracle/ora9i/oradata/xifenfei02.dbf
12 rows selected.
SQL> !rm /u01/oracle/ora9i/oradata/xifenfei02.dbf
SQL> !ls -l /u01/oracle/ora9i/oradata/xifenfei*
-rw-r-----  1 ora9i oinstall 10493952 Dec 25 20:19 /u01/oracle/ora9i/oradata/xifenfei01.dbf
SQL> alter tablespace xifenfei online;
alter tablespace xifenfei online
*
ERROR at line 1:
ORA-01157: cannot identify/lock data file 12 - see DBWR trace file
ORA-01110: data file 12: '/u01/oracle/ora9i/oradata/xifenfei02.dbf'

The failure is now reproduced: because the datafile is missing, bringing the tablespace online fails with ORA-01157: cannot identify/lock data file.
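
The queries above print raw numeric codes for `file$.status$` and `ts$.online$`. Here is a small sketch for reading them, assuming the mappings I recall from the `DBA_DATA_FILES` and `DBA_TABLESPACES` view definitions. The mappings are an assumption and are not shown in the transcript. `decode_ts` is a hypothetical helper name.

```python
# Assumed code-to-name mappings (from memory of the dictionary view
# definitions, not taken from the transcript above).
FILE_STATUS = {1: "INVALID", 2: "AVAILABLE"}
TS_ONLINE = {1: "ONLINE", 2: "OFFLINE", 3: "INVALID (dropped)", 4: "READ ONLY"}


def decode_ts(rows):
    """Map (ts#, name, online$) rows to (name, status) pairs."""
    return [(name, TS_ONLINE.get(code, "UNDEFINED")) for _, name, code in rows]


rows = [(10, "XDB", 1), (11, "UNDOTBS2", 3), (12, "XIFENFEI", 2)]
for name, status in decode_ts(rows):
    print(f"{name:<10} {status}")
```

Read this way, UNDOTBS2 would be a dropped tablespace, which would explain why `ts$` returns 13 rows while `dba_tablespaces` lists 12. It would also mean `file$.status$` stays at 2 for files 11 and 12, because the online/offline state is tracked in the controlfile views rather than in `file$`.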

Bringing the surviving datafile online
Even after the surviving datafile is brought online, the tablespace remains offline.

SQL> alter database datafile 11 online;
Database altered.
SQL> select file#,status from v$datafile_header;
     FILE# STATUS
---------- -------
         1 ONLINE
         2 ONLINE
         3 ONLINE
         4 ONLINE
         5 ONLINE
         6 ONLINE
         7 ONLINE
         8 ONLINE
         9 ONLINE
        10 ONLINE
        11 ONLINE
        12 OFFLINE
12 rows selected.
SQL> select file#,status from v$datafile;
     FILE# STATUS
---------- -------
         1 SYSTEM
         2 ONLINE
         3 ONLINE
         4 ONLINE
         5 ONLINE
         6 ONLINE
         7 ONLINE
         8 ONLINE
         9 ONLINE
        10 ONLINE
        11 ONLINE
        12 OFFLINE
12 rows selected.
SQL> select tablespace_name,status from dba_tablespaces;
TABLESPACE_NAME                STATUS
------------------------------ ---------
SYSTEM                         ONLINE
UNDOTBS1                       ONLINE
TEMP                           ONLINE
CWMLITE                        ONLINE
DRSYS                          ONLINE
EXAMPLE                        ONLINE
INDX                           ONLINE
ODM                            ONLINE
TOOLS                          ONLINE
USERS                          ONLINE
XDB                            ONLINE
XIFENFEI                       OFFLINE
12 rows selected.
SQL> select ts#,name,online$ from ts$;
       TS# NAME                                                  ONLINE$
---------- -------------------------------------------------- ----------
         0 SYSTEM                                                      1
         1 UNDOTBS1                                                    1
         2 TEMP                                                        1
         3 CWMLITE                                                     1
         4 DRSYS                                                       1
         5 EXAMPLE                                                     1
         6 INDX                                                        1
         7 ODM                                                         1
         8 TOOLS                                                       1
         9 USERS                                                       1
        10 XDB                                                         1
        11 UNDOTBS2                                                    3
        12 XIFENFEI                                                    2
13 rows selected.
SQL> create table chf.t_1 tablespace xifenfei
  2  as select * from dual;
as select * from dual
                 *
ERROR at line 2:
ORA-01542: tablespace 'XIFENFEI' is offline, cannot allocate space in it

Because the tablespace cannot be brought online, allocating space in it fails with ORA-01542.

Using bbed to build a datafile that fools the database

[ora9i@xifenfei tmp]$ dd if=/dev/zero of=/u01/oracle/ora9i/oradata/xifenfei02.dbf
> bs=8192 count=2049
2049+0 records in
2049+0 records out
BBED> copy file 11 block 1 to file 12 block 1
 File: /u01/oracle/ora9i/oradata/xifenfei02.dbf (12)
 Block: 1                Offsets:    0 to  511           Dba:0x03000001
------------------------------------------------------------------------
 0b020000 0100c002 00000000 00000104 1deb0000 00002009 00000008 329bf29e
 4f524139 49000000 11010000 00050000 00200000 0b000300 00000000 00000000
 <32 bytes per line>
BBED> set count 32
        COUNT           32
BBED> m /x 0c
Warning: contents of previous BIFILE will be lost. Proceed? (Y/N) y
 File: /u01/oracle/ora9i/oradata/xifenfei02.dbf (12)
 Block: 1                Offsets:  280 to  311           Dba:0x03000001
------------------------------------------------------------------------
 0c000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
 <32 bytes per line>
BBED> m /x 0c
 File: /u01/oracle/ora9i/oradata/xifenfei02.dbf (12)
 Block: 1                Offsets:   52 to   83           Dba:0x03000001
------------------------------------------------------------------------
 0c000300 00000000 00000000 00000000 00000000 00000000 00000000 00000000
 <32 bytes per line>
BBED> sum apply
Check value for File 12, Block 1:
current = 0xad21, required = 0xad21
SQL> alter tablespace xifenfei online;
alter tablespace xifenfei online
*
ERROR at line 1:
ORA-01122: database file 12 failed verification check
ORA-01110: data file 12: '/u01/oracle/ora9i/oradata/xifenfei02.dbf'
ORA-01251: Unknown File Header Version read for file number 12
SQL> !oerr ora 01251
01251, 00000, "Unknown File Header Version read for file number %s"
// *Cause:  Read of the file header returned a record but its version cannot
//         be identified.  Either the header has been corrupted, or the file
//         is not a valid database file.
// *Action: Have the operating system make the correct file available to
//         the database, or recover the file.
BBED> m /x 01000003
Warning: contents of previous BIFILE will be lost. Proceed? (Y/N) y
 File: /u01/oracle/ora9i/oradata/xifenfei02.dbf (12)
 Block: 1                Offsets:    4 to  36           Dba:0x03000001
------------------------------------------------------------------------
 01000003 00000000 00000104 21ad0000 00002009 00000008 329bf29e 4f524139
 <32 bytes per line>
BBED> sum apply
Check value for File 12, Block 1:
current = 0xace1, required = 0xace1
SQL> alter tablespace xifenfei online;
alter tablespace xifenfei online
*
ERROR at line 1:
ORA-01122: database file 12 failed verification check
ORA-01110: data file 12: '/u01/oracle/ora9i/oradata/xifenfei02.dbf'
ORA-01203: wrong incarnation of this file - wrong creation SCN
BBED> m /x 613B0300
Warning: contents of previous BIFILE will be lost. Proceed? (Y/N) y
 File: /u01/oracle/ora9i/oradata/xifenfei02.dbf (12)
 Block: 1                Offsets:  100 to  131           Dba:0x03000001
------------------------------------------------------------------------
 613b0300 00000000 f35bb133 f39bae33 b46c0200 00000000 00000000 00000000
 <32 bytes per line>
BBED> sum apply
Check value for File 12, Block 1:
current = 0xa0c9, required = 0xa0c9
--Reports that the datafile SCN is newer than the controlfile (because the healthy datafile had been brought online)
SQL> alter tablespace xifenfei online;
alter tablespace xifenfei online
*
ERROR at line 1:
ORA-01122: database file 12 failed verification check
ORA-01110: data file 12: '/u01/oracle/ora9i/oradata/xifenfei02.dbf'
ORA-01207: file is more recent than controlfile - old controlfile
SQL> set lines 160
SQL> col name for a50
SQL> select ts#,file#,BYTES/1024/1024/1024 status,enabled,
  2  to_char(checkpoint_change#,'999999999999999') "SCN",
  3  to_char(last_change#,'999999999999999')"STOP_SCN",
  4  name from v$datafile;
  TS#  FILE#     STATUS ENABLED SCN      STOP_SCN         NAME
----- --- ---------- ---------- -------- ------------- ---------------------------------------------
    0   1  .37109375 READ WRITE   198231               /u01/oracle/ora9i/oradata/ora9i/system01.dbf
    1   2   .1953125 READ WRITE   198231               /u01/oracle/ora9i/oradata/ora9i/undotbs01.dbf
    3   3  .01953125 READ WRITE   198231               /u01/oracle/ora9i/oradata/ora9i/cwmlite01.dbf
    4   4  .01953125 READ WRITE   198231               /u01/oracle/ora9i/oradata/ora9i/drsys01.dbf
    5   5 .145874023 READ WRITE   198231               /u01/oracle/ora9i/oradata/ora9i/example01.dbf
    6   6 .024414063 READ WRITE   198231               /u01/oracle/ora9i/oradata/ora9i/indx01.dbf
    7   7  .01953125 READ WRITE   198231               /u01/oracle/ora9i/oradata/ora9i/odm01.dbf
    8   8 .009765625 READ WRITE   198231               /u01/oracle/ora9i/oradata/ora9i/tools01.dbf
    9   9 .024414063 READ WRITE   198231               /u01/oracle/ora9i/oradata/ora9i/users01.dbf
   10  10 .043945313 READ WRITE   198231               /u01/oracle/ora9i/oradata/ora9i/xdb01.dbf
   12  11 .009765625 DISABLED     228353        228353 /u01/oracle/ora9i/oradata/xifenfei01.dbf
   12  12          0 DISABLED     213917        213917 /u01/oracle/ora9i/oradata/xifenfei02.dbf
12 rows selected.
SQL> set pages 100
SQL> set linesize 150
SQL> select ts#,file#,TABLESPACE_NAME,status,
  2  to_char(CREATION_TIME,'yyyy-mm-dd hh24:mi:ss') CREATE_TIME,
  3  to_char(checkpoint_change#,'9999999999999999') "SCN",
  4  to_char(RESETLOGS_CHANGE#,'9999999999999999') "RESETLOGS SCN",FUZZY
  5  from v$datafile_header;
TS# FILE# TABLESPACE_NAME  STATUS  CREATE_TIME         SCN       RESETLOGS SCN     FUZ
--- ----- ---------------- ------- ------------------- --------- ----------------- ---
  0     1 SYSTEM           ONLINE  2004-03-10 12:17:49    198231            158900 YES
  1     2 UNDOTBS1         ONLINE  2004-03-10 20:30:20    198231            158900 YES
  3     3 CWMLITE          ONLINE  2004-03-10 12:18:10    198231            158900 YES
  4     4 DRSYS            ONLINE  2004-03-10 12:18:11    198231            158900 YES
  5     5 EXAMPLE          ONLINE  2004-03-10 12:18:12    198231            158900 YES
  6     6 INDX             ONLINE  2004-03-10 12:18:16    198231            158900 YES
  7     7 ODM              ONLINE  2004-03-10 12:18:17    198231            158900 YES
  8     8 TOOLS            ONLINE  2004-03-10 12:18:19    198231            158900 YES
  9     9 USERS            ONLINE  2004-03-10 12:18:19    198231            158900 YES
 10    10 XDB              ONLINE  2004-03-10 12:18:20    198231            158900 YES
 12    11 XIFENFEI         OFFLINE 2014-12-25 17:57:07    228353            158900 NO
  0    12                  OFFLINE                             0                 0
12 rows selected.
BBED> m /x 9D430300
 File: /u01/oracle/ora9i/oradata/xifenfei02.dbf (12)
 Block: 1                Offsets:  140 to  171           Dba:0x03000001
------------------------------------------------------------------------
 9d430300 00000000 f670b133 01000000 02000000 fc4e0000 10000000 02000000
 <32 bytes per line>
BBED> SUM APPLY
Check value for File 12, Block 1:
current = 0x9982, required = 0x9982
SQL> ALTER TABLESPACE XIFENFEI ONLINE;
Tablespace altered.
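
The `dd` command at the start of this transcript writes 2049 blocks for a 16M datafile, not 2048. The `ls -l` output earlier shows the same one-block overhead: xifenfei01.dbf was created with size 10M but occupies 10493952 bytes. A minimal sketch of that arithmetic, with hypothetical helper names:

```python
BLOCK_SIZE = 8192  # block size used in the dd command above


def os_blocks(datafile_bytes: int, block_size: int = BLOCK_SIZE) -> int:
    """Blocks to write with dd: the declared size plus one extra leading block."""
    if datafile_bytes % block_size:
        raise ValueError("datafile size must be a multiple of the block size")
    return datafile_bytes // block_size + 1


def os_bytes(datafile_bytes: int, block_size: int = BLOCK_SIZE) -> int:
    """Size of the file as seen by the operating system."""
    return os_blocks(datafile_bytes, block_size) * block_size


print(os_blocks(16 * 1024 * 1024))  # 2049, the dd count for xifenfei02.dbf
print(os_bytes(10 * 1024 * 1024))   # 10493952, the ls -l size of xifenfei01.dbf
```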

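The following errors commonly appear during the bbed recovery:
ORA-01122: database file 12 failed verification check
ORA-01110: data file 12: '/u01/oracle/ora9i/oradata/xifenfei02.dbf'
ORA-01251: Unknown File Header Version read for file number 12
This is mainly because the block address stored in the header does not match the actual file and block number, so the database cannot recognize the file.

ORA-01122: database file 12 failed verification check
ORA-01110: data file 12: '/u01/oracle/ora9i/oradata/xifenfei02.dbf'
ORA-01203: wrong incarnation of this file - wrong creation SCN
This happens because the creation SCN in the datafile header is inconsistent with the value recorded in the controlfile and in file$.

ORA-01122: database file 12 failed verification check
ORA-01110: data file 12: '/u01/oracle/ora9i/oradata/xifenfei02.dbf'
ORA-01207: file is more recent than controlfile - old controlfile
This happens because the healthy datafile had already been brought online, or because the SCN used when constructing the file header is greater than the SCN recorded in the controlfile for the file being constructed.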
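
The hex strings typed after `m /x` in the BBED session are little-endian encodings of ordinary numbers. A small sketch for producing them, assuming a little-endian platform (as the dumps suggest) and the usual 10-bit file / 22-bit block split of a relative block address. `bbed_hex_u32` and `rdba` are hypothetical helper names.

```python
import struct


def bbed_hex_u32(value: int) -> str:
    """Render a 32-bit value as the little-endian hex string used with `m /x`."""
    return struct.pack("<I", value).hex()


def rdba(file_no: int, block_no: int) -> int:
    """Relative data block address: 10-bit file number, 22-bit block number."""
    if not (0 <= file_no < 1 << 10 and 0 <= block_no < 1 << 22):
        raise ValueError("file or block number out of range")
    return (file_no << 22) | block_no


print(bbed_hex_u32(rdba(12, 1)))  # 01000003, written at offset 4 for file 12
print(bbed_hex_u32(rdba(11, 1)))  # 0100c002, the value copied over from file 11
print(bbed_hex_u32(213917))       # 9d430300, the SCN shown for file 12 in v$datafile
```

The last value matches the final edit at offset 140: the SCN written into the fabricated header is the one the controlfile already records for file 12, which is why the ORA-01207 check stops failing.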

Follow-up steps

SQL> SELECT COUNT(*) FROM CHF.T_XIFENFEI01;
  COUNT(*)
----------
     30758
SQL> C/01/02
  1* SELECT COUNT(*) FROM CHF.T_XIFENFEI02
SQL> /
  COUNT(*)
----------
       865
SQL> ALTER DATABASE DATAFILE 12 OFFLINE DROP;
Database altered.

This shows that, by fabricating the datafile with bbed, the data can be recovered completely.

Recovery with dul

SQL> ALTER TABLESPACE XIFENFEI OFFLINE;
Tablespace altered.
[ora9i@xifenfei dul]$ ./dul
Data UnLoader: 10.2.0.5.33 - Internal Only - on Thu Dec 25 21:18:54 2014
with 64-bit io functions
Copyright (c) 1994 2014 Bernard van Duijnen All rights reserved.
 Strictly Oracle Internal Use Only
Within one week you will need a more recent DUL version for this os
Found db_id = 2666699570
Found db_name = ORA9I
DUL> bootstrap;
Probing file = 1, block = 417
  database version 9 bootstrap$ at file 1, block 377
. unloading table                BOOTSTRAP$
DUL: Warning: block number is non zero but marked deferred trying to process it anyhow
      57 rows unloaded
DUL: Warning: Dictionary cache DC_BOOTSTRAP is empty
Reading BOOTSTRAP.dat 57 entries loaded
Parsing Bootstrap$ contents
Generating dict.ddl for version 9
 OBJ$: segobjno 18, file 1 block 121
 TAB$: segobjno 2, tabno 1, file 1  block 25
 COL$: segobjno 2, tabno 5, file 1  block 25
 USER$: segobjno 10, tabno 1, file 1  block 89
Running generated file "@dict.ddl" to unload the dictionary tables
. unloading table                      OBJ$   31490 rows unloaded
. unloading table                      TAB$     909 rows unloaded
. unloading table                      COL$   35582 rows unloaded
. unloading table                     USER$      63 rows unloaded
Reading USER.dat 63 entries loaded
Reading OBJ.dat 31490 entries loaded and sorted 31490 entries
Reading TAB.dat 909 entries loaded
Reading COL.dat 35582 entries loaded and sorted 35582 entries
Reading BOOTSTRAP.dat 57 entries loaded
DUL: Warning: Recreating file "dict.ddl"
Generating dict.ddl for version 9
 OBJ$: segobjno 18, file 1 block 121
 TAB$: segobjno 2, tabno 1, file 1  block 25
 COL$: segobjno 2, tabno 5, file 1  block 25
 USER$: segobjno 10, tabno 1, file 1  block 89
 TABPART$: segobjno 230, file 1 block 1657
 INDPART$: segobjno 234, file 1 block 1689
 TABCOMPART$: segobjno 249, file 1 block 1809
 INDCOMPART$: segobjno 253, file 1 block 1841
 TABSUBPART$: segobjno 240, file 1 block 1737
 INDSUBPART$: segobjno 245, file 1 block 1777
 IND$: segobjno 2, tabno 3, file 1  block 25
 ICOL$: segobjno 2, tabno 4, file 1  block 25
 LOB$: segobjno 2, tabno 6, file 1  block 25
 COLTYPE$: segobjno 2, tabno 7, file 1  block 25
 TYPE$: segobjno 296, tabno 1, file 1  block 2129
 COLLECTION$: segobjno 296, tabno 2, file 1  block 2129
 ATTRIBUTE$: segobjno 296, tabno 3, file 1  block 2129
 LOBFRAG$: segobjno 258, file 1 block 1881
 LOBCOMPPART$: segobjno 261, file 1 block 1905
 UNDO$: segobjno 15, file 1 block 105
 TS$: segobjno 6, tabno 2, file 1  block 57
 PROPS$: segobjno 101, file 1 block 769
Running generated file "@dict.ddl" to unload the dictionary tables
. unloading table                      OBJ$
DUL: Warning: Recreating file "OBJ.ctl"
   31490 rows unloaded
. unloading table                      TAB$
DUL: Warning: Recreating file "TAB.ctl"
     909 rows unloaded
. unloading table                      COL$
DUL: Warning: Recreating file "COL.ctl"
   35582 rows unloaded
. unloading table                     USER$
DUL: Warning: Recreating file "USER.ctl"
      63 rows unloaded
. unloading table                  TABPART$      55 rows unloaded
. unloading table                  INDPART$     128 rows unloaded
. unloading table               TABCOMPART$       0 rows unloaded
. unloading table               INDCOMPART$       0 rows unloaded
. unloading table               TABSUBPART$       0 rows unloaded
. unloading table               INDSUBPART$       0 rows unloaded
. unloading table                      IND$    1396 rows unloaded
. unloading table                     ICOL$    1842 rows unloaded
. unloading table                      LOB$     427 rows unloaded
. unloading table                  COLTYPE$    1160 rows unloaded
. unloading table                     TYPE$     956 rows unloaded
. unloading table               COLLECTION$     250 rows unloaded
. unloading table                ATTRIBUTE$    3623 rows unloaded
. unloading table                  LOBFRAG$       0 rows unloaded
. unloading table              LOBCOMPPART$       0 rows unloaded
. unloading table                     UNDO$      21 rows unloaded
. unloading table                       TS$      13 rows unloaded
. unloading table                    PROPS$      25 rows unloaded
Reading USER.dat 63 entries loaded
Reading OBJ.dat 31490 entries loaded and sorted 31490 entries
Reading TAB.dat 909 entries loaded
Reading COL.dat 35582 entries loaded and sorted 35582 entries
Reading TABPART.dat 55 entries loaded and sorted 55 entries
Reading TABCOMPART.dat 0 entries loaded and sorted 0 entries
Reading TABSUBPART.dat 0 entries loaded and sorted 0 entries
Reading INDPART.dat 128 entries loaded and sorted 128 entries
Reading INDCOMPART.dat 0 entries loaded and sorted 0 entries
Reading INDSUBPART.dat 0 entries loaded and sorted 0 entries
Reading IND.dat 1396 entries loaded
Reading LOB.dat 427 entries loaded
Reading ICOL.dat 1842 entries loaded
Reading COLTYPE.dat 1160 entries loaded
Reading TYPE.dat 956 entries loaded
Reading ATTRIBUTE.dat 3623 entries loaded
Reading COLLECTION.dat 250 entries loaded
Reading BOOTSTRAP.dat 57 entries loaded
Reading LOBFRAG.dat 0 entries loaded and sorted 0 entries
Reading LOBCOMPPART.dat 0 entries loaded and sorted 0 entries
Reading UNDO.dat 21 entries loaded
Reading TS.dat 13 entries loaded
Reading PROPS.dat 25 entries loaded
Database character set is ZHS16GBK
Database national character set is AL16UTF16
DUL> unload table chf.t_xifenfei01;
. unloading table              T_XIFENFEI01   30758 rows unloaded
DUL> unload table chf.t_xifenfei02;
. unloading table              T_XIFENFEI02     865 rows unloaded
DUL>

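This shows that when the tablespace cannot be brought online, dul handles this kind of failure easily, although the tablespace has to be rebuilt, whereas the bbed approach in principle does not require a rebuild. If many objects are involved and only a few files are lost, bbed may well be the better choice.
An additional note: if the lost files did contain data, similar bbed and dul methods still apply. With bbed, once the tablespace is online you have to consider skipping the lost file. With dul the procedure is exactly the same.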
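
To confirm that dul unloaded everything, the row counts in its output can be compared with the counts obtained from SQL. A minimal sketch, assuming the single-line "rows unloaded" format shown in the dul session above. `unloaded_counts` and `mismatches` are hypothetical helper names.

```python
import re

UNLOAD_RE = re.compile(r"unloading table\s+(\S+)\s+(\d+) rows unloaded")


def unloaded_counts(dul_output: str) -> dict:
    """Map table name -> rows unloaded, for entries that report a row count."""
    return {name: int(rows) for name, rows in UNLOAD_RE.findall(dul_output)}


def mismatches(expected: dict, dul_output: str) -> dict:
    """Return {table: (expected, actual)} where counts differ or are missing."""
    actual = unloaded_counts(dul_output)
    return {t: (n, actual.get(t)) for t, n in expected.items() if actual.get(t) != n}


sample = """\
. unloading table              T_XIFENFEI01   30758 rows unloaded
. unloading table              T_XIFENFEI02     865 rows unloaded
"""
print(mismatches({"T_XIFENFEI01": 30758, "T_XIFENFEI02": 865}, sample))  # {}
```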

Recovering an ASM diskgroup that cannot mount after a PVID was mistakenly set on ASM disks

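A friend contacted me: after their storage was switched from a direct attachment to AIX to a storage network with fibre channel switches, RAC would not start, and he asked for support. Analysis showed that the disk order was initially wrong after the link change, and the maintenance staff then set PVIDs on the ASM disks. As a result the ASM diskgroups could not mount. The ASM disks of the diskgroup holding the votedisk were inaccessible, so the RAC cssd process could not start, and the diskgroup holding the datafiles could not mount either. The disks were repaired successfully with kfed, with zero data loss.
Platform and version information (2-node RAC)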

$ sqlplus -v
SQL*Plus: Release 11.2.0.4.0 Production
$ uname -a
AIX db2 1 7 00F9733E4C00

Errors in the GI log

2014-12-20 16:44:08.769:
[ohasd(6946818)]CRS-2769:Unable to failover resource 'ora.diskmon'.
2014-12-20 16:44:11.775:
[cssd(9502756)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds;
Details at (:CSSNM00070:) in /u01/app/11.2.0/grid/log/db1/cssd/ocssd.log
2014-12-20 16:44:26.791:
[cssd(9502756)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds;
Details at (:CSSNM00070:) in /u01/app/11.2.0/grid/log/db1/cssd/ocssd.log
2014-12-20 16:44:41.812:
[cssd(9502756)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds;
Details at (:CSSNM00070:) in /u01/app/11.2.0/grid/log/db1/cssd/ocssd.log

This shows that RAC cannot start because it cannot find the votedisk during startup. Going back through the log identifies the disks that hold the votedisk:

2014-12-15 17:36:15.424:
[cssd(10027070)]CRS-1605:CSSD voting file is online: /dev/rhdisk4; details in /u01/app/11.2.0/grid/log/db1/cssd/ocssd.log
2014-12-15 17:36:15.433:
[cssd(10027070)]CRS-1605:CSSD voting file is online: /dev/rhdisk5; details in /u01/app/11.2.0/grid/log/db1/cssd/ocssd.log
2014-12-15 17:36:15.445:
[cssd(10027070)]CRS-1605:CSSD voting file is online: /dev/rhdisk6; details in /u01/app/11.2.0/grid/log/db1/cssd/ocssd.log
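
When the GI alert log is long, the voting-file devices can be pulled out of the CRS-1605 messages with a short script. This is a sketch that assumes the message layout shown in the excerpt above. `voting_devices` is a hypothetical helper name.

```python
import re

VOTING_RE = re.compile(r"CRS-1605:CSSD voting file is online: (\S+?);")


def voting_devices(log_text: str) -> list:
    """Return the distinct voting-file device paths in first-seen order."""
    seen = []
    for dev in VOTING_RE.findall(log_text):
        if dev not in seen:
            seen.append(dev)
    return seen


sample = (
    "[cssd(10027070)]CRS-1605:CSSD voting file is online: /dev/rhdisk4; details in ocssd.log\n"
    "[cssd(10027070)]CRS-1605:CSSD voting file is online: /dev/rhdisk5; details in ocssd.log\n"
    "[cssd(10027070)]CRS-1605:CSSD voting file is online: /dev/rhdisk6; details in ocssd.log\n"
)
print(voting_devices(sample))  # ['/dev/rhdisk4', '/dev/rhdisk5', '/dev/rhdisk6']
```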

From this we know that rhdisk4, rhdisk5 and rhdisk6 are the votedisk disks. Use kfed to inspect the disk header:

$kfed read /dev/rhdisk4
kfbh.endian:                        201 ; 0x000: 0xc9
kfbh.hard:                          194 ; 0x001: 0xc2
kfbh.type:                          212 ; 0x002: *** Unknown Enum ***
kfbh.datfmt:                        193 ; 0x003: 0xc1
kfbh.block.blk:                       0 ; 0x004: blk=0
kfbh.block.obj:                       0 ; 0x008: file=0
kfbh.check:                           0 ; 0x00c: 0x00000000
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
1102BEE00 C9C2D4C1 00000000 00000000 00000000  [................]
1102BEE10 00000000 00000000 00000000 00000000  [................]
        Repeat 6 times
1102BEE80 00F9733D 67553E0A 00000000 00000000  [..s=gU>.........]
1102BEE90 00000000 00000000 00000000 00000000  [................]
  Repeat 246 times
KFED-00322: Invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][212]
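
The raw dump above starts with C9C2D4C1 and carries 00F9733D 67553E0A at offset 0x80, where an ASM disk header should be. Below is a short sketch for recognizing that pattern on a copy of the first sector. It assumes, beyond what the source states, that C9C2D4C1 is the EBCDIC string "IBMA" written by AIX and that the 8 bytes at offset 0x80 are the PVID. `describe_sector0` is a hypothetical helper name.

```python
def describe_sector0(sector: bytes) -> dict:
    """Report whether the first sector looks PVID-stamped rather than an ASM header."""
    if len(sector) < 0x88:
        raise ValueError("need at least the first 136 bytes of the disk")
    marker = sector[:4]
    return {
        "marker_hex": marker.hex(),
        # cp037 is an EBCDIC code page; C9 C2 D4 C1 decodes to "IBMA" there.
        "looks_like_aix_lvm": marker.decode("cp037", errors="replace") == "IBMA",
        "pvid_hex": sector[0x80:0x88].hex(),
    }


# Rebuild the relevant bytes exactly as they appear in the kfed dump.
sector = bytearray(0x90)
sector[0:4] = bytes.fromhex("C9C2D4C1")
sector[0x80:0x88] = bytes.fromhex("00F9733D67553E0A")
print(describe_sector0(bytes(sector)))
```

On a real system the bytes would come from a read-only copy of the first sector, never from a tool that writes to the device.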
$kfed read /dev/rhdisk4 blkn=1
kfbh.endian:                          0 ; 0x000: 0x00
kfbh.hard:                          130 ; 0x001: 0x82
kfbh.type:                            2 ; 0x002: KFBTYP_FREESPC
kfbh.datfmt:                          2 ; 0x003: 0x02
kfbh.block.blk:                       1 ; 0x004: blk=1
kfbh.block.obj:              2147483648 ; 0x008: disk=0
kfbh.check:                  3883664132 ; 0x00c: 0xe77c0304
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
kfdfsb.aunum:                         0 ; 0x000: 0x00000000
kfdfsb.max:                         254 ; 0x004: 0x00fe
kfdfsb.cnt:                          23 ; 0x006: 0x0017
kfdfsb.bound:                         0 ; 0x008: 0x0000
kfdfsb.flag:                          1 ; 0x00a: B=1
kfdfsb.ub1spare:                      0 ; 0x00b: 0x00
kfdfsb.spare[0]:                      0 ; 0x00c: 0x00000000
kfdfsb.spare[1]:                      0 ; 0x010: 0x00000000
kfdfsb.spare[2]:                      0 ; 0x014: 0x00000000
kfdfse[0].fse:                      119 ; 0x018: FREE=0x7 FRAG=0x7
kfdfse[1].fse:                       16 ; 0x019: FREE=0x0 FRAG=0x1
…………
$kfed read /dev/rhdisk4 blkn=510
kfbh.endian:                          0 ; 0x000: 0x00
kfbh.hard:                          130 ; 0x001: 0x82
kfbh.type:                            1 ; 0x002: KFBTYP_DISKHEAD
kfbh.datfmt:                          1 ; 0x003: 0x01
kfbh.block.blk:                     254 ; 0x004: blk=254
kfbh.block.obj:              2147483648 ; 0x008: disk=0
kfbh.check:                  3460116983 ; 0x00c: 0xce3d31f7
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
kfdhdb.driver.provstr:         ORCLDISK ; 0x000: length=8
kfdhdb.driver.reserved[0]:            0 ; 0x008: 0x00000000
kfdhdb.driver.reserved[1]:            0 ; 0x00c: 0x00000000
kfdhdb.driver.reserved[2]:            0 ; 0x010: 0x00000000
kfdhdb.driver.reserved[3]:            0 ; 0x014: 0x00000000
kfdhdb.driver.reserved[4]:            0 ; 0x018: 0x00000000
kfdhdb.driver.reserved[5]:            0 ; 0x01c: 0x00000000
kfdhdb.compat:                186646528 ; 0x020: 0x0b200000
kfdhdb.dsknum:                        0 ; 0x024: 0x0000
kfdhdb.grptyp:                        2 ; 0x026: KFDGTP_NORMAL
kfdhdb.hdrsts:                        3 ; 0x027: KFDHDR_MEMBER
kfdhdb.dskname:                CRS_0000 ; 0x028: length=8
kfdhdb.grpname:                     CRS ; 0x048: length=3
kfdhdb.fgname:                 CRS_0000 ; 0x068: length=8
…………

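The analysis above makes it almost certain that the ASM disk header has been corrupted: block 0 has an unknown block type, while the blocks after it and the header copy at blkn=510 are still valid. Next, look for the cause of the damage.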
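Two values in the kfed output can be cross-checked with simple arithmetic. The sketch below is illustrative only. It assumes the commonly described ASM layout in which a backup copy of the disk header sits in the second-to-last block of allocation unit 1, 4 KB metadata blocks, a 1 MB allocation unit, and the usual Oracle packed version encoding for `kfdhdb.compat`. The function names are hypothetical.

```python
# Sketch only: the block layout and the version encoding are assumptions,
# not taken from Oracle documentation.

def backup_header_block(au_size: int, block_size: int = 4096) -> int:
    """Block number (as passed to kfed blkn=) of the backup disk header,
    assuming it is the second-to-last block of allocation unit 1."""
    blocks_per_au = au_size // block_size
    return 2 * blocks_per_au - 2


def decode_compat(value: int) -> str:
    """Decode a packed version number laid out as 8, 4, 8, 4, 8 bits."""
    parts = (
        (value >> 24) & 0xFF,
        (value >> 20) & 0x0F,
        (value >> 12) & 0xFF,
        (value >> 8) & 0x0F,
        value & 0xFF,
    )
    return ".".join(str(p) for p in parts)


print(backup_header_block(1024 * 1024))  # 1 MB allocation unit
print(decode_compat(186646528))          # kfdhdb.compat from the blkn=510 read
```

Under these assumptions the backup header lands on block 510, the block read above, whose `kfbh.block.blk` of 254 is its position inside allocation unit 1 (510 - 256).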

[db2/dev#]lspv
hdisk0          00f9733ef7cf27e9                    rootvg          active
hdisk1          00f9733e21b953e6                    rootvg          active
hdisk2          00f9733e21b97a83                    appvg           active
hdisk3          00f9733e21b98434                    appvg           active
hdisk4          00f9733d67553e0a                    None
hdisk5          00f9733d67553f31                    None
hdisk6          00f9733d67554011                    None
hdisk7          00f9733d67554165                    None
hdisk8          00f9733d675541e5                    None
hdisk9          00f9733d675542e4                    None
hdisk10         none                                None
[db2/dev#]ls -l rhdisk*
crw-------    2 root     system       24,  1 Oct 18 11:45 rhdisk0
crw-------    1 root     system       24,  3 Oct 18 13:27 rhdisk1
crw-------    1 root     system       24,  5 Dec 20 20:02 rhdisk10
crw-------    1 root     system       24,  2 Oct 18 13:32 rhdisk2
crw-------    1 root     system       24,  0 Oct 18 13:32 rhdisk3
crw-rw----    1 grid     asmadmin     24,  8 Dec 20 20:02 rhdisk4
crw-rw----    1 grid     asmadmin     24,  9 Dec 20 20:02 rhdisk5
crw-rw----    1 grid     asmadmin     24, 10 Dec 20 20:02 rhdisk6
crw-rw----    1 grid     asmadmin     24,  4 Dec 20 20:02 rhdisk7
crw-rw----    1 grid     asmadmin     24,  6 Dec 20 20:02 rhdisk8
crw-rw----    1 grid     asmadmin     24,  7 Dec 20 20:02 rhdisk9

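This essentially shows that a PVID was written over the start of each disk, which is what corrupted the ASM disk header: the PVID that lspv reports for hdisk4 (00f9733d67553e0a) is the same value that appears at offset 0x80 in the kfed dump of block 0. Next, check the ASM log to determine which disks are used as ASM disks.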
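The match between the kfed hex dump and the lspv output can be checked mechanically. This is a minimal sketch that assumes kfed prints each 32-bit word in on-disk byte order on this big-endian AIX host; the two dump rows and the PVID are copied from the output above, and the variable names are illustrative.

```python
# Rows copied from the kfed dump of /dev/rhdisk4, keyed by their offset
# from the first dumped address (0x1102BEE00).
DUMP_ROWS = {
    0x00: "C9C2D4C1 00000000 00000000 00000000",
    0x80: "00F9733D 67553E0A 00000000 00000000",
}
LSPV_PVID_HDISK4 = "00f9733d67553e0a"  # from the lspv output


def row_bytes(row: str) -> bytes:
    """Turn one dump row of space-separated hex words into raw bytes."""
    return bytes.fromhex(row.replace(" ", ""))


# The first four bytes are not ASCII; decode them as EBCDIC (code page 037).
magic = row_bytes(DUMP_ROWS[0x00])[:4].decode("cp037")
# The first eight bytes at offset 0x80 are compared with the PVID from lspv.
pvid = row_bytes(DUMP_ROWS[0x80])[:8].hex()

print("magic:", magic)
print("pvid :", pvid, "matches lspv:", pvid == LSPV_PVID_HDISK4)
```

The four leading bytes decode to the EBCDIC string IBMA, and the eight bytes at offset 0x80 equal the PVID that lspv shows for hdisk4, which is consistent with an AIX PVID having been written into the first block of the disk.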

SQL> CREATE DISKGROUP CRS NORMAL REDUNDANCY  DISK '/dev/rhdisk4',
'/dev/rhdisk5',
'/dev/rhdisk6' ATTRIBUTE 'compatible.asm'='11.2.0.0.0','au_size'='1M' /* ASMCA */
NOTE: Assigning number (1,0) to disk (/dev/rhdisk4)
NOTE: Assigning number (1,1) to disk (/dev/rhdisk5)
NOTE: Assigning number (1,2) to disk (/dev/rhdisk6)
NOTE: initializing header on grp 1 disk CRS_0000
NOTE: initializing header on grp 1 disk CRS_0001
NOTE: initializing header on grp 1 disk CRS_0002
SQL> CREATE DISKGROUP DATA EXTERNAL REDUNDANCY  DISK
'/dev/rhdisk9' SIZE 614400M  ATTRIBUTE 'compatible.asm'='11.2.0.0.0','au_size'='1M' /* ASMCA */
NOTE: Assigning number (2,0) to disk (/dev/rhdisk9)
NOTE: initializing header on grp 2 disk DATA_0000
SQL> CREATE DISKGROUP FBA EXTERNAL REDUNDANCY  DISK
'/dev/rhdisk8' SIZE 204800M  ATTRIBUTE 'compatible.asm'='11.2.0.0.0','au_size'='1M' /* ASMCA */
NOTE: Assigning number (3,0) to disk (/dev/rhdisk8)
NOTE: initializing header on grp 3 disk FBA_0000
SQL> CREATE DISKGROUP ARCH EXTERNAL REDUNDANCY  DISK
'/dev/rhdisk7' SIZE 102400M  ATTRIBUTE 'compatible.asm'='11.2.0.0.0','au_size'='1M' /* ASMCA */
NOTE: Assigning number (4,0) to disk (/dev/rhdisk7)
NOTE: initializing header on grp 4 disk ARCH_0000

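This confirms that the ASM disks are rhdisk[4-9]. kfed shows the same problem on all of them as on rhdisk4, which also agrees with the lspv output. After repairing the ASM disk headers with kfed repair, the disk groups mount: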

SQL> alter diskgroup data mount;
Diskgroup altered.
SQL> alter diskgroup fba mount;
Diskgroup altered.
SQL> alter diskgroup arch mount;
Diskgroup altered.
SQL> alter diskgroup crs mount;
Diskgroup altered.
SQL> select group_number,disk_number,path from v$asm_disk;
GROUP_NUMBER DISK_NUMBER PATH
------------ ----------- --------------------------------------------------
           2           0 /dev/rhdisk4
           2           1 /dev/rhdisk5
           2           2 /dev/rhdisk6
           1           0 /dev/rhdisk7
           4           0 /dev/rhdisk8
           3           0 /dev/rhdisk9
6 rows selected.
SQL> select group_number,name from v$asm_diskgroup;
GROUP_NUMBER NAME
------------ ------------------------------------------------------------
           1 ARCH
           2 CRS
           3 DATA
           4 FBA

This shows that once the disk headers were repaired with kfed, all ASM disk groups mounted successfully, and GI is back to normal as well:

[db2/#]crsctl status res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.ARCH.dg
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
ora.CRS.dg
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
ora.DATA.dg
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
ora.FBA.dg
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
ora.LISTENER.lsnr
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
ora.asm
               ONLINE  ONLINE       db1                      Started
               ONLINE  ONLINE       db2                      Started
ora.gsd
               OFFLINE OFFLINE      db1
               OFFLINE OFFLINE      db2
ora.net1.network
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
ora.ons
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
ora.registry.acfs
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       db1
ora.cvu
      1        ONLINE  ONLINE       db1
ora.db1.vip
      1        ONLINE  ONLINE       db1
ora.db2.vip
      1        ONLINE  ONLINE       db2
ora.nkora.db
      1        ONLINE  ONLINE       db1                      Open
      2        ONLINE  ONLINE       db2                      Open
ora.oc4j
      1        ONLINE  ONLINE       db1
ora.scan1.vip
      1        ONLINE  ONLINE       db1

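One problem was overlooked here: the PVIDs were not cleared before the disk headers were repaired, so after the repair the PVIDs are still stored in the ODM: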

[db2/dev#]lspv
hdisk0          00f9733ef7cf27e9                    rootvg          active
hdisk1          00f9733e21b953e6                    rootvg          active
hdisk2          00f9733e21b97a83                    appvg           active
hdisk3          00f9733e21b98434                    appvg           active
hdisk4          00f9733d67553e0a                    None
hdisk5          00f9733d67553f31                    None
hdisk6          00f9733d67554011                    None
hdisk7          00f9733d67554165                    None
hdisk8          00f9733d675541e5                    None
hdisk9          00f9733d675542e4                    None
hdisk10         none                                None

Analysis showed that the FBA disk group contains no data, so it was used to test clearing the PVID directly:

$ sqlplus / as sysasm
SQL*Plus: Release 11.2.0.4.0 Production on Sun Dec 21 03:13:31 2014
Copyright (c) 1982, 2013, Oracle.  All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options
SQL> alter diskgroup fba dismount;
Diskgroup altered.
SQL> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options
$ exit
You have mail in /usr/spool/mail/root
[db2/#]chdev -l hdisk8 -a pv=clear
hdisk8 changed
[db2/#]lspv
hdisk0          00f9733ef7cf27e9                    rootvg          active
hdisk1          00f9733e21b953e6                    rootvg          active
hdisk2          00f9733e21b97a83                    appvg           active
hdisk3          00f9733e21b98434                    appvg           active
hdisk4          00f9733d67553e0a                    None
hdisk5          00f9733d67553f31                    None
hdisk6          00f9733d67554011                    None
hdisk7          00f9733d67554165                    None
hdisk8          none                                None
hdisk9          00f9733d675542e4                    None
hdisk10         none                                None
[db2/#]su - grid
$ sqlplus / as sysasm
SQL*Plus: Release 11.2.0.4.0 Production on Sun Dec 21 03:15:19 2014
Copyright (c) 1982, 2013, Oracle.  All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options
SQL> alter diskgroup fba mount;
Diskgroup altered.
SQL> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options

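The test shows that the ASM disk header keeps working after the PVID is cleared directly. Shut down GI, clear the PVIDs on all of hdisk[4-9] with chdev, then start GI again; everything is normal: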
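The selection of disks to clear can be sketched as a small script. It only builds the `chdev` command lines used above and never executes them. The lspv text is the output captured after hdisk8 was cleared, the ASM disk list comes from the ASM log, and the function names are hypothetical.

```python
LSPV_OUTPUT = """\
hdisk0          00f9733ef7cf27e9                    rootvg          active
hdisk1          00f9733e21b953e6                    rootvg          active
hdisk2          00f9733e21b97a83                    appvg           active
hdisk3          00f9733e21b98434                    appvg           active
hdisk4          00f9733d67553e0a                    None
hdisk5          00f9733d67553f31                    None
hdisk6          00f9733d67554011                    None
hdisk7          00f9733d67554165                    None
hdisk8          none                                None
hdisk9          00f9733d675542e4                    None
hdisk10         none                                None
"""

# Disks named in the CREATE DISKGROUP statements of the ASM log.
ASM_DISKS = {"hdisk4", "hdisk5", "hdisk6", "hdisk7", "hdisk8", "hdisk9"}


def disks_with_stale_pvid(lspv_text, asm_disks):
    """Return ASM disks that still carry a PVID and belong to no volume group."""
    stale = []
    for line in lspv_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        name, pvid, vg = fields[:3]
        if name in asm_disks and pvid.lower() != "none" and vg == "None":
            stale.append(name)
    return stale


def chdev_commands(disks):
    """Build (but do not run) the chdev command for each disk."""
    return [f"chdev -l {disk} -a pv=clear" for disk in disks]


for command in chdev_commands(disks_with_stale_pvid(LSPV_OUTPUT, ASM_DISKS)):
    print(command)
```

Restricting the list to disks that are known ASM members and belong to no volume group keeps rootvg and appvg disks out of the generated commands.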

[db1/#]crsctl status res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.ARCH.dg
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
ora.CRS.dg
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
ora.DATA.dg
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
ora.FBA.dg
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
ora.LISTENER.lsnr
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
ora.asm
               ONLINE  ONLINE       db1                      Started
               ONLINE  ONLINE       db2                      Started
ora.gsd
               OFFLINE OFFLINE      db1
               OFFLINE OFFLINE      db2
ora.net1.network
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
ora.ons
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
ora.registry.acfs
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       db1
ora.cvu
      1        ONLINE  ONLINE       db1
ora.db1.vip
      1        ONLINE  ONLINE       db1
ora.db2.vip
      1        ONLINE  ONLINE       db2
ora.nkora.db
      1        ONLINE  ONLINE       db1                      Open
      2        ONLINE  ONLINE       db2                      Open
ora.oc4j
      1        ONLINE  ONLINE       db1
ora.scan1.vip
      1        ONLINE  ONLINE       db1
[db1/#]lspv
hdisk0          00f9733df7c7a9db                    rootvg          active
hdisk1          00f9733d21dad8fe                    rootvg          active
hdisk2          00f9733d21dbd08b                    appvg           active
hdisk3          00f9733d21dbd2ab                    appvg           active
hdisk4          none                                None
hdisk5          none                                None
hdisk6          none                                None
hdisk7          none                                None
hdisk8          none                                None
hdisk9          none                                None
hdisk10         none                                None

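At this point the ASM instance whose disk headers were damaged by setting PVIDs is back to normal, with zero data loss.
Tip: on AIX, never set a PVID on a disk that is used as an ASM disk. Doing so corrupts the ASM disk header, and the disk group can no longer be mounted.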

An uncommented invalid line in hosts causes INS-41112 on the OUI Network Interface Usage screen during RAC installation

While installing RAC on AIX, the OUI Network Interface Usage screen reported INS-41112, and the installation could not continue.


Error summary

Cause - Installer has detected that network interface en6 does not maintain connectivity on all cluster nodes.
Action - Ensure that the chosen interface has been configured across all cluster nodes.  Additional Information:
Summary of the failed nodes xifenfei01  
- PRVF-4190 : Verification of the hosts config file failed

Name resolution entries in hosts

10.70.89.68     xifenfei01
10.70.89.69     xifenfei01-vip
10.70.89.100    xifenfei01-priv
10.70.89.71     xifenfei02
10.70.89.72     xifenfei02-vip
10.70.89.101    xifenfei02-priv
10.70.89.79     xifenfei-scan

Network interface configuration

xifenfei01:/u01/soft/grid> ifconfig -a
en7: flags=1e084863,c0<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST,GROUPRT,64BIT,CHECKSUM_OFFLOAD(ACTIVE),LARGESEND,CHAIN>
        inet 10.70.89.100 netmask 0xffffffe0 broadcast 10.70.89.127
         tcp_sendspace 131072 tcp_recvspace 65536 rfc1323 0
en6: flags=1e084863,c0<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST,GROUPRT,64BIT,CHECKSUM_OFFLOAD(ACTIVE),LARGESEND,CHAIN>
        inet 10.70.89.68 netmask 0xffffffe0 broadcast 10.70.89.95
         tcp_sendspace 131072 tcp_recvspace 65536 rfc1323 0
lo0: flags=e08084b,c0<UP,BROADCAST,LOOPBACK,RUNNING,SIMPLEX,MULTICAST,GROUPRT,64BIT,LARGESEND,CHAIN>
        inet 127.0.0.1 netmask 0xff000000 broadcast 127.255.255.255
        inet6 ::1%1/0
         tcp_sendspace 131072 tcp_recvspace 131072 rfc1323 1
xifenfei02/#ifconfig -a
en6: flags=1e084863,c0<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST,GROUPRT,64BIT,CHECKSUM_OFFLOAD(ACTIVE),LARGESEND,CHAIN>
        inet 10.70.89.71 netmask 0xffffffe0 broadcast 10.70.89.95
         tcp_sendspace 131072 tcp_recvspace 65536 rfc1323 0
en7: flags=1e084863,c0<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST,GROUPRT,64BIT,CHECKSUM_OFFLOAD(ACTIVE),LARGESEND,CHAIN>
        inet 10.70.89.101 netmask 0xffffffe0 broadcast 10.70.89.127
         tcp_sendspace 131072 tcp_recvspace 65536 rfc1323 0
lo0: flags=e08084b,c0<UP,BROADCAST,LOOPBACK,RUNNING,SIMPLEX,MULTICAST,GROUPRT,64BIT,LARGESEND,CHAIN>
        inet 127.0.0.1 netmask 0xff000000 broadcast 127.255.255.255
        inet6 ::1%1/0
         tcp_sendspace 131072 tcp_recvspace 131072 rfc1323 1
xifenfei01/asmdisks#netstat -in
Name Mtu Network Address Ipkts Ierrs Opkts Oerrs Coll
en7 1500 link#2 0.11.25.bd.b8.7a 107 0 132 2 0
en7 1500 10.70.89.96 10.70.89.100 107 0 132 2 0
en6 1500 link#3 0.11.25.bd.a8.93 50015 0 36963 2 0
en6 1500 10.70.89.64 10.70.89.68 50015 0 36963 2 0
lo0 16896 link#1 1589 0 1588 0 0
lo0 16896 127 127.0.0.1 1589 0 1588 0 0
lo0 16896 ::1%1 1589 0 1588 0 0
xifenfei02/asmdisks#netstat -in
Name Mtu Network Address Ipkts Ierrs Opkts Oerrs Coll
en6 1500 link#2 0.11.25.bd.a8.a9 5401 0 3660 2 0
en6 1500 10.70.89.64 10.70.89.71 5401 0 3660 2 0
en7 1500 link#3 0.11.25.bd.51.d2 129 0 123 2 0
en7 1500 10.70.89.96 10.70.89.101 129 0 123 2 0
lo0 16896 link#1 1249 0 1249 0 0
lo0 16896 127 127.0.0.1 1249 0 1249 0 0
lo0 16896 ::1%1 1249 0 1249 0 0

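The output shows that the interfaces and IP addresses match on both nodes, although the two hosts list the interfaces in a different order. The EtherChannel configuration was checked as well and is also normal.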
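The subnet each interface belongs to can be derived from the address and the hexadecimal netmask that ifconfig prints. This is a small sketch using Python's standard `ipaddress` module; the addresses and the netmask are copied from the output above, and the helper name `subnet_of` is illustrative.

```python
import ipaddress


def subnet_of(ip: str, hex_netmask: str) -> ipaddress.IPv4Network:
    """Return the network that contains ip, given an ifconfig-style hex netmask."""
    mask = ipaddress.IPv4Address(int(hex_netmask, 16))
    return ipaddress.ip_network(f"{ip}/{mask}", strict=False)


NETMASK = "0xffffffe0"
INTERFACES = {
    ("xifenfei01", "en6"): "10.70.89.68",
    ("xifenfei01", "en7"): "10.70.89.100",
    ("xifenfei02", "en6"): "10.70.89.71",
    ("xifenfei02", "en7"): "10.70.89.101",
}

for (node, nic), ip in INTERFACES.items():
    net = subnet_of(ip, NETMASK)
    print(node, nic, ip, net, net.broadcast_address)
```

Both en6 addresses fall in 10.70.89.64/27 and both en7 addresses in 10.70.89.96/27, so the two nodes agree on the subnet of each interface name.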


Ping test

xifenfei01:/u01/soft/grid> ping xifenfei01-priv
PING xifenfei01-priv: (10.70.89.100): 56 data bytes
64 bytes from 10.70.89.100: icmp_seq=0 ttl=255 time=0 ms
64 bytes from 10.70.89.100: icmp_seq=1 ttl=255 time=0 ms
xifenfei02/#ping xifenfei01-priv
PING xifenfei01-priv: (10.70.89.100): 56 data bytes
64 bytes from 10.70.89.100: icmp_seq=0 ttl=255 time=0 ms
64 bytes from 10.70.89.100: icmp_seq=1 ttl=255 time=0 ms

Check the network configuration with runcluvfy.sh

xifenfei01:/u01/soft/grid> ./runcluvfy.sh comp nodecon -i en7 -n xifenfei01-priv,xifenfei02-priv -verbose
Verifying node connectivity
Checking node connectivity...
Checking hosts config file...
  Node Name     Status                    Comment
  ------------  ------------------------  ------------------------
  xifenfei02-priv  passed                    successful
  xifenfei01-priv  failed                    Invalid Entry
ERROR:
PRVF-4190 : Verification of the hosts config file failed
Interface information for node "xifenfei02-priv"
 Name   IP Address      Subnet          Gateway         Def. Gateway    HW Address        MTU
 ------ --------------- --------------- --------------- --------------- ----------------- ------
 en6    10.70.89.71     10.70.89.64     10.70.89.71     10.70.89.65     00:11:25:BD:A8:A9 1500
 en7    10.70.89.101    10.70.89.96     10.70.89.101    10.70.89.65     00:11:25:BD:51:D2 1500
Interface information for node "xifenfei01-priv"
 Name   IP Address      Subnet          Gateway         Def. Gateway    HW Address        MTU
 ------ --------------- --------------- --------------- --------------- ----------------- ------
 en7    10.70.89.100    10.70.89.96     10.70.89.100    10.70.89.65     00:11:25:BD:B8:7A 1500
 en6    10.70.89.68     10.70.89.64     10.70.89.68     10.70.89.65     00:11:25:BD:A8:93 1500
Check: Node connectivity for interface "en7"
  Source                          Destination                     Connected?
  ------------------------------  ------------------------------  ----------------
  xifenfei02-priv[10.70.89.101]     xifenfei01-priv[10.70.89.100]     yes
Result: Node connectivity passed for interface "en7"
Check: TCP connectivity of subnet "10.70.89.96"
  Source                          Destination                     Connected?
  ------------------------------  ------------------------------  ----------------
  xifenfei01:10.70.89.68            xifenfei02-priv:10.70.89.101      passed
  xifenfei01:10.70.89.68            xifenfei01-priv:10.70.89.100      passed
Result: TCP connectivity check passed for subnet "10.70.89.96"
Checking subnet mask consistency...
Subnet mask consistency check passed for subnet "10.70.89.64".
Subnet mask consistency check passed for subnet "10.70.89.96".
Subnet mask consistency check passed.
Result: Node connectivity check failed
Verification of node connectivity was unsuccessful.
Checks did not pass for the following node(s):
        xifenfei01-priv

The check fails for xifenfei01-priv with PRVF-4190. Inspecting the hosts file on xifenfei01 turns up one bad entry:

xifenfei01/#vi /etc/hosts
"/etc/hosts" 113 lines, 3556 characters
# @(#)47        1.1  src/bos/usr/sbin/netstart/hosts, cmdnet, bos530 7/24/91 10:
00:46
# IBM_PROLOG_BEGIN_TAG
# This is an automatically generated prolog.
#
# bos530 src/bos/usr/sbin/netstart/hosts 1.1
#
# Licensed Materials - Property of IBM
#
# (C) COPYRIGHT International Business Machines Corp. 1985,1989
# All Rights Reserved
#
# US Government Users Restricted Rights - Use, duplication or
# disclosure restricted by GSA ADP Schedule Contract with IBM Corp.

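Here "00:46" sits on a line of its own and is an invalid entry: the end of the first comment line has wrapped onto a new line that no longer starts with #. Remove that line and run runcluvfy.sh again: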
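A stray line like this is easy to miss by eye. As a rough sketch (not the check that cluvfy itself performs), the following flags any hosts line that is neither blank, nor a comment, nor an IP address followed by at least one name. The sample text reuses lines shown in this article; the function name is hypothetical.

```python
import ipaddress


def invalid_hosts_lines(text):
    """Return (line_number, raw_line) for entries that do not look like
    '<ip> <name> [alias...]' once comments are stripped."""
    bad = []
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue  # blank line or pure comment
        fields = line.split()
        try:
            ipaddress.ip_address(fields[0])
            ok = len(fields) >= 2
        except ValueError:
            ok = False
        if not ok:
            bad.append((lineno, raw))
    return bad


SAMPLE = """\
# @(#)47        1.1  src/bos/usr/sbin/netstart/hosts, cmdnet, bos530 7/24/91 10:
00:46
# IBM_PROLOG_BEGIN_TAG
10.70.89.68     xifenfei01
10.70.89.100    xifenfei01-priv
"""

for lineno, raw in invalid_hosts_lines(SAMPLE):
    print(f"line {lineno}: {raw!r}")
```

On the sample this reports only line 2, the wrapped "00:46" fragment.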

xifenfei01:/u01/soft/grid> ./runcluvfy.sh comp nodecon -i en7 -n xifenfei01-priv,xifenfei02-priv -verbose
Verifying node connectivity
Checking node connectivity...
Checking hosts config file...
  Node Name                             Status
  ------------------------------------  ------------------------
  xifenfei02-priv                         passed
  xifenfei01-priv                         passed
Verification of the hosts config file successful
Interface information for node "xifenfei02-priv"
 Name   IP Address      Subnet          Gateway         Def. Gateway    HW Address        MTU
 ------ --------------- --------------- --------------- --------------- ----------------- ------
 en6    10.70.89.71     10.70.89.64     10.70.89.71     10.70.89.65     00:11:25:BD:A8:A9 1500
 en7    10.70.89.101    10.70.89.96     10.70.89.101    10.70.89.65     00:11:25:BD:51:D2 1500
Interface information for node "xifenfei01-priv"
 Name   IP Address      Subnet          Gateway         Def. Gateway    HW Address        MTU
 ------ --------------- --------------- --------------- --------------- ----------------- ------
 en7    10.70.89.100    10.70.89.96     10.70.89.100    10.70.89.65     00:11:25:BD:B8:7A 1500
 en6    10.70.89.68     10.70.89.64     10.70.89.68     10.70.89.65     00:11:25:BD:A8:93 1500
Check: Node connectivity for interface "en7"
  Source                          Destination                     Connected?
  ------------------------------  ------------------------------  ----------------
  xifenfei02-priv[10.70.89.101]     xifenfei01-priv[10.70.89.100]     yes
Result: Node connectivity passed for interface "en7"
Check: TCP connectivity of subnet "10.70.89.96"
  Source                          Destination                     Connected?
  ------------------------------  ------------------------------  ----------------
  xifenfei01:10.70.89.68            xifenfei02-priv:10.70.89.101      passed
  xifenfei01:10.70.89.68            xifenfei01-priv:10.70.89.100      passed
Result: TCP connectivity check passed for subnet "10.70.89.96"
Checking subnet mask consistency...
Subnet mask consistency check passed for subnet "10.70.89.64".
Subnet mask consistency check passed for subnet "10.70.89.96".
Subnet mask consistency check passed.
Result: Node connectivity check passed
Verification of node connectivity was successful.

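With the invalid entry removed, the runcluvfy check passes and the OUI installation continues normally.
The cause was indeed an invalid entry in /etc/hosts, which made the RAC installation check fail. Once again: be careful with the hosts file when installing RAC.
References: PRVF-4190 Verification of the Hosts Config File Failed (Doc ID 1056025.1)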
[INS-41112] Specified network interface doesnt maintain connectivity across cluster nodes. (Doc ID 1427202.1)

[MySQL recovery] Extracting the MySQL data dictionary directly with a tool

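Anyone familiar with Oracle knows that when an Oracle database cannot start, dul or another third-party tool can read the data straight out of the data files to rescue it and limit the loss. In MySQL, if the InnoDB engine is used, records can likewise be extracted without starting the database. This article covers extracting the data dictionary; later articles will cover unconventional MySQL recovery after drop, truncate, and delete.
Create a test table named get_dict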

mysql> use xifenfei;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
mysql> show tables;
+--------------------+
| Tables_in_xifenfei |
+--------------------+
| t_delete           |
+--------------------+
1 row in set (0.00 sec)
mysql> create table get_dict(id int not null primary key,name varchar(100));
Query OK, 0 rows affected (0.01 sec)
mysql> insert into get_dict value(1,'www.xifenfei.com');
Query OK, 1 row affected (0.00 sec)
mysql> insert into get_dict value(2,'www.xifenfei.com-xifenfei');
Query OK, 1 row affected (0.00 sec)
mysql> insert into get_dict value(3,'xifenfei-www.xifenfei.com');
Query OK, 1 row affected (0.00 sec)
mysql> show tables;
+--------------------+
| Tables_in_xifenfei |
+--------------------+
| get_dict           |
| t_delete           |
+--------------------+
2 rows in set (0.00 sec)
mysql> select TABLE_NAME,TABLE_SCHEMA,TABLE_TYPE from information_schema.tables
    -> where table_name='get_dict';
+------------+--------------+------------+
| TABLE_NAME | TABLE_SCHEMA | TABLE_TYPE |
+------------+--------------+------------+
| get_dict   | xifenfei     | BASE TABLE |
+------------+--------------+------------+
1 row in set (0.01 sec)
mysql> select TABLE_NAME,NON_UNIQUE,TABLE_SCHEMA,INDEX_SCHEMA,INDEX_NAME,COLUMN_NAME from
    -> INFORMATION_SCHEMA.STATISTICS where  TABLE_NAME='get_dict';
+------------+------------+--------------+--------------+------------+-------------+
| TABLE_NAME | NON_UNIQUE | TABLE_SCHEMA | INDEX_SCHEMA | INDEX_NAME | COLUMN_NAME |
+------------+------------+--------------+--------------+------------+-------------+
| get_dict   |          0 | xifenfei     | xifenfei     | PRIMARY    | id          |
+------------+------------+--------------+--------------+------------+-------------+
1 row in set (0.00 sec)
mysql> select TABLE_SCHEMA,TABLE_NAME,COLUMN_NAME,ORDINAL_POSITION from
    -> information_schema.COLUMNS  where table_name='get_dict';
+--------------+------------+-------------+------------------+
| TABLE_SCHEMA | TABLE_NAME | COLUMN_NAME | ORDINAL_POSITION |
+--------------+------------+-------------+------------------+
| xifenfei     | get_dict   | id          |                1 |
| xifenfei     | get_dict   | name        |                2 |
+--------------+------------+-------------+------------------+
2 rows in set (0.01 sec)

Shut down the MySQL database

[root@localhost recovery_mysql]# service mysql stop
Shutting down MySQL..[  OK  ]
[root@localhost recovery_mysql]# ps -ef|grep mysql
root     18876 15827  0 18:05 pts/1    00:00:00 grep mysql

Parse the InnoDB file with the tool

[root@localhost recovery_mysql]# ./stream_parser -f /var/lib/mysql/ibdata1
Opening file: /var/lib/mysql/ibdata1
File information:
Opening file: /var/lib/mysql/ibdata1
File information:
ID of device containing file:         2054
inode number:                      1782889
ID of device containing file:         2054
protection:                         100660 inode number:                      1782889
(regular file)
protection:                         100660 number of hard links:                    1
(regular file)
user ID of owner:                      101
number of hard links:                    1
group ID of owner:                     102
user ID of owner:                      101
device ID (if special file):             0
group ID of owner:                     102
blocksize for filesystem I/O:         4096
device ID (if special file):             0
number of blocks allocated:          24616
blocksize for filesystem I/O:         4096
Opening file: /var/lib/mysql/ibdata1
number of blocks allocated:          24616
File information:
………………
user ID of owner:                      101
group ID of owner:                     102
device ID (if special file):             0
blocksize for filesystem I/O:         4096
ID of device containing file:         2054
number of blocks allocated:          24616
inode number:                      1782889
protection:                         100660 (regular file)
number of hard links:                    1
user ID of owner:                      101
group ID of owner:                     102
device ID (if special file):             0
blocksize for filesystem I/O:         4096
number of blocks allocated:          24616
time of last access:            1417922668 Sun Dec  7 11:24:28 2014
time of last modification:      1418294104 Thu Dec 11 18:35:04 2014
time of last status change:     1418294104 Thu Dec 11 18:35:04 2014
time of last access:            1417922668 Sun Dec  7 11:24:28 2014
total size, in bytes:             12582912 (12.000 MiB)
time of last modification:      1418294104 Thu Dec 11 18:35:04 2014
time of last status change:     1418294104 Thu Dec 11 18:35:04 2014
Size to process:                  12582912 (12.000 MiB)
total size, in bytes:             12582912 (12.000 MiB)
Size to process:                  12582912 (12.000 MiB)
All workers finished in 0 sec

The main files

[root@localhost recovery_mysql]# ls -l  pages-ibdata1/FIL_PAGE_INDEX/
total 1388
-rw-r--r-- 1 root root  16384 Dec 11 18:51 0000000000000001.page
-rw-r--r-- 1 root root  16384 Dec 11 18:51 0000000000000002.page
-rw-r--r-- 1 root root  49152 Dec 11 18:51 0000000000000003.page
-rw-r--r-- 1 root root  49152 Dec 11 18:51 0000000000000004.page
-rw-r--r-- 1 root root  16384 Dec 11 18:51 0000000000000005.page
-rw-r--r-- 1 root root 114688 Dec 11 18:51 0000000000000011.page
-rw-r--r-- 1 root root 114688 Dec 11 18:51 0000000000000012.page
-rw-r--r-- 1 root root 114688 Dec 11 18:51 0000000000000013.page
-rw-r--r-- 1 root root 114688 Dec 11 18:51 0000000000000014.page
-rw-r--r-- 1 root root 114688 Dec 11 18:51 0000000000000015.page
-rw-r--r-- 1 root root 147456 Dec 11 18:51 0000000000000016.page
-rw-r--r-- 1 root root  98304 Dec 11 18:51 0000000000000017.page
-rw-r--r-- 1 root root 114688 Dec 11 18:51 0000000000000018.page
-rw-r--r-- 1 root root  49152 Dec 11 18:51 0000000000000019.page
-rw-r--r-- 1 root root  49152 Dec 11 18:51 0000000000000020.page
-rw-r--r-- 1 root root  49152 Dec 11 18:51 0000000000000021.page
-rw-r--r-- 1 root root  65536 Dec 11 18:51 0000000000000025.page
-rw-r--r-- 1 root root  16384 Dec 11 18:51 18446744069414584320.page

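0000000000000001.page mainly holds the table information (SYS_TABLES)
0000000000000002.page mainly holds the column information for each table (SYS_COLUMNS)
0000000000000003.page mainly holds the index information for each table (SYS_INDEXES)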
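The file names and sizes in the listing follow a simple pattern that can be reproduced with a few lines. This sketch infers the naming from the listing itself (the index id zero-padded to 16 digits) and assumes the default 16 KB InnoDB page size; the dictionary of ids and the function names are illustrative.

```python
PAGE_SIZE = 16384  # assumed default InnoDB page size

# Index ids of the dictionary tables, as used in the list above.
DICT_INDEX_IDS = {"SYS_TABLES": 1, "SYS_COLUMNS": 2, "SYS_INDEXES": 3}


def page_file_name(index_id: int) -> str:
    """File name pattern seen in pages-ibdata1/FIL_PAGE_INDEX/."""
    return f"{index_id:016d}.page"


def page_count(file_size: int) -> int:
    """Number of whole pages in a .page file of the given size."""
    pages, remainder = divmod(file_size, PAGE_SIZE)
    if remainder:
        raise ValueError("file size is not a multiple of the page size")
    return pages


for table, index_id in DICT_INDEX_IDS.items():
    print(table, "->", page_file_name(index_id))

print(page_count(49152))                   # size of 0000000000000003.page
print(page_file_name(0xFFFFFFFF00000000))  # the one 20-digit file name
```

The last line shows that the odd-looking name 18446744069414584320.page is just the 64-bit id 0xFFFFFFFF00000000 written in decimal.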

Extract the table data

[root@localhost recovery_mysql]# ./c_parser -4f pages-ibdata1/FIL_PAGE_INDEX/0000000000000001.page -t dictionary/SYS_TABLES.sql > dumps/default/SYS_TABLES 2> dumps/default/SYS_TABLES.sql
[root@localhost recovery_mysql]# grep get dumps/default/SYS_TABLES | head -5
000000000D1D    95000001510110  SYS_TABLES      "xifenfei/get\_dict"    23      2       1       0       80      ""      9
[root@localhost recovery_mysql]# cat dumps/default/SYS_TABLES.sql
SET FOREIGN_KEY_CHECKS=0;
LOAD DATA LOCAL INFILE '/tmp/recovery_mysql/dumps/default/SYS_TABLES' REPLACE INTO TABLE `SYS_TABLES`
FIELDS TERMINATED BY '\t' OPTIONALLY ENCLOSED BY '"' LINES STARTING BY 'SYS_TABLES\t'
 (`NAME`, `ID`, `N_COLS`, `TYPE`, `MIX_ID`, `MIX_LEN`, `CLUSTER_NAME`, `SPACE`);

Extract the column data

[root@localhost recovery_mysql]# ./c_parser -4f pages-ibdata1/FIL_PAGE_INDEX/0000000000000002.page -t dictionary/SYS_COLUMNS.sql > dumps/default/SYS_COLUMNS 2> dumps/default/SYS_COLUMNS.sql
[root@localhost recovery_mysql]# cat dumps/default/SYS_COLUMNS
-- Page id: 10, Format: REDUNDANT, Records list: Valid, Expected records: (115 115)
000000000300    800000012D0123  SYS_COLUMNS     11      0       "ID"    1       4       0       0
000000000300    800000012D0138  SYS_COLUMNS     11      1       "FOR\_NAME"     1       4       0       0
…………
000000000D1D    95000001510129  SYS_COLUMNS     23      0       "id"    6       1283    4       0
000000000D1D    9500000151013E  SYS_COLUMNS     23      1       "name"  1       524303  100     0
-- Page id: 10, Found records: 115, Lost records: NO, Leaf page: YES
[root@localhost recovery_mysql]# more dumps/default/SYS_COLUMNS.sql
SET FOREIGN_KEY_CHECKS=0;
LOAD DATA LOCAL INFILE '/tmp/recovery_mysql/dumps/default/SYS_COLUMNS' REPLACE INTO TABLE
 `SYS_COLUMNS` FIELDS TERMINATED BY '\t' OPTIONALLY ENCLOSED BY
'"' LINES STARTING BY 'SYS_COLUMNS\t' (`TABLE_ID`, `POS`, `NAME`, `MTYPE`, `PRTYPE`, `LEN`, `PREC`);
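The PRTYPE values of the two get_dict columns (1283 and 524303) are packed bit fields. The sketch below decodes them under the assumption that the layout matches InnoDB's data0type.h: the MySQL type code in the low byte, flag bits 256 (NOT NULL), 512 (UNSIGNED), and 1024 (binary), and the collation id starting at bit 16. The function name is illustrative.

```python
# Flag values assumed from InnoDB's data0type.h.
DATA_NOT_NULL = 256
DATA_UNSIGNED = 512
DATA_BINARY_TYPE = 1024


def decode_prtype(prtype: int) -> dict:
    """Split a SYS_COLUMNS.PRTYPE value into its assumed components."""
    return {
        "mysql_type": prtype & 0xFF,
        "not_null": bool(prtype & DATA_NOT_NULL),
        "unsigned": bool(prtype & DATA_UNSIGNED),
        "binary": bool(prtype & DATA_BINARY_TYPE),
        "collation_id": (prtype >> 16) & 0xFF,
    }


print("id  :", decode_prtype(1283))
print("name:", decode_prtype(524303))
```

With this reading, `id` comes out as MySQL type 3 with the NOT NULL bit set and `name` as MySQL type 15 without it, which agrees with the `int not null` and `varchar(100)` definitions in the CREATE TABLE statement.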

Extract the index data

[root@localhost recovery_mysql]# ./c_parser -4f pages-ibdata1/FIL_PAGE_INDEX/0000000000000003.page -t dictionary/SYS_INDEXES.sql > dumps/default/SYS_INDEXES 2> dumps/default/SYS_INDEXES.sql
[root@localhost recovery_mysql]# more dumps/default/SYS_INDEXES.sql
SET FOREIGN_KEY_CHECKS=0;
LOAD DATA LOCAL INFILE '/tmp/recovery_mysql/dumps/default/SYS_INDEXES' REPLACE INTO TABLE
 `SYS_INDEXES` FIELDS TERMINATED BY '\t' OPTIONALLY ENCLOSED BY '"' LINES STARTING BY 'SYS_INDEXES\t'
(`TABLE_ID`, `ID`, `NAME`, `N_FIELDS`, `TYPE`, `SPACE`, `PAGE_NO`);
[root@localhost recovery_mysql]# more dumps/default/SYS_INDEXES
-- Page id: 11, Format: REDUNDANT, Records list: Valid, Expected records: (13 13)
000000000300    800000012D0177  SYS_INDEXES     11      11      "ID\_IND"       1       3       0       302
…………
000000000B02    820000013504C8  SYS_INDEXES     20      22      "GEN\_CLUST\_INDEX"     0       1       6       3
000000000D1D    9500000151016B  SYS_INDEXES     23      25      "PRIMARY"       1       3       9       3
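The TYPE column in the SYS_INDEXES rows above is a bit field. As a rough illustration, the sketch below decodes it with flag values as I recall them from InnoDB's dictionary headers (1 = clustered, 2 = unique, 4 = universal, 8 = insert buffer); treat those values and the helper name `decode_index_type` as assumptions to check against the source of the MySQL version in use.

```python
# Assumed InnoDB index type flags (recalled from dict0mem.h, not taken from the post).
INDEX_FLAGS = {1: "CLUSTERED", 2: "UNIQUE", 4: "UNIVERSAL", 8: "IBUF"}

def decode_index_type(type_value):
    """Return the names of the flags set in a SYS_INDEXES.TYPE value."""
    return [name for bit, name in sorted(INDEX_FLAGS.items()) if type_value & bit]

# TYPE values seen in the dump: 3 for "PRIMARY", 1 for "GEN_CLUST_INDEX".
print(decode_index_type(3))
print(decode_index_type(1))
```

Under these assumptions the PRIMARY index of table 23 is a unique clustered index, while GEN_CLUST_INDEX is the clustered index InnoDB generates for a table without a primary key.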

Starting the MySQL database

[root@localhost recovery_mysql]# service mysql start
Starting MySQL..[  OK  ]
[root@localhost recovery_mysql]# ps -ef|grep mysql
root     18948     1  0 19:57 pts/1    00:00:00 /bin/sh /usr/bin/mysqld_safe --datadir=/var/lib/mysql
--pid-file=/var/lib/mysql/localhost.localdomain.pid
mysql    19049 18948 14 19:57 pts/1    00:00:00 /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql
--plugin-dir=/usr/lib64/mysql/plugin --user=mysql
--log-error=/var/lib/mysql/localhost.localdomain.err --pid-file=/var/lib/mysql/localhost.localdomain.pid
root     19078 15827  0 19:58 pts/1    00:00:00 grep mysql

Creating the tables for the extracted data dictionary

mysql> source dictionary/SYS_TABLES.sql
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.00 sec)
…………
mysql> source dictionary/SYS_INDEXES.sql
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.00 sec)
…………
mysql> source  dictionary/SYS_COLUMNS.sql
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.00 sec)
…………
mysql> show tables;
+----------------+
| Tables_in_test |
+----------------+
| SYS_COLUMNS    |
| SYS_INDEXES    |
| SYS_TABLES     |
+----------------+
3 rows in set (0.00 sec)

Loading the extracted data dictionary data

mysql> source dumps/default/SYS_TABLES.sql
Query OK, 0 rows affected (0.00 sec)
Query OK, 11 rows affected (0.03 sec)
Records: 11  Deleted: 0  Skipped: 0  Warnings: 0
mysql> source  dumps/default/SYS_INDEXES.sql
Query OK, 0 rows affected (0.00 sec)
Query OK, 39 rows affected (0.01 sec)
Records: 39  Deleted: 0  Skipped: 0  Warnings: 0
mysql> source dumps/default/SYS_COLUMNS.sql
Query OK, 0 rows affected (0.00 sec)
Query OK, 115 rows affected (0.00 sec)
Records: 115  Deleted: 0  Skipped: 0  Warnings: 0

Verifying the extracted data dictionary data

mysql> desc SYS_TABLES
    -> ;
+--------------+---------------------+------+-----+---------+-------+
| Field        | Type                | Null | Key | Default | Extra |
+--------------+---------------------+------+-----+---------+-------+
| NAME         | varchar(255)        | NO   | PRI |         |       |
| ID           | bigint(20) unsigned | NO   |     | 0       |       |
| N_COLS       | int(10)             | YES  |     | NULL    |       |
| TYPE         | int(10) unsigned    | YES  |     | NULL    |       |
| MIX_ID       | bigint(20) unsigned | YES  |     | NULL    |       |
| MIX_LEN      | int(10) unsigned    | YES  |     | NULL    |       |
| CLUSTER_NAME | varchar(255)        | YES  |     | NULL    |       |
| SPACE        | int(10) unsigned    | YES  |     | NULL    |       |
+--------------+---------------------+------+-----+---------+-------+
8 rows in set (0.00 sec)
mysql> SELECT NAME,ID from SYS_TABLES WHERE NAME='xifenfei/get_dict';
+-------------------+----+
| NAME              | ID |
+-------------------+----+
| xifenfei/get_dict | 23 |
+-------------------+----+
1 row in set (0.00 sec)
mysql> desc SYS_COLUMNS
    -> ;
+----------+---------------------+------+-----+---------+-------+
| Field    | Type                | Null | Key | Default | Extra |
+----------+---------------------+------+-----+---------+-------+
| TABLE_ID | bigint(20) unsigned | NO   | PRI | NULL    |       |
| POS      | int(10) unsigned    | NO   | PRI | NULL    |       |
| NAME     | varchar(255)        | YES  |     | NULL    |       |
| MTYPE    | int(10) unsigned    | YES  |     | NULL    |       |
| PRTYPE   | int(10) unsigned    | YES  |     | NULL    |       |
| LEN      | int(10) unsigned    | YES  |     | NULL    |       |
| PREC     | int(10) unsigned    | YES  |     | NULL    |       |
+----------+---------------------+------+-----+---------+-------+
7 rows in set (0.00 sec)
mysql> SELECT TABLE_ID,NAME,MTYPE FROM SYS_COLUMNS WHERE TABLE_ID=23;
+----------+------+-------+
| TABLE_ID | NAME | MTYPE |
+----------+------+-------+
|       23 | id   |     6 |
|       23 | name |     1 |
+----------+------+-------+
2 rows in set (0.01 sec)
mysql> SELECT TABLE_ID,ID,NAME,TYPE FROM SYS_INDEXES WHERE TABLE_ID=23;
+----------+----+---------+------+
| TABLE_ID | ID | NAME    | TYPE |
+----------+----+---------+------+
|       23 | 25 | PRIMARY |    3 |
+----------+----+---------+------+
1 row in set (0.00 sec)

This shows that the tool can extract the MySQL data dictionary information correctly even when the MySQL database is not running.
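The MTYPE and PRTYPE numbers returned for table 23 can be read back into column properties. The sketch below assumes the usual InnoDB encoding as I recall it from data0type.h: MTYPE is the InnoDB main type, the low byte of PRTYPE is the MySQL field type, bits 256/512/1024 mean NOT NULL / UNSIGNED / binary, and the bits above 16 carry the collation id. The names `MTYPE_NAMES` and `decode_column` are hypothetical.

```python
# Assumed InnoDB main types (subset), recalled from data0type.h.
MTYPE_NAMES = {
    1: "DATA_VARCHAR", 2: "DATA_CHAR", 3: "DATA_FIXBINARY",
    4: "DATA_BINARY", 5: "DATA_BLOB", 6: "DATA_INT",
}

def decode_column(mtype, prtype):
    """Split SYS_COLUMNS.MTYPE / PRTYPE into readable properties."""
    return {
        "mtype": MTYPE_NAMES.get(mtype, "unknown"),
        "mysql_type": prtype & 0xFF,        # MySQL field type code
        "not_null": bool(prtype & 256),
        "unsigned": bool(prtype & 512),
        "binary": bool(prtype & 1024),
        "collation_id": prtype >> 16,
    }

# Values from the SYS_COLUMNS dump for table 23.
print(decode_column(6, 1283))     # column "id"
print(decode_column(1, 524303))   # column "name"
```

With these assumptions, `id` decodes to a NOT NULL integer column (which fits its role as the PRIMARY key) and `name` to a nullable variable-length character column with collation id 8.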

Making a PDB start together with the CDB

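In Oracle 12.1.0.1, application PDBs are not opened automatically when the CDB starts; to get that behavior you have to write a trigger yourself that opens the PDBs once the CDB is open. From 12.1.0.2 onward, running ALTER PLUGGABLE DATABASE PDB SAVE STATE makes the application PDB open automatically after the CDB opens.

After the database starts, the PDB is not opened automatically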

XFF_CDB$ROOT@SYS> startup
ORACLE instance started.
Total System Global Area  805306368 bytes
Fixed Size                  3050800 bytes
Variable Size             394265296 bytes
Database Buffers          297795584 bytes
Redo Buffers                5337088 bytes
In-Memory Area            104857600 bytes
Database mounted.
Database opened.
XFF_CDB$ROOT@SYS> select * from v$version;
BANNER                                                                               CON_ID
-------------------------------------------------------------------------------- ----------
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production              0
PL/SQL Release 12.1.0.2.0 - Production                                                    0
CORE    12.1.0.2.0      Production                                                        0
TNS for 64-bit Windows: Version 12.1.0.2.0 - Production                                   0
NLSRTL Version 12.1.0.2.0 - Production                                                    0
XFF_CDB$ROOT@SYS> show pdbs;
    CON_ID CON_NAME                       OPEN MODE  RESTRICTED
---------- ------------------------------ ---------- ----------
         2 PDB$SEED                       READ ONLY  NO
         3 PDB                            MOUNTED

dba_pdb_saved_states contains no rows

XFF_CDB$ROOT@SYS>  select con_name, state from dba_pdb_saved_states;
no rows selected

save state with the PDB in MOUNTED state

XFF_CDB$ROOT@SYS>  ALTER PLUGGABLE DATABASE PDB save state;
Pluggable database altered.
XFF_CDB$ROOT@SYS>  select con_name, state from dba_pdb_saved_states;
no rows selected

With the PDB in MOUNTED state, save state leaves no row in dba_pdb_saved_states, which shows that the save state did not take effect.

save state with the PDB open: making the PDB start with the CDB

XFF_CDB$ROOT@SYS> ALTER PLUGGABLE DATABASE PDB open;
Pluggable database altered.
XFF_CDB$ROOT@SYS>  ALTER PLUGGABLE DATABASE PDB save state;
Pluggable database altered.
XFF_CDB$ROOT@SYS> col con_name for a20
XFF_CDB$ROOT@SYS>  select con_name, state from dba_pdb_saved_states;
CON_NAME             STATE
-------------------- --------------
PDB                  OPEN

With the PDB open, save state succeeds.
Note that save state only takes effect when it is run while the PDB is open.
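Because the order matters (open first, then save state), it can help to generate the statements from the `show pdbs` listing. The following Python sketch is only an illustration of that rule: it parses text in the layout shown above and emits the two ALTER PLUGGABLE DATABASE statements used in this post. The function name `pdb_autostart_statements` is hypothetical, and nothing here connects to a database.

```python
def pdb_autostart_statements(show_pdbs_output):
    """Build the statements needed so each user PDB opens with the CDB."""
    statements = []
    for line in show_pdbs_output.splitlines():
        parts = line.split()
        # Data rows start with a numeric CON_ID; skip the header and separator.
        if len(parts) < 3 or not parts[0].isdigit():
            continue
        name, open_mode = parts[1], " ".join(parts[2:])
        if name == "PDB$SEED":
            continue                      # the seed is never a candidate
        if open_mode.startswith("MOUNTED"):
            # save state has no effect on a mounted PDB, so open it first
            statements.append(f"ALTER PLUGGABLE DATABASE {name} OPEN;")
        statements.append(f"ALTER PLUGGABLE DATABASE {name} SAVE STATE;")
    return statements

listing = """\
    CON_ID CON_NAME                       OPEN MODE  RESTRICTED
---------- ------------------------------ ---------- ----------
         2 PDB$SEED                       READ ONLY  NO
         3 PDB                            MOUNTED
"""
for stmt in pdb_autostart_statements(listing):
    print(stmt)
```

After running the generated statements, querying dba_pdb_saved_states as above is the way to confirm that the state was actually recorded.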

Restarting the database to test that the PDB starts with the CDB

XFF_CDB$ROOT@SYS> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.
XFF_CDB$ROOT@SYS> startup
ORACLE instance started.
Total System Global Area  805306368 bytes
Fixed Size                  3050800 bytes
Variable Size             364905168 bytes
Database Buffers          327155712 bytes
Redo Buffers                5337088 bytes
In-Memory Area            104857600 bytes
Database mounted.
Database opened.
XFF_CDB$ROOT@SYS> show pdbs;
    CON_ID CON_NAME                       OPEN MODE  RESTRICTED
---------- ------------------------------ ---------- ----------
         2 PDB$SEED                       READ ONLY  NO
         3 PDB                            READ WRITE NO

Disabling PDB startup with the CDB: DISCARD STATE

XFF_CDB$ROOT@SYS>  ALTER PLUGGABLE DATABASE PDB DISCARD  state;
Pluggable database altered.
XFF_CDB$ROOT@SYS>  select con_name, state from dba_pdb_saved_states;
no rows selected
XFF_CDB$ROOT@SYS> shutdown immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.
XFF_CDB$ROOT@SYS> startup
ORACLE instance started.
Total System Global Area  805306368 bytes
Fixed Size                  3050800 bytes
Variable Size             364905168 bytes
Database Buffers          327155712 bytes
Redo Buffers                5337088 bytes
In-Memory Area            104857600 bytes
Database mounted.
Database opened.
XFF_CDB$ROOT@SYS> show pdbs;
    CON_ID CON_NAME                       OPEN MODE  RESTRICTED
---------- ------------------------------ ---------- ----------
         2 PDB$SEED                       READ ONLY  NO
         3 PDB                            MOUNTED
XFF_CDB$ROOT@SYS>

Making PDBs start with the CDB in 12.1.0.1

CREATE TRIGGER open_all_pdbs
   AFTER STARTUP
   ON DATABASE
BEGIN
   EXECUTE IMMEDIATE 'alter pluggable database all open';
END open_all_pdbs;
/

SYSTEM tablespace corrupt block recovery: recovering a corrupt C_TS# block (file 1 block 60)

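A friend phoned to say that a customer's database had failed, that another company had worked on it for a day without recovering it, and asked me to help. When I took over, the database had already been badly damaged. From the alert log I worked out the rough cause of the failure and what the previous company had done. After taking over, I repaired the block with bbed and recovered the database. A large number of ORA-600 errors appeared during this recovery, mainly ORA-00600 4000, ORA-00600 2662, ORA-00600 2663, ORA-00600 krhpfh_03-1209, ORA-00600 3600, ORA-00600 ktsitbs_info1, ORA-00600 4137, ORA-00600 4511, ORA-00600 4198 and ORA-00600 6807.

Cause of the failure: lost redo files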

Thu Nov 20 11:28:39 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_lgwr_1404.trc:
ORA-00313: open failed for members of log group 7 of thread 1
ORA-00312: online log 9 thread 1: '/data2/oradata/redo0902.log'
ORA-27037: unable to obtain file status
SVR4 Error: 2: No such file or directory
Additional information: 3
Thu Nov 20 11:28:39 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_lgwr_1404.trc:
ORA-00313: open failed for members of log group 7 of thread 1
ORA-00312: online log 9 thread 1: '/data2/oradata/redo0902.log'
ORA-27037: unable to obtain file status
SVR4 Error: 2: No such file or directory
Additional information: 3
Thu Nov 20 11:28:39 2014
LGWR: terminating instance due to error 313
Thu Nov 20 11:28:39 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_pmon_1394.trc:
ORA-00313: open failed for members of log group  of thread
Thu Nov 20 11:28:39 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_ckpt_1406.trc:
ORA-00313: open failed for members of log group  of thread
Instance terminated by LGWR, pid = 1404

Attempting recovery by clearing the redo log files

Thu Nov 20 13:04:16 2014
alter database clear logfile group 9
Thu Nov 20 13:04:16 2014
ORA-1624 signalled during: alter database clear logfile group 9...
Thu Nov 20 13:04:45 2014
alter database clear logfile group 9
Thu Nov 20 13:04:46 2014
ORA-1624 signalled during: alter database clear logfile group 9...
Thu Nov 20 13:04:59 2014
alter database clear unarchived logfile group 9
Thu Nov 20 13:04:59 2014
ORA-1624 signalled during: alter database clear unarchived logfile group 9...
Thu Nov 20 13:05:00 2014
alter database clear unarchived logfile group 9
Thu Nov 20 13:05:00 2014
ORA-1624 signalled during: alter database clear unarchived logfile group 9...

Incomplete recovery, then trying to open the database with resetlogs

ORA-279 signalled during: ALTER DATABASE RECOVER  database using backup controlfile  ...
Thu Nov 20 13:49:01 2014
ALTER DATABASE RECOVER    CONTINUE DEFAULT
Thu Nov 20 13:49:02 2014
Media Recovery Log /opt/oracle/flash_recovery_area/xifenfei/archivelog/2014_11_20/o1_mf_1_285999_%u_.arc
Errors with log /opt/oracle/flash_recovery_area/xifenfei/archivelog/2014_11_20/o1_mf_1_285999_%u_.arc
ORA-308 signalled during: ALTER DATABASE RECOVER    CONTINUE DEFAULT  ...
Thu Nov 20 13:49:02 2014
ALTER DATABASE RECOVER    CONTINUE DEFAULT
Thu Nov 20 13:49:02 2014
Media Recovery Log /opt/oracle/flash_recovery_area/xifenfei/archivelog/2014_11_20/o1_mf_1_285999_%u_.arc
Errors with log /opt/oracle/flash_recovery_area/xifenfei/archivelog/2014_11_20/o1_mf_1_285999_%u_.arc
ORA-308 signalled during: ALTER DATABASE RECOVER    CONTINUE DEFAULT  ...
Thu Nov 20 13:49:02 2014
ALTER DATABASE RECOVER CANCEL
Thu Nov 20 13:49:03 2014
Media Recovery Canceled
Completed: ALTER DATABASE RECOVER CANCEL
Thu Nov 20 13:49:33 2014
alter database open resetlogs
Thu Nov 20 13:49:34 2014
ORA-1113 signalled during: alter database open resetlogs...

With the hidden parameter

_allow_resetlogs_corruption= TRUE

set, incomplete recovery was performed; the attempt to open the database then failed with ORA-600 4000

Thu Nov 20 14:35:02 2014
ALTER DATABASE   MOUNT
Thu Nov 20 14:35:07 2014
Setting recovery target incarnation to 2
Thu Nov 20 14:35:07 2014
Successful mount of redo thread 1, with mount id 4039504598
Thu Nov 20 14:35:07 2014
Database mounted in Exclusive Mode
Completed: ALTER DATABASE   MOUNT
Thu Nov 20 14:40:33 2014
ALTER DATABASE RECOVER  database until cancel
Thu Nov 20 14:40:33 2014
Media Recovery Start
Thu Nov 20 14:40:33 2014
Media Recovery failed with error 1610
ORA-283 signalled during: ALTER DATABASE RECOVER  database until cancel  ...
Thu Nov 20 14:41:23 2014
ALTER DATABASE RECOVER  database using backup controlfile until cancel
Thu Nov 20 14:43:08 2014
alter database open resetlogs
Thu Nov 20 14:43:08 2014
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
RESETLOGS after incomplete recovery UNTIL CHANGE 31293973571
Resetting resetlogs activation ID 3855216310 (0xe5c9eeb6)
Online log /data2/oradata/redo0802.log: Thread 1 Group 8 was previously cleared
Online log /data2/oradata/redo0902.log: Thread 1 Group 9 was previously cleared
Thu Nov 20 14:43:14 2014
Setting recovery target incarnation to 3
Thu Nov 20 14:43:14 2014
Assigning activation ID 4039504598 (0xf0c5f2d6)
Thread 1 opened at log sequence 1
  Current log# 9 seq# 1 mem# 0: /data2/oradata/redo0902.log
Successful open of redo thread 1
Thu Nov 20 14:43:14 2014
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Thu Nov 20 14:43:14 2014
SMON: enabling cache recovery
Thu Nov 20 14:43:14 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_1844.trc:
ORA-00600: internal error code, arguments: [4000], [17], [], [], [], [], [], []
Thu Nov 20 14:43:16 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_1844.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00600: internal error code, arguments: [4000], [17], [], [], [], [], [], []
Thu Nov 20 14:43:16 2014
Error 704 happened during db open, shutting down database
USER: terminating instance due to error 704
Instance terminated by USER, pid = 1844
ORA-1092 signalled during: alter database open resetlogs...

Trying the hidden parameter that masks the rollback segments

_corrupted_rollback_segments= _SYSSMU1$, _SYSSMU2$,…………

The error is still ORA-600 4000

Thu Nov 20 15:09:21 2014
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Thu Nov 20 15:09:21 2014
SMON: enabling cache recovery
Thu Nov 20 15:09:21 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_624.trc:
ORA-00600: internal error code, arguments: [4000], [17], [], [], [], [], [], []
Thu Nov 20 15:09:23 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_624.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00600: internal error code, arguments: [4000], [17], [], [], [], [], [], []
Thu Nov 20 15:09:23 2014
Error 704 happened during db open, shutting down database
USER: terminating instance due to error 704
Instance terminated by USER, pid = 624
ORA-1092 signalled during: alter database open

After several restarts and resetlogs attempts, the database raised ORA-600 2662

Successful open of redo thread 1
Thu Nov 20 17:13:24 2014
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Thu Nov 20 17:13:24 2014
SMON: enabling cache recovery
Thu Nov 20 17:13:24 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_7967.trc:
ORA-00600: internal error code, arguments: [2662], [7], [1229382552], [7], [1229560642], [8388633], [], []
Thu Nov 20 17:13:25 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_7967.trc:
ORA-00600: internal error code, arguments: [2662], [7], [1229382552], [7], [1229560642], [8388633], [], []
Thu Nov 20 17:13:25 2014
Error 600 happened during db open, shutting down database
USER: terminating instance due to error 600
Instance terminated by USER, pid = 7967
ORA-1092 signalled during: ALTER DATABASE OPEN...
Thu Nov 20 17:18:23 2014
USER: terminating instance due to error 1092
Instance terminated by USER, pid = 7967
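The arguments of ORA-600 [2662] are commonly read as: current SCN wrap, current SCN base, dependent SCN wrap, dependent SCN base, and the DBA of the block that carried the dependent SCN. Assuming that interpretation (it is not stated in the log itself), the Python sketch below turns the message above into full SCN values and a file/block pair. The helper name `decode_ora600_2662` is hypothetical.

```python
import re

def decode_ora600_2662(message):
    """Decode the arguments of an ORA-600 [2662] line (assumed layout)."""
    args = re.findall(r"\[([^\]]*)\]", message)
    if not args or args[0] != "2662":
        raise ValueError("not an ORA-600 [2662] message")
    cur_wrap, cur_base, dep_wrap, dep_base, dba = (int(a) for a in args[1:6])
    current = (cur_wrap << 32) | cur_base       # SCN = wrap * 2**32 + base
    dependent = (dep_wrap << 32) | dep_base
    return {
        "current_scn": current,
        "dependent_scn": dependent,
        "gap": dependent - current,             # how far the database SCN lags
        "file": dba >> 22,                      # upper 10 bits: relative file number
        "block": dba & 0x3FFFFF,                # lower 22 bits: block number
    }

msg = ("ORA-00600: internal error code, arguments: [2662], [7], [1229382552], "
       "[7], [1229560642], [8388633], [], []")
print(decode_ora600_2662(msg))
```

Read this way, the block holding the newer SCN is in file 2, which later output in this log identifies as undotbs01.dbf, and the database SCN is 178090 behind the SCN recorded in that block.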

Taking the undo data files offline and trying to open the database

Database mounted in Exclusive Mode
Completed: ALTER DATABASE   MOUNT
Thu Nov 20 17:52:31 2014
ALTER DATABASE RECOVER  database until cancel
Thu Nov 20 17:52:31 2014
Media Recovery Start
 parallel recovery started with 15 processes
ORA-279 signalled during: ALTER DATABASE RECOVER  database until cancel  ...
Thu Nov 20 17:53:42 2014
ALTER DATABASE RECOVER CANCEL
Thu Nov 20 17:53:44 2014
ORA-1547 signalled during: ALTER DATABASE RECOVER CANCEL ...
Thu Nov 20 17:56:34 2014
alter database datafile '/opt/oracle/oradata/xifenfei/undotbs01.dbf' offline
Thu Nov 20 17:56:35 2014
Completed: alter database datafile '/opt/oracle/oradata/xifenfei/undotbs01.dbf' offline
Thu Nov 20 17:57:01 2014
alter database datafile '/data2/oradata/undotbs02.dbf' offline
Thu Nov 20 17:57:02 2014
Completed: alter database datafile '/data2/oradata/undotbs02.dbf' offline
Thu Nov 20 17:57:26 2014
alter database datafile '/data2/oradata/undotbs03.dbf' offline
Thu Nov 20 17:57:27 2014
Completed: alter database datafile '/data2/oradata/undotbs03.dbf' offline
Thu Nov 20 17:57:43 2014
alter database open resetlogs
Thu Nov 20 17:57:43 2014
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
ORA-1245 signalled during: alter database open resetlogs...
Thu Nov 20 17:58:58 2014
alter database datafile '/opt/oracle/oradata/xifenfei/undotbs01.dbf' offline drop
Thu Nov 20 17:58:58 2014
Completed: alter database datafile '/opt/oracle/oradata/xifenfei/undotbs01.dbf' offline drop
Thu Nov 20 17:59:15 2014
alter database open resetlogs
Thu Nov 20 17:59:15 2014
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
ORA-1245 signalled during: alter database open resetlogs...
Thu Nov 20 17:59:35 2014
alter database datafile '/data2/oradata/undotbs02.dbf' offline drop
Thu Nov 20 17:59:35 2014
Completed: alter database datafile '/data2/oradata/undotbs02.dbf' offline drop
Thu Nov 20 17:59:50 2014
alter database datafile '/data2/oradata/undotbs03.dbf' offline drop
Thu Nov 20 17:59:50 2014
Completed: alter database datafile '/data2/oradata/undotbs03.dbf' offline drop
Thu Nov 20 18:00:07 2014
alter database open resetlogs
Thu Nov 20 18:00:07 2014
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
RESETLOGS after incomplete recovery UNTIL CHANGE 31294173628
Resetting resetlogs activation ID 4039492628 (0xf0c5c414)
Online log /data2/oradata/redo0802.log: Thread 1 Group 8 was previously cleared
Thu Nov 20 18:00:14 2014
Setting recovery target incarnation to 8
Thu Nov 20 18:00:14 2014
Assigning activation ID 4039504142 (0xf0c5f10e)
Thread 1 opened at log sequence 1
  Current log# 9 seq# 1 mem# 0: /data2/oradata/redo0902.log
Successful open of redo thread 1
Thu Nov 20 18:00:15 2014
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Thu Nov 20 18:00:15 2014
SMON: enabling cache recovery
Thu Nov 20 18:00:15 2014
Successfully onlined Undo Tablespace 1.
Dictionary check beginning
File #2 is offline, but is part of an online tablespace.
data file 2: '/opt/oracle/oradata/xifenfei/undotbs01.dbf'
File #100 is offline, but is part of an online tablespace.
data file 100: '/data2/oradata/undotbs02.dbf'
Thu Nov 20 18:00:28 2014
File #185 is offline, but is part of an online tablespace.
data file 185: '/data2/oradata/undotbs03.dbf'
Dictionary check complete
Thu Nov 20 18:00:35 2014
SMON: enabling tx recovery
Thu Nov 20 18:00:36 2014
Database Characterset is ZHS16CGB231280
Thu Nov 20 18:00:37 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_28472.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-00376: file 185 cannot be read at this time
ORA-01110: data file 185: '/data2/oradata/undotbs03.dbf'
Error 604 happened during db open, shutting down database
USER: terminating instance due to error 604
Thu Nov 20 18:00:37 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_lgwr_28450.trc:
ORA-00604: error occurred at recursive SQL level
Thu Nov 20 18:00:37 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_dbw0_28446.trc:
ORA-00604: error occurred at recursive SQL level
Instance terminated by USER, pid = 28472
ORA-1092 signalled during: alter database open resetlogs...

Some unknown operation then left file 1 block 60 corrupt, most likely a mistaken bbed edit

Thu Nov 20 19:18:15 2014
SMON: enabling cache recovery
Thu Nov 20 19:18:16 2014
Hex dump of (file 1, block 60) in trace file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_13232.trc
Corrupt block relative dba: 0x0040003c (file 1, block 60)
Bad header found during buffer read
Data in bad block:
 type: 128 format: 0 rdba: 0x0040003c
 last change scn: 0x0005.ebe04bc9 seq: 0x2 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x4bc90602
 check value in block header: 0x6faa
 computed block checksum: 0x0
Reread of rdba: 0x0040003c (file 1, block 60) found same corrupted data
Successfully onlined Undo Tablespace 1.
Thu Nov 20 19:18:16 2014
SMON: enabling tx recovery
Thu Nov 20 19:18:17 2014
Database Characterset is ZHS16CGB231280
Thu Nov 20 19:18:17 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_13232.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-00376: file 185 cannot be read at this time
ORA-01110: data file 185: '/data2/oradata/undotbs03.dbf'
Error 604 happened during db open, shutting down database
USER: terminating instance due to error 604
Instance terminated by USER, pid = 13232
ORA-1092 signalled during: alter database open...

Trying incomplete recovery followed by resetlogs

ALTER DATABASE RECOVER  database until cancel
Thu Nov 20 19:33:41 2014
Media Recovery Start
Datafile 2 is on orphaned branch
          File status = 4
        Abs fuzzy SCN = 0
 Hot backup fuzzy SCN = 0
Thu Nov 20 19:33:41 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_20878.trc:
ORA-00600: internal error code, arguments: [krhpfh_03-1209], [2], [864151207], [864153315], [1229402557], [7], [0], [0]
ORA-01110: data file 2: '/opt/oracle/oradata/xifenfei/undotbs01.dbf'
Thu Nov 20 19:33:42 2014
Media Recovery failed with error 600
ORA-283 signalled during: ALTER DATABASE RECOVER  database until cancel  ...
Thu Nov 20 19:34:06 2014
alter database open resetlogs
Thu Nov 20 19:34:06 2014
ORA-1139 signalled during: alter database open resetlogs...
Thu Nov 20 19:34:17 2014
alter database open
Thu Nov 20 19:34:17 2014
ORA-1190 signalled during: alter database open...
Thu Nov 20 19:35:57 2014
ALTER DATABASE RECOVER  database until cancel
Thu Nov 20 19:35:57 2014
Media Recovery Start
Datafile 2 is on orphaned branch
          File status = 4
        Abs fuzzy SCN = 0
 Hot backup fuzzy SCN = 0
Thu Nov 20 19:35:58 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_20878.trc:
ORA-00600: internal error code, arguments: [krhpfh_03-1209], [2], [864151207], [864153315], [1229402557], [7], [0], [0]
ORA-01110: data file 2: '/opt/oracle/oradata/xifenfei/undotbs01.dbf'
Thu Nov 20 19:35:59 2014
Media Recovery failed with error 600
ORA-283 signalled during: ALTER DATABASE RECOVER  database until cancel  ...
Thu Nov 20 19:37:19 2014
alter database open resetlogs
Thu Nov 20 19:37:19 2014
ORA-1139 signalled during: alter database open resetlogs...

Continuing to open the database raised ORA-600 3600

Thu Nov 20 19:43:14 2014
alter database datafile '/opt/oracle/oradata/xifenfei/undotbs01.dbf' offline drop
Thu Nov 20 19:43:14 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_dbw0_20856.trc:
ORA-00600: internal error code, arguments: [3600], [2], [14], [], [], [], [], []
Thu Nov 20 19:43:15 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_dbw0_20856.trc:
ORA-00600: internal error code, arguments: [3600], [2], [14], [], [], [], [], []
Thu Nov 20 19:43:15 2014
DBW0: terminating instance due to error 471
Instance terminated by DBW0, pid = 20856

Across several more restarts and resetlogs attempts, ORA-600 2663 also appeared

Fri Nov 21 12:35:12 2014
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Fri Nov 21 12:35:12 2014
SMON: enabling cache recovery
Fri Nov 21 12:35:13 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_15596.trc:
ORA-00600: internal error code, arguments: [2663], [7], [1229543007], [7], [1229560642], [], [], []
Fri Nov 21 12:35:14 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_15596.trc:
ORA-00600: internal error code, arguments: [2663], [7], [1229543007], [7], [1229560642], [], [], []
Fri Nov 21 12:35:14 2014
Error 600 happened during db open, shutting down database
USER: terminating instance due to error 600
Fri Nov 21 12:35:14 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_mman_15572.trc:
ORA-00600: internal error code, arguments: [], [], [], [], [], [], [], []
Fri Nov 21 12:35:14 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_dbw1_15576.trc:
ORA-00600: internal error code, arguments: [], [], [], [], [], [], [], []
Instance terminated by USER, pid = 15596
ORA-1092 signalled during: ALTER DATABASE OPEN..

Further attempts to open the database raised ORA-600 ktsitbs_info1

SMON: enabling cache recovery
Fri Nov 21 13:54:25 2014
Hex dump of (file 1, block 60) in trace file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_21111.trc
Corrupt block relative dba: 0x0040003c (file 1, block 60)
Bad header found during buffer read
Data in bad block:
 type: 128 format: 0 rdba: 0x0040003c
 last change scn: 0x0005.ebe04bc9 seq: 0x2 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x4bc90602
 check value in block header: 0x6faa
 computed block checksum: 0x0
Reread of rdba: 0x0040003c (file 1, block 60) found same corrupted data
Fri Nov 21 13:54:25 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_21111.trc:
ORA-00600: internal error code, arguments: [ktsitbs_info1], [2], [], [], [], [], [], []
Fri Nov 21 13:54:27 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_21111.trc:
ORA-00600: internal error code, arguments: [ktsitbs_info1], [2], [], [], [], [], [], []
Error 600 happened during db open, shutting down database
USER: terminating instance due to error 600
Instance terminated by USER, pid = 21111
ORA-1092 signalled during: alter database open...

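The above covers the cause of the customer's database failure and the rough course of the earlier handling; what follows is what I did after taking over.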


Checking system01.dbf with dbv gives the following result

HNDX-DB% dbv file=/opt/oracle/oradata/xifenfei/system01.dbf
DBVERIFY: Release 10.2.0.1.0 - Production on Fri Nov 21 16:22:37 2014
Copyright (c) 1982, 2005, Oracle.  All rights reserved.
DBVERIFY - Verification starting : FILE = /opt/oracle/oradata/xifenfei/system01.dbf
Page 60 is marked corrupt
Corrupt block relative dba: 0x0040003c (file 1, block 60)
Bad header found during dbv:
Data in bad block:
 type: 128 format: 0 rdba: 0x0040003c
 last change scn: 0x0005.ebe04bc9 seq: 0x2 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x4bc90602
 check value in block header: 0x6faa
 computed block checksum: 0x0
Corrupt block relative dba: 0x004001f2 (file 1, block 498)
Bad check value found during buffer read
Data in bad block:
 type: 6 format: 2 rdba: 0x004001f2
 last change scn: 0x0007.49499ca1 seq: 0x1 flg: 0x06
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x9ca10601
 check value in block header: 0xe458
 computed block checksum: 0x9720
DBVERIFY - Verification complete
Total Pages Examined         : 786432
Total Pages Processed (Data) : 201131
Total Pages Failing   (Data) : 2
Total Pages Processed (Index): 221394
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 60265
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 303641
Total Pages Marked Corrupt   : 2
Total Pages Influx           : 0
Highest block SCN            : 1229823477 (7.1229823477)

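This shows that the file has two corrupt blocks. From experience with bootstrap$, block 60 is very likely C_TS# (my first reaction was that its block type is abnormal), and block 498 is probably seq$.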
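One way to see why the type byte of block 60 is suspect is to compare the header with the tail check value. The sketch below assumes the commonly described tail layout (low 16 bits of the SCN base, then the block type, then the sequence byte); that layout and the helper names are assumptions, while the numbers come from the dbv output above.

```python
def expected_tail(scn_base, block_type, seq):
    """Tail check value implied by the header fields (assumed layout)."""
    return ((scn_base & 0xFFFF) << 16) | (block_type << 8) | seq

def split_tail(tail):
    """Break a tail check value back into its three parts."""
    return {"scn_low16": tail >> 16, "type": (tail >> 8) & 0xFF, "seq": tail & 0xFF}

# Block 60: header says type 128, SCN base 0xebe04bc9, seq 2; tail is 0x4bc90602.
print(hex(expected_tail(0xEBE04BC9, 128, 2)))   # what the damaged header implies
print(split_tail(0x4BC90602))                   # what the tail still records

# Block 498: header says type 6, SCN base 0x49499ca1, seq 1; tail is 0x9ca10601.
print(hex(expected_tail(0x49499CA1, 6, 1)))
```

Under this assumption the tail of block 60 still records type 6, so only the header type byte disagrees, whereas block 498 has a consistent header and tail and fails on its checksum instead.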

Tracing database startup with event 10046 gives this trace file

PARSING IN CURSOR #1 len=275 dep=2 uid=0 oct=3 lid=0 tim=27978051403575 hv=3408408745 ad='7df93cd0'
select name,online$,contents$,undofile#,undoblock#,blocksize,dflmaxext,dflinit,dflincr,dflextpct,dflminext,
dflminlen, owner#,scnwrp,scnbas, NVL(pitrscnwrp, 0), NVL(pitrscnbas, 0), dflogging, bitmapped, inc#, flags,
plugged, NVL(spare1,0), NVL(spare2,0) from ts$ where ts#=:1
END OF STMT
PARSE #1:c=0,e=92,p=0,cr=0,cu=0,mis=0,r=0,dep=2,og=4,tim=27978051403569
BINDS #1:
kkscoacd
 Bind#0
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=ffffffff7dbac9a8  bln=22  avl=02  flg=05
  value=2
EXEC #1:c=0,e=310,p=0,cr=0,cu=0,mis=0,r=0,dep=2,og=4,tim=27978051404296
WAIT #1: nam='db file sequential read' ela= 42 file#=1 block#=60 blocks=1 obj#=-1 tim=27978051404449
Hex dump of (file 1, block 60)
Corrupt block relative dba: 0x0040003c (file 1, block 60)
Bad header found during buffer read
Data in bad block:
 type: 128 format: 0 rdba: 0x0040003c
 last change scn: 0x0005.ebe04bc9 seq: 0x2 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x4bc90602
 check value in block header: 0x6faa
 computed block checksum: 0x0
Reread of rdba: 0x0040003c (file 1, block 60) found same corrupted data
FETCH #1:c=10000,e=4072,p=1,cr=2,cu=0,mis=0,r=0,dep=2,og=4,tim=27978051408438
STAT #1 id=1 cnt=0 pid=0 pos=1 obj=16 op='TABLE ACCESS CLUSTER TS$ (cr=2 pr=1 pw=0 time=4075 us)'
STAT #1 id=2 cnt=1 pid=1 pos=1 obj=7 op='INDEX UNIQUE SCAN I_TS# (cr=1 pr=0 pw=0 time=13 us)'
*** 2014-11-22 14:44:43.235
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [ktsitbs_info1], [2], [], [], [], [], [], []
Current SQL statement for this session:
select max(maxconcurrency) from sys.wrh$_undostat  where instance_number = :1 and dbid = :2
and snap_id in   (select snap_id from dba_hist_snapshot where end_interval_time >
(select max(end_interval_time)-7 from dba_hist_snapshot))

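The trace shows database startup failing with ORA-00600 [ktsitbs_info1], [2]. The 2 in the second argument is clearly the tablespace number, the same value bound to ts# in the ts$ query just before the error. Because the ts$ block is corrupt, the tablespace information cannot be read from ts$, the data dictionary looks inconsistent, and this error is raised. The key to recovering this database is therefore to repair file 1 block 60.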
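Locating the block that was read just before the error can be done mechanically on a large 10046 trace. The following Python sketch pulls the file and block numbers out of `db file sequential read` WAIT lines in the format shown above; the regular expression and the function name `single_block_reads` are my own and only cover that one line format.

```python
import re

# Matches WAIT lines such as:
# WAIT #1: nam='db file sequential read' ela= 42 file#=1 block#=60 blocks=1 ...
WAIT_RE = re.compile(
    r"WAIT #(\d+): nam='([^']+)' ela= *(\d+) file#=(\d+) block#=(\d+) blocks=(\d+)"
)

def single_block_reads(trace_text):
    """Return (file#, block#) for every 'db file sequential read' wait."""
    return [
        (int(m.group(4)), int(m.group(5)))
        for m in WAIT_RE.finditer(trace_text)
        if m.group(2) == "db file sequential read"
    ]

trace = (
    "EXEC #1:c=0,e=310,p=0,cr=0,cu=0,mis=0,r=0,dep=2,og=4,tim=27978051404296\n"
    "WAIT #1: nam='db file sequential read' ela= 42 file#=1 block#=60 blocks=1 "
    "obj#=-1 tim=27978051404449\n"
    "Hex dump of (file 1, block 60)\n"
)
print(single_block_reads(trace))
```

The last entry returned before the ORA-00600 line in a trace is the block to look at first, which here is file 1 block 60.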

Trying to repair file 1 block 60 with bbed

HNDX-DB% bbed password=blockedit mode=edit
BBED: Release 2.0.0.0.0 - Limited Production on Sat Nov 22 15:16:26 2014
Copyright (c) 1982, 2005, Oracle.  All rights reserved.
************* !!! For Oracle Internal Use only !!! ***************
BBED> set filename '/opt/oracle/oradata/xifenfei/system01.dbf'
        FILENAME        /opt/oracle/oradata/xifenfei/system01.dbf
BBED> set block 8192
        BLOCK#          8192
BBED> set block 60
        BLOCK#          60
BBED> set count 64
        COUNT           64
BBED> map
 File: /opt/oracle/oradata/xifenfei/system01.dbf (0)
 Block: 60                                    Dba:0x00000000
------------------------------------------------------------
BBED-00400: invalid blocktype (128)
BBED> set block 61
        BLOCK#          61
BBED> map
 File: /opt/oracle/oradata/xifenfei/system01.dbf (0)
 Block: 61                                    Dba:0x00000000
------------------------------------------------------------
 KTB Data Block (Table/Cluster)
 struct kcbh, 20 bytes                      @0
 struct ktbbh, 72 bytes                     @20
 struct kdbh, 14 bytes                      @92
 struct kdbt[3], 12 bytes                   @106
 sb2 kdbr[2]                                @118
 ub1 freespace[7959]                        @122
 ub1 rowdata[107]                           @8081
 ub4 tailchk                                @8188
BBED> p kcbh
struct kcbh, 20 bytes                       @0
   ub1 type_kcbh                            @0        0x06
   ub1 frmt_kcbh                            @1        0xa2
   ub1 spare1_kcbh                          @2        0x00
   ub1 spare2_kcbh                          @3        0x00
   ub4 rdba_kcbh                            @4        0x0040003d
   ub4 bas_kcbh                             @8        0x0000235b
   ub2 wrp_kcbh                             @12       0x0000
   ub1 seq_kcbh                             @14       0x01
   ub1 flg_kcbh                             @15       0x04 (KCBHFCKV)
   ub2 chkval_kcbh                          @16       0x7a85
   ub2 spare3_kcbh                          @18       0x0000
BBED> set block 60
        BLOCK#          60
BBED> d
 File: /opt/oracle/oradata/xifenfei/system01.dbf (0)
 Block: 60               Offsets:    0 to   63           Dba:0x00000000
------------------------------------------------------------------------
 80000000 0040003c ebe04bc9 00050204 6faa0000 01000000 00000006 29b3a204
 00040ca0 00020200 00000000 000a0000 00000002 0080009b 00000100 80000000
 <32 bytes per line>
BBED> d block 61
 File: /opt/oracle/oradata/xifenfei/system01.dbf (0)
 Block: 61               Offsets:    0 to   63           Dba:0x00000000
------------------------------------------------------------------------
 06a20000 0040003d 0000235b 00000104 7a850000 01000000 00000006 00001837
 00001738 00020200 00000000 0007002e 00000002 00800075 00012300 80000000
 <32 bytes per line>
BBED> set block 60
        BLOCK#          60
BBED> m /x 06a2
 File: /opt/oracle/oradata/xifenfei/system01.dbf (0)
 Block: 60               Offsets:    0 to   63           Dba:0x00000000
------------------------------------------------------------------------
 06a20000 0040003c ebe04bc9 00050204 6faa0000 01000000 00000006 29b3a204
 00040ca0 00020200 00000000 000a0000 00000002 0080009b 00000100 80000000
 <32 bytes per line>
BBED> map
 File: /opt/oracle/oradata/xifenfei/system01.dbf (0)
 Block: 60                                    Dba:0x00000000
------------------------------------------------------------
 KTB Data Block (Table/Cluster)
 struct kcbh, 20 bytes                      @0
 struct ktbbh, 72 bytes                     @20
 struct kdbh, 14 bytes                      @92
 struct kdbt[3], 12 bytes                   @106
 sb2 kdbr[2]                                @118
 ub1 freespace[7598]                        @122
 ub1 rowdata[468]                           @7720
 ub4 tailchk                                @8188
BBED> sum apply
Check value for File 0, Block 60:
current = 0xe908, required = 0xe908
BBED> verify
DBVERIFY - Verification starting
FILE = /opt/oracle/oradata/xifenfei/system01.dbf
BLOCK = 60
DBVERIFY - Verification complete
Total Blocks Examined         : 1
Total Blocks Processed (Data) : 1
Total Blocks Failing   (Data) : 0
Total Blocks Processed (Index): 0
Total Blocks Failing   (Index): 0
Total Blocks Empty            : 0
Total Blocks Marked Corrupt   : 0
Total Blocks Influx           : 0
BBED>
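
The value reported by `sum apply` can be cross-checked by hand. The sketch below assumes the block check value is the XOR of all 16-bit words in the block; that is an assumption, not something the session states, but it agrees with the numbers shown. Replacing the first word 0x8000 with 0x06a2 turns the stored check value 0x6faa (visible at offset 16 of the dump) into the 0xe908 that BBED reports. The function names are illustrative only.

```python
def xor16(block: bytes) -> int:
    """XOR all 16-bit words of a block together (assumed checksum scheme)."""
    if len(block) % 2:
        raise ValueError("block length must be even")
    acc = 0
    for i in range(0, len(block), 2):
        acc ^= int.from_bytes(block[i:i + 2], "big")
    return acc


def checksum_after_patch(old_chkval: int, old_word: int, new_word: int) -> int:
    """Check value after one 16-bit word changes (XOR is its own inverse)."""
    return old_chkval ^ old_word ^ new_word


# Values from the BBED session: bytes 0-1 change from 0x8000 to 0x06a2,
# and the check value stored at offset 16 was 0x6faa.
print(hex(checksum_after_patch(0x6FAA, 0x8000, 0x06A2)))  # 0xe908
```

Under this assumption, any edit to a block's bytes has to be followed by `sum apply`, otherwise the next read fails the check value test.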

Attempting to start the database

Sat Nov 22 15:51:33 2014
alter database open
Sat Nov 22 15:51:34 2014
Thread 1 opened at log sequence 7
  Current log# 8 seq# 7 mem# 0: /data2/oradata/redo0802.log
Successful open of redo thread 1
Sat Nov 22 15:51:34 2014
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Sat Nov 22 15:51:34 2014
SMON: enabling cache recovery
SMON: enabling tx recovery
Sat Nov 22 15:51:34 2014
Database Characterset is ZHS16CGB231280
Hex dump of (file 1, block 498) in trace file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_2818.trc
Corrupt block relative dba: 0x004001f2 (file 1, block 498)
Bad check value found during buffer read
Data in bad block:
 type: 6 format: 2 rdba: 0x004001f2
 last change scn: 0x0007.49499ca1 seq: 0x1 flg: 0x06
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x9ca10601
 check value in block header: 0xe458
 computed block checksum: 0x9720
Reread of rdba: 0x004001f2 (file 1, block 498) found same corrupted data
Sat Nov 22 15:51:35 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_smon_2803.trc:
ORA-00600: internal error code, arguments: [4000], [12], [], [], [], [], [], []
replication_dependency_tracking turned off (no async multimaster replication found)
Starting background process QMNC
QMNC started with pid=18, OS id=3000
Sat Nov 22 15:51:36 2014
Completed: alter database open
Sat Nov 22 15:51:36 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_3010.trc:
ORA-00600: internal error code, arguments: [6807], [AUDSES$], [144], [], [], [], [], []
Sat Nov 22 15:51:37 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_mmon_2809.trc:
ORA-00600: internal error code, arguments: [6807], [WRI$_ALERT_SEQUENCE], [8783], [], [], [], [], []
Sat Nov 22 15:51:37 2014
Non-fatal internal error happenned while SMON was doing non-existent object cleanup.
SMON encountered 1 out of maximum 100 non-fatal internal errors.
Sat Nov 22 15:51:38 2014
ORA-600 encountered when generating server alert SMG-3000
Sat Nov 22 15:51:38 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_mmon_2809.trc:
ORA-00600: internal error code, arguments: [ktcpoptx_0], [0x772705E60], [], [], [], [], [], []

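The main errors are ORA-600 [4000] and ORA-600 [6807]. ORA-600 [6807] is fairly clearly caused by the corrupt seq$ block, which leaves the AUDSES$ sequence in an inconsistent state. ORA-600 [4000] most likely points to a rollback segment problem, so the next step is to examine the rollback segments.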
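
When an alert log contains many internal errors, it helps to group them by their first argument before deciding which one to chase. A minimal sketch, assuming the ORA-00600 lines have the format seen in the log above; `ora600_args` and `summarize` are hypothetical helper names, not Oracle tooling.

```python
import re

ORA600 = re.compile(r"ORA-00600: internal error code, arguments: (.*)")


def ora600_args(line):
    """Return the non-empty bracketed arguments of an ORA-00600 line, or None."""
    match = ORA600.search(line)
    if match is None:
        return None
    return [arg for arg in re.findall(r"\[([^\]]*)\]", match.group(1)) if arg]


def summarize(lines):
    """Count ORA-00600 occurrences keyed by their first argument."""
    counts = {}
    for line in lines:
        args = ora600_args(line)
        if args:
            counts[args[0]] = counts.get(args[0], 0) + 1
    return counts


sample = [
    "ORA-00600: internal error code, arguments: [4000], [12], [], [], [], [], [], []",
    "ORA-00600: internal error code, arguments: [6807], [AUDSES$], [144], [], [], [], [], []",
    "ORA-00600: internal error code, arguments: [6807], [WRI$_ALERT_SEQUENCE], [8783], [], [], [], [], []",
    "ORA-00600: internal error code, arguments: [ktcpoptx_0], [0x772705E60], [], [], [], [], [], []",
]
print(summarize(sample))
```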

SQL> select name,ts#,status$ from undo$;
NAME                                  TS#    STATUS$
------------------------------ ---------- ----------
SYSTEM                                  0          2
_SYSSMU1$                               1          2
_SYSSMU2$                               1          2
_SYSSMU3$                               1          2
…………
_SYSSMU168$                             1          2
_SYSSMU169$                             1          2

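This is highly abnormal: once the database is open, the SYSTEM rollback segment should never be in STATUS$=2 (OFFLINE), and having every other rollback segment OFFLINE as well is equally abnormal. In addition, an attempt to drop the undo tablespace fails with ORA-01561, and SYSTEM is missing from dba_rollback_segs (that query output was not saved).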
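
The reasoning above can be written down as a small sanity check over rows fetched from undo$. This is only a sketch: the post itself states only that 2 means OFFLINE, so the rest of the `UNDO_STATUS` mapping is an assumption based on how DBA_ROLLBACK_SEGS is commonly described as decoding status$, and `undo_anomalies` is a hypothetical helper.

```python
# Assumed decoding of undo$.status$ (only 2 = OFFLINE comes from the post).
UNDO_STATUS = {
    1: "INVALID",
    2: "OFFLINE",
    3: "ONLINE",
    4: "UNDEFINED",
    5: "NEEDS RECOVERY",
    6: "PARTLY AVAILABLE",
}


def undo_anomalies(rows):
    """Return findings for (name, ts#, status$) rows read from an open database."""
    findings = []
    for name, _ts, status in rows:
        if name == "SYSTEM" and status != 3:
            label = UNDO_STATUS.get(status, "UNKNOWN")
            findings.append(f"SYSTEM rollback segment is {label}, expected ONLINE")
    if rows and all(status != 3 for _name, _ts, status in rows):
        findings.append("no rollback segment is ONLINE")
    return findings


rows = [("SYSTEM", 0, 2), ("_SYSSMU1$", 1, 2), ("_SYSSMU2$", 1, 2)]
for finding in undo_anomalies(rows):
    print(finding)
```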

SQL> drop tablespace undotbs1 including contents;
drop tablespace undotbs1 including contents
*
ERROR at line 1:
ORA-01561: failed to remove all objects in the tablespace specified

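All of this strongly suggests that the earlier BBED work modified undo$ and related base tables, leaving the undo metadata in SYSTEM in a confused state. When this was reported to the customer, they recalled that the company that had attempted the recovery the previous day backed up system01.dbf before running BBED. That backup was a lifesaver. BBED in the hands of someone who does not understand it is genuinely frightening.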

Checking the backup file with dbv

DBVERIFY - Verification starting : FILE = /data3/backup/system01.dbf_bak
Page 60 is marked corrupt
Corrupt block relative dba: 0x0040003c (file 1, block 60)
Bad header found during dbv:
Data in bad block:
 type: 128 format: 0 rdba: 0x0040003c
 last change scn: 0x0005.ebe04bc9 seq: 0x2 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x4bc90602
 check value in block header: 0x6faa
 computed block checksum: 0x0
Block Checking: DBA = 4194802, Block Type = KTB-managed data block
data header at 0x1002ef05c
kdbchk: row locked by non-existent transaction
        table=0   slot=4
        lockid=1   ktbbhitc=2
Page 498 failed with check code 6101
DBVERIFY - Verification complete
Total Pages Examined         : 786432
Total Pages Processed (Data) : 201131
Total Pages Failing   (Data) : 1
Total Pages Processed (Index): 221394
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 60265
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 303641
Total Pages Marked Corrupt   : 1
Total Pages Influx           : 0
Highest block SCN            : 1229823477 (7.1229823477)
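
The identifiers in the dbv output can be decoded by hand. The sketch below assumes the usual smallfile layout of a relative DBA (upper 10 bits for the relative file number, lower 22 bits for the block number) and a big-endian block header, which matches the byte order seen in the BBED dump of block 60. The helper names are illustrative and are not part of any Oracle tool.

```python
import struct

# Assumption: big-endian layout, matching the byte order seen in the BBED dump.
KCBH = struct.Struct(">BBBBIIHBBHH")
KCBH_FIELDS = ("type", "frmt", "spare1", "spare2", "rdba",
               "bas", "wrp", "seq", "flg", "chkval", "spare3")


def decode_rdba(rdba):
    """Split a 32-bit relative DBA into (relative file#, block#)."""
    return rdba >> 22, rdba & 0x3FFFFF


def parse_kcbh(raw):
    """Decode the first 20 bytes of a block into a dict of kcbh fields."""
    return dict(zip(KCBH_FIELDS, KCBH.unpack(raw[:KCBH.size])))


def scn_number(wrap, base):
    """Combine SCN wrap and base into a single integer."""
    return (wrap << 32) | base


# First 20 bytes of block 60 as dumped by BBED before the repair.
header = parse_kcbh(bytes.fromhex("800000000040003cebe04bc9000502046faa0000"))
print(header["type"], decode_rdba(header["rdba"]))  # 128 (1, 60)
print(decode_rdba(4194802))                         # (1, 498)
```

Decoded this way, the header fields line up with the "Data in bad block" lines from dbv: type 128, format 0, SCN 0x0005.ebe04bc9, seq 0x2, flg 0x04 and check value 0x6faa.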

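Good news: there is only one physically corrupt block and one logically corrupt block. The fix for the physically corrupt block 60 is already known, and the logically corrupt block can be worked around by setting a hidden parameter. The block in question was then repaired with BBED (same steps as above).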

Starting the database again

dd if=/opt/oracle/oradata/xifenfei/system01.dbf bs=8192 count=2 of=/tmp/system01.2
dd if=/tmp/system01.2 of=/data3/backup/system01.dbf_bak bs=8192 count=2 conv=notrunc
Sat Nov 22 17:52:50 2014
Database mounted in Exclusive Mode
Completed: ALTER DATABASE   MOUNT
Sat Nov 22 17:53:38 2014
alter database rename file '/opt/oracle/oradata/xifenfei/system01.dbf' to '/data3/backup/system01.dbf_bak'
Sat Nov 22 17:53:39 2014
Completed: alter database rename file '/opt/oracle/oradata/xifenfei/system01.dbf' to '/data3/backup/system01.dbf_bak'
Sat Nov 22 17:55:43 2014
alter database open
Sat Nov 22 17:55:48 2014
LGWR: STARTING ARCH PROCESSES
ARC0 started with pid=18, OS id=15858
Sat Nov 22 17:56:10 2014
ARC0: Archival started
ARC1: Archival started
LGWR: STARTING ARCH PROCESSES COMPLETE
ARC1 started with pid=17, OS id=15879
Sat Nov 22 17:56:19 2014
Thread 1 opened at log sequence 7
  Current log# 8 seq# 7 mem# 0: /data2/oradata/redo0802.log
Successful open of redo thread 1
Sat Nov 22 17:56:19 2014
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Sat Nov 22 17:56:19 2014
SMON: enabling cache recovery
SMON: enabling tx recovery
Sat Nov 22 17:56:20 2014
ARC1: STARTING ARCH PROCESSES
Sat Nov 22 17:56:20 2014
ARC0: Becoming the 'no FAL' ARCH
ARC0: Becoming the 'no SRL' ARCH
Sat Nov 22 17:56:22 2014
Database Characterset is ZHS16CGB231280
replication_dependency_tracking turned off (no async multimaster replication found)
Starting background process QMNC
Sat Nov 22 17:56:33 2014
ARC2: Archival started
ARC1: STARTING ARCH PROCESSES COMPLETE
ARC1: Becoming the heartbeat ARCH
ARC2 started with pid=23, OS id=15928
QMNC started with pid=25, OS id=15996
Sat Nov 22 17:57:11 2014
Completed: alter database open
Sat Nov 22 17:57:18 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_16010.trc:
ORA-00600: internal error code, arguments: [4511], [], [], [], [], [], [], []
Sat Nov 22 17:57:26 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_16012.trc:
ORA-00600: internal error code, arguments: [4511], [], [], [], [], [], [], []
Sat Nov 22 17:58:17 2014
Starting background process EMN0
Sat Nov 22 18:00:03 2014
Shutting down instance: further logons disabled
EMN0 started with pid=71, OS id=16421
Sat Nov 22 18:00:12 2014
SMON: Restarting fast_start parallel rollback
Sat Nov 22 18:00:23 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_p000_15951.trc:
ORA-00600: internal error code, arguments: [4198], [9], [], [], [], [], [], []
Sat Nov 22 18:00:24 2014
Stopping background process CJQ0
Sat Nov 22 18:00:24 2014
Stopping background process QMNC
Sat Nov 22 18:00:27 2014
Doing block recovery for file 2 block 41
Block recovery from logseq 7, block 180883 to scn 214748389244
Sat Nov 22 18:00:27 2014
Recovery of Online Redo Log: Thread 1 Group 8 Seq 7 Reading mem 0
  Mem# 0 errs 0: /data2/oradata/redo0802.log
Block recovery stopped at EOT rba 7.180988.16
Block recovery completed at rba 7.180988.16, scn 50.24441
Sat Nov 22 18:00:32 2014
Stopping background process MMNL
Sat Nov 22 18:00:38 2014
Stopping background process MMON
Sat Nov 22 18:00:41 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_smon_15395.trc:
ORA-00600: internal error code, arguments: [4137], [], [], [], [], [], [], []
Sat Nov 22 18:00:42 2014
ORACLE Instance xifenfei (pid = 9) - Error 600 encountered while recovering transaction (3, 4).
Sat Nov 22 18:00:42 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_smon_15395.trc:
ORA-00600: internal error code, arguments: [4137], [], [], [], [], [], [], []

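These are all fairly routine errors, and a query against undo$ now returns normal results. A new undo tablespace was created and the old one dropped, after which the alert log showed no further errors. The recovery was complete at this point; the customer was advised to rebuild the database via export/import.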
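
The two `dd` commands at the start of this restart step copy the first two 8 KB blocks of the current system01.dbf over the beginning of the backup copy, and `conv=notrunc` keeps the rest of the backup intact. The post does not state the reason; presumably it is so that the restored file carries the current file header. A minimal Python sketch of the same operation follows. `copy_leading_blocks` is a hypothetical helper, and the demo runs on throwaway temp files rather than real datafiles.

```python
import os
import tempfile


def copy_leading_blocks(src, dst, block_size=8192, count=2):
    """Overwrite the first `count` blocks of dst with those of src, in place."""
    with open(src, "rb") as fin:
        data = fin.read(block_size * count)
    # "r+b" updates the existing file without truncating it (like conv=notrunc).
    with open(dst, "r+b") as fout:
        fout.seek(0)
        fout.write(data)
    return len(data)


# Demo on throwaway files with a tiny block size.
workdir = tempfile.mkdtemp()
current = os.path.join(workdir, "current.dbf")
backup = os.path.join(workdir, "backup.dbf")
with open(current, "wb") as f:
    f.write(b"C" * 32)
with open(backup, "wb") as f:
    f.write(b"B" * 64)
copy_leading_blocks(current, backup, block_size=8, count=2)
with open(backup, "rb") as f:
    print(f.read())
```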

Testing some details of the In-Memory Option: behavior when inmemory_size is insufficient

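This article tests several IMDB features in detail:
1. Compression levels and compression ratios (actual ratios depend on the data; this experiment is for reference only)
2. What happens to objects already in the IM store, and to newly added objects, when IM space runs out
3. A simple test of PRIORITY levels
Basic database configuration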

SQL> select * from v$version;
BANNER                                                                               CON_ID
-------------------------------------------------------------------------------- ----------
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production              0
PL/SQL Release 12.1.0.2.0 - Production                                                    0
CORE    12.1.0.2.0      Production                                                        0
TNS for 64-bit Windows: Version 12.1.0.2.0 - Production                                   0
NLSRTL Version 12.1.0.2.0 - Production                                                    0
SQL> show parameter inmemory;
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
inmemory_clause_default              string
inmemory_force                       string      DEFAULT
inmemory_max_populate_servers        integer     2
inmemory_query                       string      ENABLE
inmemory_size                        big integer 100M
inmemory_trickle_repopulate_servers_ integer     1
percent
optimizer_inmemory_aware             boolean     TRUE

The database version is 12.1.0.2, and inmemory_size is set to 100M.

Preparing the test environment

SQL> create tablespace inmemory datafile 'D:\APP\FFCHENG\ORADATA\XFF\PDB\in_memory01.dbf'
   2  size 100m autoextend on next 4m maxsize 10g;
Tablespace created.
SQL> create user chf identified by xifenfei;
User created.
SQL> grant dba to chf;
Grant succeeded.
SQL> alter user chf default tablespace inmemory;
User altered.
SQL> create table chf.t_inmemory1 as select * from dba_objects;
Table created.

The test tablespace, user, and test table are now in place.

Testing compression levels

SQL> alter table chf.t_inmemory1 inmemory NO MEMCOMPRESS;
Table altered.
SQL>  select segment_name,bytes,inmemory_size,INMEMORY_PRIORITY,INMEMORY_COMPRESSION
  2  from v$im_segments where segment_name = 'T_INMEMORY1';
no rows selected
SQL> SELECT COUNT(*) FROM chf.t_inmemory1;
  COUNT(*)
----------
     91040
SQL>  select segment_name,bytes,inmemory_size,INMEMORY_PRIORITY,INMEMORY_COMPRESSION
  2  from v$im_segments where segment_name = 'T_INMEMORY1';
SEGMENT_NAME
--------------------------------------------------------------------------------
     BYTES INMEMORY_SIZE INMEMORY INMEMORY_COMPRESS
---------- ------------- -------- -----------------
T_INMEMORY1
  13631488      10616832 NONE     NO MEMCOMPRESS
-- NO MEMCOMPRESS: very little size reduction; essentially uncompressed
SQL> alter table chf.t_inmemory1 no inmemory ;
Table altered.
SQL>  select segment_name,bytes,inmemory_size,INMEMORY_PRIORITY,INMEMORY_COMPRESSION
  2  from v$im_segments where segment_name = 'T_INMEMORY1';
no rows selected
SQL>  alter table chf.t_inmemory1 inmemory MEMCOMPRESS FOR DML
  2  ;
Table altered.
SQL> SELECT COUNT(*) FROM chf.t_inmemory1;
  COUNT(*)
----------
     91040
SQL>  select segment_name,bytes,inmemory_size,INMEMORY_PRIORITY,INMEMORY_COMPRESSION
  2  from v$im_segments where segment_name = 'T_INMEMORY1';
SEGMENT_NAME
--------------------------------------------------------------------------------
     BYTES INMEMORY_SIZE INMEMORY INMEMORY_COMPRESS
---------- ------------- -------- -----------------
T_INMEMORY1
  13631488      10616832 NONE     FOR DML
-- FOR DML: very little size reduction; effectively the same as NO MEMCOMPRESS in this test
SQL> alter table chf.t_inmemory1 no inmemory ;
Table altered.
SQL> alter table chf.t_inmemory1 inmemory MEMCOMPRESS FOR QUERY LOW;
Table altered.
SQL> SELECT COUNT(*) FROM chf.t_inmemory1;
  COUNT(*)
----------
     91040
SQL>  select segment_name,bytes,inmemory_size,INMEMORY_PRIORITY,INMEMORY_COMPRESSION
  2  from v$im_segments where segment_name = 'T_INMEMORY1';
SEGMENT_NAME
--------------------------------------------------------------------------------
     BYTES INMEMORY_SIZE INMEMORY INMEMORY_COMPRESS
---------- ------------- -------- -----------------
T_INMEMORY1
  13631488       4325376 NONE     FOR QUERY LOW
-- FOR QUERY LOW is the default compression level; the ratio here is about 3:1, depending on the data
SQL> alter table chf.t_inmemory1 no inmemory ;
Table altered.
SQL> alter table chf.t_inmemory1 inmemory MEMCOMPRESS FOR QUERY HIGH;
Table altered.
SQL>  select segment_name,bytes,inmemory_size,INMEMORY_PRIORITY,INMEMORY_COMPRESSION
  2  from v$im_segments where segment_name = 'T_INMEMORY1';
no rows selected
SQL> SELECT COUNT(*) FROM chf.t_inmemory1;
  COUNT(*)
----------
     91040
SQL>  select segment_name,bytes,inmemory_size,INMEMORY_PRIORITY,INMEMORY_COMPRESSION
  2  from v$im_segments where segment_name = 'T_INMEMORY1';
SEGMENT_NAME
--------------------------------------------------------------------------------
     BYTES INMEMORY_SIZE INMEMORY INMEMORY_COMPRESS
---------- ------------- -------- -----------------
T_INMEMORY1
  13631488       3276800 NONE     FOR QUERY HIGH
-- FOR QUERY HIGH: the ratio here is about 4:1, depending on the data
SQL> alter table chf.t_inmemory1 no inmemory ;
Table altered.
SQL> alter table chf.t_inmemory1 inmemory MEMCOMPRESS FOR CAPACITY LOW;
Table altered.
SQL> SELECT COUNT(*) FROM chf.t_inmemory1;
  COUNT(*)
----------
     91040
SQL>  select segment_name,bytes,inmemory_size,INMEMORY_PRIORITY,INMEMORY_COMPRESSION
  2  from v$im_segments where segment_name = 'T_INMEMORY1';
SEGMENT_NAME
--------------------------------------------------------------------------------
     BYTES INMEMORY_SIZE INMEMORY INMEMORY_COMPRESS
---------- ------------- -------- -----------------
T_INMEMORY1
  13631488       2228224 NONE     FOR CAPACITY LOW
-- FOR CAPACITY LOW: the ratio here is about 6:1, depending on the data
SQL> alter table chf.t_inmemory1 no inmemory ;
Table altered.
SQL> alter table chf.t_inmemory1 inmemory MEMCOMPRESS FOR CAPACITY HIGH;
Table altered.
SQL> SELECT COUNT(*) FROM chf.t_inmemory1;
  COUNT(*)
----------
     91040
SQL>  select segment_name,bytes,inmemory_size,INMEMORY_PRIORITY,INMEMORY_COMPRESSION
  2  from v$im_segments where segment_name = 'T_INMEMORY1';
SEGMENT_NAME
--------------------------------------------------------------------------------
     BYTES INMEMORY_SIZE INMEMORY INMEMORY_COMPRESS
---------- ------------- -------- -----------------
T_INMEMORY1
  13631488        131072 NONE     FOR CAPACITY HIGH
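-- FOR CAPACITY HIGH: the ratio was noted as about 10:1, depending on the data. (The INMEMORY_SIZE shown, 131072,
   would imply about 104:1, so this snapshot may have been taken before population finished.)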

As this shows, compression behaves as Oracle describes: the higher the level, the better the compression.
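
The ratios quoted in the comments can be recomputed from the BYTES and INMEMORY_SIZE columns above. A small sketch using only the figures from the transcript; note that it compares the on-disk segment size with the in-memory size, which is the same comparison the comments make, and that it is not an official Oracle metric.

```python
ON_DISK_BYTES = 13631488  # BYTES column, identical for every test above

# INMEMORY_SIZE reported by v$im_segments for each compression level.
INMEMORY_SIZE = {
    "NO MEMCOMPRESS": 10616832,
    "FOR DML": 10616832,
    "FOR QUERY LOW": 4325376,
    "FOR QUERY HIGH": 3276800,
    "FOR CAPACITY LOW": 2228224,
    "FOR CAPACITY HIGH": 131072,
}


def ratio(level):
    """On-disk bytes divided by in-memory bytes for one compression level."""
    return ON_DISK_BYTES / INMEMORY_SIZE[level]


for level in INMEMORY_SIZE:
    print(f"{level:18s} {ratio(level):6.1f}:1")
```

The FOR CAPACITY HIGH figure stands out: 131072 bytes gives about 104:1, far above the other levels, so that particular snapshot is worth re-measuring once population has finished.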

Testing what happens when inmemory_size is too small

SQL> alter table chf.t_inmemory1 inmemory no MEMCOMPRESS;
Table altered.
SQL> SELECT COUNT(*) FROM chf.t_inmemory1;
  COUNT(*)
----------
     91040
SQL>  select segment_name,bytes,inmemory_size,INMEMORY_PRIORITY,INMEMORY_COMPRESSION
  2  from v$im_segments where segment_name = 'T_INMEMORY1';
SEGMENT_NAME
--------------------------------------------------------------------------------
     BYTES INMEMORY_SIZE INMEMORY INMEMORY_COMPRESS
---------- ------------- -------- -----------------
T_INMEMORY1
  13631488      10616832 NONE     NO MEMCOMPRESS
-- After rows are inserted via DML, v$im_segments.inmemory_size does not change until the table is queried again (is this a bug or by design?)
SQL> insert into chf.t_inmemory1 select * from chf.t_inmemory1;
91040 rows created.
SQL>  select segment_name,bytes,inmemory_size,INMEMORY_PRIORITY,INMEMORY_COMPRESSION
  2  from v$im_segments where segment_name = 'T_INMEMORY1';
SEGMENT_NAME
--------------------------------------------------------------------------------
     BYTES INMEMORY_SIZE INMEMORY INMEMORY_COMPRESS
---------- ------------- -------- -----------------
T_INMEMORY1
  26214400      10616832 NONE     NO MEMCOMPRESS
SQL> commit;
Commit complete.
SQL>  select segment_name,bytes,inmemory_size,INMEMORY_PRIORITY,INMEMORY_COMPRESSION
  2  from v$im_segments where segment_name = 'T_INMEMORY1';
SEGMENT_NAME
--------------------------------------------------------------------------------
     BYTES INMEMORY_SIZE INMEMORY INMEMORY_COMPRESS
---------- ------------- -------- -----------------
T_INMEMORY1
  26214400      10616832 NONE     NO MEMCOMPRESS
SQL> SELECT COUNT(*) FROM chf.t_inmemory1;
  COUNT(*)
----------
    182080
SQL>  select segment_name,bytes,inmemory_size,INMEMORY_PRIORITY,INMEMORY_COMPRESSION
  2  from v$im_segments where segment_name = 'T_INMEMORY1';
SEGMENT_NAME
--------------------------------------------------------------------------------
     BYTES INMEMORY_SIZE INMEMORY INMEMORY_COMPRESS
---------- ------------- -------- -----------------
T_INMEMORY1
  26214400      22282240 NONE     NO MEMCOMPRESS
SQL> insert into chf.t_inmemory1 select * from chf.t_inmemory1;
182080 rows created.
SQL> commit;
Commit complete.
SQL>  select segment_name,bytes,inmemory_size,INMEMORY_PRIORITY,INMEMORY_COMPRESSION
  2  from v$im_segments where segment_name = 'T_INMEMORY1';
SEGMENT_NAME
--------------------------------------------------------------------------------
     BYTES INMEMORY_SIZE INMEMORY INMEMORY_COMPRESS
---------- ------------- -------- -----------------
T_INMEMORY1
  51380224      22282240 NONE     NO MEMCOMPRESS
-- The 10046 trace shows that although v$im_segments.inmemory_size is not updated promptly, the IM column store is in effect
SQL> oradebug setmypid
Statement processed.
SQL> alter session set db_file_multiblocK_read_count=1;
Session altered.
SQL> oradebug EVENT 10046 TRACE NAME CONTEXT FOREVER, LEVEL 12
Statement processed.
SQL> oradebug TRACEFILE_NAME
D:\APP\FFCHENG\diag\rdbms\xff\xff\trace\xff_ora_7604.trc
SQL> SELECT COUNT(object_id) FROM chf.t_inmemory1;
COUNT(OBJECT_ID)
----------------
          364156
SQL> oradebug EVENT 10046 trace name context off
Statement processed.
PARSING IN CURSOR #455134016 len=44 dep=0 uid=0 oct=3 lid=0 tim=126773093621 hv=1133975269
 ad='7ff07339500' sqlid='5909ukj1tf5r5'
SELECT COUNT(object_id) FROM chf.t_inmemory1
END OF STMT
PARSE #455134016:c=15600,e=3912,p=0,cr=2,cu=0,mis=1,r=0,dep=0,og=1,plh=3154396630,tim=126773093620
WAIT #455134016: nam='Disk file operations I/O' ela= 154 FileOperation=8 fileno=0 filetype=8 obj#=-1 tim=126773093926
EXEC #455134016:c=0,e=25,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=3154396630,tim=126773094005
WAIT #455134016: nam='SQL*Net message to client' ela= 2 driver id=1111838976 #bytes=1 p3=0 obj#=-1 tim=126773094044
FETCH #455134016:c=0,e=13751,p=0,cr=3110,cu=1,mis=0,r=1,dep=0,og=1,plh=3154396630,tim=126773107829
STAT #455134016 id=1 cnt=1 pid=0 pos=1 obj=0 op='SORT AGGREGATE (cr=3110 pr=0 pw=0 time=13751 us)'
STAT #455134016 id=2 cnt=364160 pid=1 pos=1 obj=91914 op='TABLE ACCESS INMEMORY FULL T_INMEMORY1 (cr=3110 pr=0
  pw=0 time=5386 us cost=17 size=455200 card=91040)'
WAIT #455134016: nam='SQL*Net message from client' ela= 116 driver id=1111838976 #bytes=1 p3=0 obj#=-1 tim=126773108164
FETCH #455134016:c=0,e=1,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=0,plh=3154396630,tim=126773108215
WAIT #455134016: nam='SQL*Net message to client' ela= 0 driver id=1111838976 #bytes=1 p3=0 obj#=-1 tim=126773108246
SQL>  select segment_name,bytes,inmemory_size,INMEMORY_PRIORITY,INMEMORY_COMPRESSION
  2  from v$im_segments where segment_name = 'T_INMEMORY1';
SEGMENT_NAME
--------------------------------------------------------------------------------
     BYTES INMEMORY_SIZE INMEMORY INMEMORY_COMPRESS
---------- ------------- -------- -----------------
T_INMEMORY1
  51380224      43384832 NONE     NO MEMCOMPRESS
SQL> SELECT COUNT(*) FROM chf.t_inmemory1;
  COUNT(*)
----------
    364160
SQL>  select segment_name,bytes,inmemory_size,INMEMORY_PRIORITY,INMEMORY_COMPRESSION
  2  from v$im_segments where segment_name = 'T_INMEMORY1';
SEGMENT_NAME
--------------------------------------------------------------------------------
     BYTES INMEMORY_SIZE INMEMORY INMEMORY_COMPRESS
---------- ------------- -------- -----------------
T_INMEMORY1
  51380224      43384832 NONE     NO MEMCOMPRESS
SQL> insert into chf.t_inmemory1 select * from chf.t_inmemory1;
364160 rows created.
SQL>  select segment_name,bytes,inmemory_size,INMEMORY_PRIORITY,INMEMORY_COMPRESSION
  2  from v$im_segments where segment_name = 'T_INMEMORY1';
SEGMENT_NAME
--------------------------------------------------------------------------------
     BYTES INMEMORY_SIZE INMEMORY INMEMORY_COMPRESS
---------- ------------- -------- -----------------
T_INMEMORY1
 109051904      43384832 NONE     NO MEMCOMPRESS
SQL> commit;
Commit complete.
SQL> SELECT COUNT(*) FROM chf.t_inmemory1;
  COUNT(*)
----------
    728320
SQL>  select segment_name,bytes,inmemory_size,INMEMORY_PRIORITY,INMEMORY_COMPRESSION
  2  from v$im_segments where segment_name = 'T_INMEMORY1';
SEGMENT_NAME
--------------------------------------------------------------------------------
     BYTES INMEMORY_SIZE INMEMORY INMEMORY_COMPRESS
---------- ------------- -------- -----------------
T_INMEMORY1
 109051904      43384832 NONE     NO MEMCOMPRESS
SQL>  select segment_name,bytes,inmemory_size,INMEMORY_PRIORITY,INMEMORY_COMPRESSION
  2  from v$im_segments where segment_name = 'T_INMEMORY1';
SEGMENT_NAME
--------------------------------------------------------------------------------
     BYTES INMEMORY_SIZE INMEMORY INMEMORY_COMPRESS
---------- ------------- -------- -----------------
T_INMEMORY1
 109051904      71892992 NONE     NO MEMCOMPRESS
-- INMEMORY_SIZE has already reached 71892992; after one more insert, the 100M IM area will certainly not be enough
SQL> insert into chf.t_inmemory1 select * from chf.t_inmemory1;
728320 rows created.
SQL> commit;
Commit complete.
SQL> select count(object_id) from chf.t_inmemory1;
COUNT(OBJECT_ID)
----------------
         1456624
SQL>  select segment_name,bytes,inmemory_size,INMEMORY_PRIORITY,INMEMORY_COMPRESSION
  2  from v$im_segments where segment_name = 'T_INMEMORY1';
SEGMENT_NAME
--------------------------------------------------------------------------------
     BYTES INMEMORY_SIZE INMEMORY INMEMORY_COMPRESS
---------- ------------- -------- -----------------
T_INMEMORY1
 201326592      63438848 NONE     NO MEMCOMPRESS
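-- INMEMORY_SIZE is now 63438848, less than the 71892992 seen before the insert, which shows that something is wrong
-- with the IM area (for example, it is full) and that the value shown in v$im_segments is inaccurate
-- Test the effect of flushing the buffer cache on IM
SQL> alter system flush buffer_cache;
System altered.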
SQL>  select segment_name,bytes,inmemory_size,INMEMORY_PRIORITY,INMEMORY_COMPRESSION
  2  from v$im_segments where segment_name = 'T_INMEMORY1';
SEGMENT_NAME
--------------------------------------------------------------------------------
     BYTES INMEMORY_SIZE INMEMORY INMEMORY_COMPRESS
---------- ------------- -------- -----------------
T_INMEMORY1
 201326592      63438848 NONE     NO MEMCOMPRESS
-- The result shows no effect
autotrace result
SQL> set autot trace exp stat
SQL> set lines 120
SQL> pages 1000
SQL> set pages 1000
SQL> select count(*) from chf.t_inmemory1;
Execution Plan
----------------------------------------------------------
Plan hash value: 3154396630
-----------------------------------------------------------------------------------
| Id  | Operation                   | Name        | Rows  | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |             |     1 |    16   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE             |             |     1 |            |          |
|   2 |   TABLE ACCESS INMEMORY FULL| T_INMEMORY1 | 91040 |    16   (0)| 00:00:01 |
-----------------------------------------------------------------------------------
Statistics
----------------------------------------------------------
          5  recursive calls
          0  db block gets
      16693  consistent gets
      16690  physical reads
          0  redo size
        546  bytes sent via SQL*Net to client
        552  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed
10046 result
SQL ID: 1b61dgunxftdx Plan Hash: 3154396630
select count(object_id)
from
 chf.t_inmemory1
call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          2          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      1.26       4.14      16689      22446          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      1.26       4.14      16689      22448          0           1
Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: SYS
Number of plan statistics captured: 1
Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         1          1          1  SORT AGGREGATE (cr=22446 pr=16689 pw=0 time=4144536 us)
   1456640    1456640    1456640   TABLE ACCESS INMEMORY FULL T_INMEMORY1 (cr=22446 pr=16689 pw=0
                                         time=2560999 us cost=17 size=455200 card=91040)
Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  Disk file operations I/O                        1        0.00          0.00
  SQL*Net message to client                       2        0.00          0.00
  db file sequential read                     16689        0.03          3.05
  SQL*Net message from client                     2        5.40          5.40
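-- Both autotrace and 10046 show that, in this test, when the IM area is too small the database does not use the IM feature for the table, not even partially
-- Create a new object to be placed in IM
SQL> create table chf.t_inmemory2 as select * from dba_objects;
Table created.
SQL> alter table chf.t_inmemory2 inmemory;
Table altered.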
SQL> select count(*) from chf.t_inmemory2;
  COUNT(*)
----------
     91041
SQL>  select segment_name,bytes,inmemory_size,INMEMORY_PRIORITY,INMEMORY_COMPRESSION
  2  from v$im_segments where segment_name = 'T_INMEMORY1';
SEGMENT_NAME
--------------------------------------------------------------------------------
     BYTES INMEMORY_SIZE INMEMORY INMEMORY_COMPRESS
---------- ------------- -------- -----------------
T_INMEMORY1
 201326592      63438848 NONE     NO MEMCOMPRESS
SQL> select count(*) from chf.t_inmemory2;
Execution Plan
----------------------------------------------------------
Plan hash value: 2042227318
-----------------------------------------------------------------------------------
| Id  | Operation                   | Name        | Rows  | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |             |     1 |    16   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE             |             |     1 |            |          |
|   2 |   TABLE ACCESS INMEMORY FULL| T_INMEMORY2 | 91041 |    16   (0)| 00:00:01 |
-----------------------------------------------------------------------------------
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       1532  consistent gets
       1530  physical reads
          0  redo size
        545  bytes sent via SQL*Net to client
        552  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed
SQL> alter table chf.t_inmemory1 no inmemory;
Table altered.
SQL>  select segment_name,bytes,inmemory_size,INMEMORY_PRIORITY,INMEMORY_COMPRESSION
   2  from v$im_segments where segment_name = 'T_INMEMORY2';
no rows selected
SQL> set autot traceonly exp stat
SQL> select count(*) from chf.t_inmemory2;
Execution Plan
----------------------------------------------------------
Plan hash value: 2042227318
-----------------------------------------------------------------------------------
| Id  | Operation                   | Name        | Rows  | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |             |     1 |    16   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE             |             |     1 |            |          |
|   2 |   TABLE ACCESS INMEMORY FULL| T_INMEMORY2 | 91041 |    16   (0)| 00:00:01 |
-----------------------------------------------------------------------------------
Statistics
----------------------------------------------------------
         57  recursive calls
          0  db block gets
       1565  consistent gets
       1532  physical reads
          0  redo size
        545  bytes sent via SQL*Net to client
        552  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          5  sorts (memory)
          0  sorts (disk)
          1  rows processed
SQL> set autot off
SQL>  select segment_name,bytes,inmemory_size,INMEMORY_PRIORITY,INMEMORY_COMPRESSION
   2  from v$im_segments where segment_name = 'T_INMEMORY2';
SEGMENT_NAME
-----------------------------------------------------------------------------------------
     BYTES INMEMORY_SIZE INMEMORY INMEMORY_COMPRESS
---------- ------------- -------- -----------------
T_INMEMORY2
  13631488       4325376 NONE     FOR QUERY LOW
SQL> set autot traceonly exp stat
SQL> select count(*) from chf.t_inmemory2;
Execution Plan
----------------------------------------------------------
Plan hash value: 2042227318
-----------------------------------------------------------------------------------
| Id  | Operation                   | Name        | Rows  | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |             |     1 |    53   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE             |             |     1 |            |          |
|   2 |   TABLE ACCESS INMEMORY FULL| T_INMEMORY2 | 91041 |    53   (0)| 00:00:01 |
-----------------------------------------------------------------------------------
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          3  consistent gets
          0  physical reads
          0  redo size
        545  bytes sent via SQL*Net to client
        552  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed
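--When the IM has no free space left, a newly created object cannot be populated into the IM until its PRIORITY is raised, even if INMEMORY is set on it and it has been accessed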

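Conclusions from the tests above:
1. As more objects are added to the IM and INMEMORY_SIZE becomes insufficient, v$im_segments.INMEMORY_SIZE is no longer reported accurately
2. As more objects are added to the IM and INMEMORY_SIZE becomes insufficient, an object that cannot be held entirely in the IM is accessed through conventional database operations;
the IM is not used partially, yet the execution plan still shows INMEMORY
3. flush buffer_cache does not affect objects in the IM
4. When the IM has no free space left, a newly created object cannot be populated into the IM until its PRIORITY is raised, even if INMEMORY is set on it and it has been accessed;
access is still conventional, but the execution plan shows INMEMORY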
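
Since the execution plan shows INMEMORY either way, one way to see whether a segment is really populated is to look at the POPULATE_STATUS and BYTES_NOT_POPULATED columns of v$im_segments in addition to INMEMORY_SIZE. The following is only a sketch: the function names are made up for illustration, the sqlplus invocation in the comment assumes OS authentication on the database host, and a segment that never entered the IM will simply return no row.

```shell
# Print a query against v$im_segments that includes the population columns.
# Intended to be piped into sqlplus on a 12c instance, for example:
#   im_status_sql | sqlplus -s / as sysdba
im_status_sql() {
    cat <<'EOF'
select segment_name, populate_status, bytes, bytes_not_populated, inmemory_size
  from v$im_segments
 order by segment_name;
EOF
}

# Percentage of a segment that is populated, computed from the
# BYTES and BYTES_NOT_POPULATED values returned by the query above.
populated_pct() {
    awk -v b="$1" -v n="$2" 'BEGIN {
        if (b <= 0) { print "n/a"; exit }
        printf "%.2f\n", (b - n) / b * 100
    }'
}
```

For example, populated_pct 200 50 prints 75.00 (the two arguments here are illustrative values, not figures from the test above).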

Testing PRIORITY

SQL>  select segment_name,bytes,inmemory_size,INMEMORY_PRIORITY,INMEMORY_COMPRESSION from v$im_segments;
SEGMENT_NAME
------------------------------------------------------------------------------------------------------------------------
     BYTES INMEMORY_SIZE INMEMORY INMEMORY_COMPRESS
---------- ------------- -------- -----------------
T_INMEMORY1
 201326592      57999360 NONE     NO MEMCOMPRESS
SQL> alter table chf.t_inmemory2  inmemory;
Table altered.
SQL> select count(*) from chf.t_inmemory2;
  COUNT(*)
----------
     91041
SQL>  select segment_name,bytes,inmemory_size,INMEMORY_PRIORITY,INMEMORY_COMPRESSION from v$im_segments;
SEGMENT_NAME
------------------------------------------------------------------------------------------------------------------------
     BYTES INMEMORY_SIZE INMEMORY INMEMORY_COMPRESS
---------- ------------- -------- -----------------
T_INMEMORY1
 201326592      57999360 NONE     NO MEMCOMPRESS
SQL> alter table chf.t_inmemory1  inmemory no memcompress PRIORITY LOW;
Table altered.
SQL>  select segment_name,bytes,inmemory_size,INMEMORY_PRIORITY,INMEMORY_COMPRESSION from v$im_segments;
no rows selected
SQL> select count(*) from chf.t_inmemory2;
  COUNT(*)
----------
     91041
SQL>  select segment_name,bytes,inmemory_size,INMEMORY_PRIORITY,INMEMORY_COMPRESSION from v$im_segments;
SEGMENT_NAME
------------------------------------------------------------------------------------------------------------------------
     BYTES INMEMORY_SIZE INMEMORY INMEMORY_COMPRESS
---------- ------------- -------- -----------------
T_INMEMORY2
  13631488       4325376 NONE     FOR QUERY LOW

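This test shows:
1. Even with a PRIORITY other than NONE, an object must be accessed before it is placed in the IM
2. When the IM runs short of space, objects with a higher PRIORITY push lower-priority objects out of the IM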
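
The v$im_segments rows in this test also allow a rough size comparison between the on-disk segment (BYTES) and its in-memory copy (INMEMORY_SIZE). The sketch below just divides the two figures from the listings; im_ratio is a name invented here. The result is not a pure compression ratio: it also reflects unused blocks in the segment and, as noted in the earlier conclusions, INMEMORY_SIZE may be unreliable when the IM is short of space.

```shell
# Ratio of on-disk BYTES to INMEMORY_SIZE, as reported by v$im_segments.
im_ratio() {
    awk -v bytes="$1" -v im="$2" 'BEGIN {
        if (im <= 0) { print "n/a"; exit }
        printf "%.2f\n", bytes / im
    }'
}

# Figures taken from the listings above.
im_ratio 13631488 4325376     # T_INMEMORY2, FOR QUERY LOW   -> 3.15
im_ratio 201326592 57999360   # T_INMEMORY1, NO MEMCOMPRESS  -> 3.47
```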

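Disclaimer: this article reflects only my own tests and the conclusions drawn from them. It must not be treated as any kind of official reference; each environment needs its own testing.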

A record of recovering data files deleted with rm -rf

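This incident started with an employee leaving on bad terms. After a system engineer resigned, the company did not close his VPN access in time, and the account on the database server (Linux 6.5, Oracle 11.2.0.1) was not changed promptly. He eventually logged in and removed ORACLE_BASE with rm. Worse, the ORADATA directory was inside it, and worse still, every data file lived there. In other words, the database was completely deleted and there was no backup at all. A friend contacted me and asked for help. Fortunately the files had not been overwritten and the inodes were still present, so extundelete recovered all the data files, the control file, and the redo files (see "extundelete: recovering deleted files on Linux"). The database opened without problems and with zero data loss, which makes this about as clean a recovery as one can hope for.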

[root@DB1 tmp]# tar xvf extundelete-0.2.4.tar
extundelete-0.2.4/
extundelete-0.2.4/acinclude.m4
extundelete-0.2.4/missing
extundelete-0.2.4/autogen.sh
extundelete-0.2.4/aclocal.m4
extundelete-0.2.4/configure
extundelete-0.2.4/LICENSE
extundelete-0.2.4/README
extundelete-0.2.4/install-sh
extundelete-0.2.4/config.h.in
extundelete-0.2.4/src/
extundelete-0.2.4/src/extundelete.cc
extundelete-0.2.4/src/block.h
extundelete-0.2.4/src/kernel-jbd.h
extundelete-0.2.4/src/insertionops.cc
extundelete-0.2.4/src/block.c
extundelete-0.2.4/src/cli.cc
extundelete-0.2.4/src/extundelete-priv.h
extundelete-0.2.4/src/extundelete.h
extundelete-0.2.4/src/jfs_compat.h
extundelete-0.2.4/src/Makefile.in
extundelete-0.2.4/src/Makefile.am
extundelete-0.2.4/configure.ac
extundelete-0.2.4/depcomp
extundelete-0.2.4/Makefile.in
extundelete-0.2.4/Makefile.am
[root@DB1 tmp]# cd extundelete-0.2.4
[root@DB1 extundelete-0.2.4]# ./configure
Configuring extundelete 0.2.4
Writing generated files to disk
[root@DB1 extundelete-0.2.4]# make && make install
make -s all-recursive
Making all in src
Making install in src
  /usr/bin/install -c extundelete '/usr/local/bin'
[root@DB1 extundelete-0.2.4]# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda3       244G   11G  221G   5% /
tmpfs            16G   72K   16G   1% /dev/shm
/dev/sda1       190M   62M  119M  35% /boot
/dev/sdb1       2.0T   71M  1.9T   1% /home
[root@DB1 extundelete-0.2.4]# umount /dev/sdb1
umount: /home: device is busy.
        (In some cases useful info about processes that use
         the device is found by lsof(8) or fuser(1))
[root@DB1 extundelete-0.2.4]# fuser -m -u /home
/home:                3914c(oracle)  8372c(oracle)
[root@DB1 extundelete-0.2.4]# kill -9 3914
[root@DB1 extundelete-0.2.4]# fuser -m -u /home
/home:                8372c(oracle)
[root@DB1 extundelete-0.2.4]# kill -9 8372
[root@DB1 extundelete-0.2.4]# fuser -m -u /home
[root@DB1 extundelete-0.2.4]# umount /dev/sdb1
[root@DB1 extundelete-0.2.4]# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda3       244G   11G  221G   5% /
tmpfs            16G   72K   16G   1% /dev/shm
/dev/sda1       190M   62M  119M  35% /boot
[root@DB1 extundelete-0.2.4]# extundelete /dev/sdb1 --restore-all
NOTICE: Extended attributes are not restored.
Loading filesystem metadata ... 16384 groups loaded.
Loading journal descriptors ... 26542 descriptors loaded.
Searching for recoverable inodes in directory / ...
18896 recoverable inodes found.
Looking through the directory structure for deleted files ...
2 recoverable inodes still lost.
Unable to restore inode 43778050 (file.43778050): Space has been reallocated.
[root@DB1 extundelete-0.2.4]# ls
acinclude.m4  autogen.sh  config.h.in  config.status  configure.ac  install-sh  Makefile     Makefile.in
aclocal.m4    config.h    config.log   configure      depcomp       LICENSE     Makefile.am  missing
[root@DB1 extundelete-0.2.4]# cd RECOVERED_FILES/
[root@DB1 RECOVERED_FILES]# ls
app  file.43778051  oracle  oraInventory
[root@DB1 RECOVERED_FILES]# cd app
[root@DB1 app]# ls
admin  cfgtoollogs  diag  oracle  oradata  orcl  ORCL
[root@DB1 app]# cd oradata
[root@DB1 oradata]# ls
orcl
[root@DB1 oradata]# cd orcl
[root@DB1 orcl]# ls
control01.ctl  redo01.log  redo02.log  redo03.log  sysaux01.dbf  system01.dbf  undotbs01.dbf  users01.dbf
[root@DB1 orcl]# ls -ltr
total 2908776
-rw-r--r--. 1 root root  734011392 Nov 18 02:06 system01.dbf
-rw-r--r--. 1 root root 1069555712 Nov 18 02:06 sysaux01.dbf
-rw-r--r--. 1 root root  120594432 Nov 18 02:06 undotbs01.dbf
-rw-r--r--. 1 root root  887365632 Nov 18 02:06 users01.dbf
-rw-r--r--. 1 root root    9748480 Nov 18 02:06 control01.ctl
-rw-r--r--. 1 root root   52429312 Nov 18 02:06 redo01.log
-rw-r--r--. 1 root root   52429312 Nov 18 02:06 redo02.log
-rw-r--r--. 1 root root   52429312 Nov 18 02:06 redo03.log
[root@DB1 orcl]#
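
The listing above shows the recovered files owned by root under RECOVERED_FILES, so they still have to be put back in place and handed to the Oracle OS user before the instance can use them. The sketch below is one possible way to do that, plus a cheap sanity check on file sizes. Everything specific in it is an assumption: the helper names, the 8192-byte data block size, and the destination path and oracle:oinstall owner in the usage note. A size check of this kind cannot detect block corruption; Oracle's DBVERIFY utility (dbv) or an RMAN validate is the proper tool for that.

```shell
# Succeed if a file size is a whole number of blocks.
# The block size defaults to 8192, an assumed db_block_size.
is_block_aligned() {
    size=$1
    block=${2:-8192}
    [ $((size % block)) -eq 0 ]
}

# Copy a recovered tree back into place, preserving modes and timestamps,
# and optionally hand ownership to the Oracle OS user (chown needs root).
copy_back() {
    src=$1
    dst=$2
    owner=${3:-}
    mkdir -p "$dst" || return 1
    cp -a "$src"/. "$dst"/ || return 1
    if [ -n "$owner" ]; then
        chown -R "$owner" "$dst" || return 1
    fi
}
```

With the sizes from the listing, is_block_aligned 734011392 succeeds for system01.dbf, while the redo logs (52429312 bytes) only align on a 512-byte block, so is_block_aligned 52429312 512 is the matching check. A call such as copy_back RECOVERED_FILES/app /home/app oracle:oinstall would then restore the tree after the file system is mounted again; both the path and the owner are placeholders.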

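A reminder to everyone once more: database backups matter more than anything else. Guard against human-made disasters as well as natural ones. I hope we never hear of a similar incident in our community again.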

In-Memory overview

This article is an overall summary of the In-Memory Column Store in Oracle 12c; the details are covered in later chapters.
IM can be configured at the following levels:
Column
Table
Materialized view
Tablespace
Partition

Statements that can specify In-Memory:
CREATE TABLE
ALTER TABLE
CREATE TABLESPACE
ALTER TABLESPACE
CREATE MATERIALIZED VIEW
ALTER MATERIALIZED VIEW

Compression levels
IM-Compression-Methods
Priority levels
IM-Priority-Levels
Object-level IM operations

CREATE TABLE t_xifenfei (
     id        NUMBER(5) PRIMARY KEY,
     test_col  VARCHAR2(15))
  INMEMORY;
ALTER TABLE t_xifenfei INMEMORY;
ALTER TABLE t_xifenfei INMEMORY MEMCOMPRESS FOR CAPACITY LOW;
ALTER TABLE t_xifenfei INMEMORY PRIORITY HIGH;
ALTER TABLE t_xifenfei INMEMORY
  MEMCOMPRESS FOR CAPACITY HIGH
  PRIORITY LOW;
-- The column names below do not exist in t_xifenfei as created above (id, test_col);
-- substitute the real columns of your table
ALTER TABLE t_xifenfei
   INMEMORY MEMCOMPRESS FOR QUERY (
      product_id, product_name, category_id, supplier_id, min_price)
   INMEMORY MEMCOMPRESS FOR CAPACITY HIGH (
      product_description, warranty_period, product_status, list_price)
   NO INMEMORY (
      weight_class, catalog_url);
ALTER TABLE t_xifenfei NO INMEMORY;

Note: a priority set at column level has no effect; priority applies at the table (materialized view) or partition level
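
When these statements are generated from scripts, it can help to validate the compression method and priority keywords before sending anything to the database. The helper below is a minimal sketch with an invented name (im_alter_sql); it only builds the statement text for a table-level clause and does not connect to Oracle.

```shell
# Build an ALTER TABLE ... INMEMORY statement after validating the
# compression method and priority level against the known keywords.
im_alter_sql() {
    table=$1
    compress=$2
    priority=$3
    case "$compress" in
        "NO MEMCOMPRESS"|"MEMCOMPRESS FOR DML") ;;
        "MEMCOMPRESS FOR QUERY LOW"|"MEMCOMPRESS FOR QUERY HIGH") ;;
        "MEMCOMPRESS FOR CAPACITY LOW"|"MEMCOMPRESS FOR CAPACITY HIGH") ;;
        *) echo "unknown compression: $compress" >&2; return 1 ;;
    esac
    case "$priority" in
        NONE|LOW|MEDIUM|HIGH|CRITICAL) ;;
        *) echo "unknown priority: $priority" >&2; return 1 ;;
    esac
    printf 'ALTER TABLE %s INMEMORY %s PRIORITY %s;\n' "$table" "$compress" "$priority"
}
```

For example, im_alter_sql t_xifenfei "MEMCOMPRESS FOR CAPACITY HIGH" LOW prints the same statement as the fifth example above on a single line, while an unknown keyword makes the function return a non-zero status.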

Tablespace-level IM operations

CREATE TABLESPACE xifenfie_im
   DATAFILE '/u02/xifenfei.dbf' SIZE 40M
   ONLINE
   DEFAULT INMEMORY;
ALTER TABLESPACE xifenfie_im DEFAULT INMEMORY
   MEMCOMPRESS FOR CAPACITY HIGH
   PRIORITY LOW;

Materialized view level

CREATE MATERIALIZED VIEW oe.prod_info_mv INMEMORY
  AS SELECT * FROM t_xifenfei;
ALTER MATERIALIZED VIEW oe.prod_info_mv INMEMORY PRIORITY HIGH;

Operations suited to In-Memory
A query that scans a large number of rows and applies filters that use operators such as the following: =, <, >, and IN
A query that selects a small number of columns from a table or materialized view with a large number of columns,
such as a query that selects five columns from a table with 100 columns
A query that joins a small table to a large table
A query that aggregates data

Operations not suited to In-Memory
Queries with complex predicates
Queries that select a large number of columns
Queries that return a large number of rows
Queries with multiple large table joins

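IM control parameters
INMEMORY_SIZE: the amount of memory allocated to the IM. The default is 0; when enabled, the minimum value is 100M. In a CDB environment the CDB-level setting is the limit for the whole database; PDBs inherit the CDB setting by default, but in practice the total across PDBs cannot exceed the CDB limit
INMEMORY_FORCE: whether objects in the database are allowed to use the IM. The default is DEFAULT, which lets INMEMORY or NO INMEMORY be defined at object level; OFF means tables and materialized views cannot use the IM
INMEMORY_CLAUSE_DEFAULT: empty by default, which has the same meaning as NO INMEMORY, so newly created objects do not use the IM by default. If set to INMEMORY, newly created objects use the IM by default
INMEMORY_QUERY: controls whether queries use the IM. The default is ENABLE; DISABLE means queries do not use the IM
INMEMORY_MAX_POPULATE_SERVERS: by default matches the number of system cores; these are the servers that write table data into the IM
INMEMORY_TRICKLE_REPOPULATE_SERVERS_PERCENT: controls the number of processes used to repopulate object data in the IM, expressed as a percentage of INMEMORY_MAX_POPULATE_SERVERS
OPTIMIZER_INMEMORY_AWARE: controls whether the optimizer takes the IM into account when costing; the default is TRUE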
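
The INMEMORY_SIZE rule above (0 to disable, otherwise at least 100M) is easy to get wrong in deployment scripts, so a small pre-check can be useful. The following is a sketch only: to_bytes and inmemory_size_ok are names invented here, and the parser accepts just a plain number or a K, M, or G suffix.

```shell
# Convert a size such as 100M or 2G to bytes (K, M, G suffix, or a plain number).
to_bytes() {
    case "$1" in
        *[Kk]) echo $(( ${1%?} * 1024 )) ;;
        *[Mm]) echo $(( ${1%?} * 1024 * 1024 )) ;;
        *[Gg]) echo $(( ${1%?} * 1024 * 1024 * 1024 )) ;;
        *)     echo $(( $1 )) ;;
    esac
}

# Succeed if the value is 0 (IM disabled) or at least 100M.
inmemory_size_ok() {
    bytes=$(to_bytes "$1") || return 1
    [ "$bytes" -eq 0 ] || [ "$bytes" -ge $((100 * 1024 * 1024)) ]
}
```

For instance, inmemory_size_ok 50M fails while inmemory_size_ok 2G succeeds. As far as I know, INMEMORY_SIZE is a static parameter in 12.1, so a change such as ALTER SYSTEM SET inmemory_size=2G SCOPE=SPFILE only takes effect after the instance is restarted.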

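IM handling in impdp
TRANSFORM=INMEMORY:y keeps the IM attributes of the exported objects
TRANSFORM=INMEMORY:n does not keep the IM attributes of the exported objects
TRANSFORM=INMEMORY_CLAUSE:string overrides the IM attributes of the exported objects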
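
Because the INMEMORY_CLAUSE string contains spaces and quotes, putting it in a parameter file avoids shell quoting problems on the impdp command line. The sketch below writes such a file. The function name, the directory object dump_dir, the dump file name, and the chosen IM clause are all placeholders, and the quoting shown is what I would expect to work in a parfile rather than something verified in this article.

```shell
# Write a Data Pump import parameter file that overrides the IM clause.
# The directory object, dump file and table names are placeholders.
write_impdp_parfile() {
    parfile=$1
    cat > "$parfile" <<'EOF'
DIRECTORY=dump_dir
DUMPFILE=t_xifenfei.dmp
TABLES=t_xifenfei
TRANSFORM=INMEMORY_CLAUSE:"INMEMORY MEMCOMPRESS FOR QUERY HIGH PRIORITY LOW"
EOF
}
```

After write_impdp_parfile im_import.par, the import would be started with something like impdp system PARFILE=im_import.par.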

Reference: https://docs.oracle.com/database/121/ADMIN/memory.htm#ADMIN14257