执行计划改变导致数据库负载过高

核心生产库突然负载飙升,我的压力来了,通过分析是一条sql的执行计划改变引起该故障,最终通过sql profile固定执行计划解决该问题
数据库主机负载
这里明显表现系统load 偏高,而且还在上升中;top的进程中,占用cpu都计划100%

top - 16:25:39 up 123 days,  1:42,  4 users,  load average: 46.19, 45.08, 43.93
Tasks: 1469 total,  28 running, 1439 sleeping,   0 stopped,   2 zombie
Cpu(s): 45.9%us,  1.1%sy,  0.0%ni, 47.1%id,  5.2%wa,  0.1%hi,  0.6%si,  0.0%st
Mem:  264253752k total, 262605260k used,  1648492k free,   413408k buffers
Swap: 33554424k total,   458684k used, 33095740k free, 67110504k cached
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 2622 oracle    18   0  150g  34m  28m R 100.0  0.0  11:58.21 oracleq9db1 (LOCAL=NO)
15881 oracle    21   0  150g  35m  28m R 100.0  0.0  72:14.20 oracleq9db1 (LOCAL=NO)
17214 oracle    19   0  150g  38m  32m R 100.0  0.0  20:47.44 oracleq9db1 (LOCAL=NO)
17705 oracle    21   0  150g  34m  28m R 100.0  0.0  27:33.00 oracleq9db1 (LOCAL=NO)
 6110 oracle    19   0  150g  34m  28m R 99.8  0.0  12:33.34 oracleq9db1 (LOCAL=NO)
 6876 oracle    19   0  150g  34m  27m R 99.5  0.0  12:43.90 oracleq9db1 (LOCAL=NO)
17205 oracle    23   0  150g  34m  27m R 99.5  0.0  21:37.18 oracleq9db1 (LOCAL=NO)
24629 oracle    20   0  150g  35m  29m R 99.5  0.0  28:10.62 oracleq9db1 (LOCAL=NO)
26959 oracle    19   0  150g  34m  27m R 99.5  0.0  47:17.87 oracleq9db1 (LOCAL=NO)
 7655 oracle    18   0  150g  30m  25m R 98.5  0.0   2:28.45 oracleq9db1 (LOCAL=NO)
16377 oracle    18   0  150g  34m  28m R 98.5  0.0  36:07.11 oracleq9db1 (LOCAL=NO)
24637 oracle    18   0  150g  37m  30m R 98.2  0.0  26:39.15 oracleq9db1 (LOCAL=NO)
 6106 oracle    21   0  150g  40m  33m R 97.2  0.0  11:37.75 oracleq9db1 (LOCAL=NO)
28785 oracle    18   0  150g  34m  28m R 96.9  0.0  24:03.29 oracleq9db1 (LOCAL=NO)
24278 oracle    17   0  150g  31m  26m S 96.5  0.0   3:15.51 oracleq9db1 (LOCAL=NO)
24283 oracle    17   0  150g  33m  28m S 96.5  0.0   6:25.26 oracleq9db1 (LOCAL=NO)
 7098 oracle    18   0  150g  32m  27m R 94.6  0.0   2:20.22 oracleq9db1 (LOCAL=NO)
 6874 oracle    17   0  150g  34m  28m R 87.0  0.0  12:02.92 oracleq9db1 (LOCAL=NO)
18206 oracle    16   0  150g  34m  27m R 86.1  0.0  16:11.28 oracleq9db1 (LOCAL=NO)
 7096 oracle    17   0  150g  29m  24m R 85.4  0.0   3:01.72 oracleq9db1 (LOCAL=NO)

数据库等待事件

       SID EVENT
---------- ----------------------------------------------------------------
       183 gc cr request
       185 latch: cache buffers chains
       239 db file sequential read
       292 gc cr request
       406 gc cr request
       410 db file sequential read
       463 gc current request
       572 gc buffer busy acquire
       575 gc buffer busy acquire
       577 latch: cache buffers chains
       629 db file sequential read
       747 gc cr request
       919 latch: cache buffers chains
       974 gc cr request
      1033 log file sync
      1141 db file parallel write
      1153 gc cr request
      1199 db file sequential read
      1378 db file sequential read
      1495 gc cr request
      1540 db file parallel write
      1547 gc buffer busy acquire
      1662 gc cr request
      1715 gc buffer busy acquire
      1770 SQL*Net message to client
      1830 latch: cache buffers chains
      1884 gc cr request
      2113 db file sequential read
      2173 db file sequential read
      2229 rdbms ipc reply
      2292 db file sequential read
      2341 db file sequential read
      2348 gc cr request
      2460 gc cr request
      2632 gc cr request
      2684 gc cr request
      2687 db file sequential read
      2749 db file sequential read
      2913 gc cr request
      2967 db file sequential read
      3038 gc cr request
      3087 SQL*Net message to client
      3089 gc cr request
      3194 db file sequential read
      3195 db file sequential read
      3309 latch: cache buffers chains
      3371 gc cr request
      3485 gc cr request
      3535 gc cr request

可以这里有很多gc cr request等待和cache buffers chains等待,第一反应就是很可能系统某条sql执行计划不正确导致逻辑读剧增.

分析awr报告
故障时候的逻辑读 top sql可以看出来cdwjdd67x27mh的sql逻辑读异常大
awr1
分析正常时间点awr报告中cdwjdd67x27mh逻辑读情况
awr2
通过对比可以发现两个awr中,cdwjdd67x27mh 语句执行次数差不多,但是单次逻辑读从800多突变为34000多,增加了40多倍
cdwjdd67x27mh 语句分析
执行计划

17:36:08 sys@Q9DB>select * from table(dbms_xplan.display_awr('cdwjdd67x27mh'));
PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID cdwjdd67x27mh
--------------------
select * from ( select          *                               from                            GE_TAOBAO_BILL           o
                 WHERE 1=1                       and
o.CREATED_TIME >= :1            and o.CREATED_TIME < :2
                                and o.REC_SITE_ID = :3
             and o.STATUS_ID = :4
             and o.SERVICE_TYPE = :5 ) where rownum
<= 36000
Plan hash value: 647855111
----------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name                      | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
----------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |                           |       |       |     5 (100)|          |        |
|   1 |  COUNT STOPKEY                       |                           |       |       |            |          |        |
|   2 |   FILTER                             |                           |       |       |            |          |        |
|   3 |    PARTITION RANGE ITERATOR          |                           |     1 |   455 |     5   (0)| 00:00:01 |   KEY |   KEY |
|   4 |     TABLE ACCESS BY LOCAL INDEX ROWID| GE_TAOBAO_BILL            |     1 |   455 |     5   (0)| 00:00:01 |   KEY |   KEY |
|   5 |      INDEX RANGE SCAN                | IDX_TAOBAO_BILL_CR_ISDP_S |     1 |       |     4   (0)| 00:00:01 |   KEY |   KEY |
----------------------------------------------------------------------------------------------------------------------------------
SQL_ID cdwjdd67x27mh
--------------------
select * from ( select          *                               from                            GE_TAOBAO_BILL           o
                 WHERE 1=1                       and
o.CREATED_TIME >= :1            and o.CREATED_TIME < :2
                                and o.REC_SITE_ID = :3
             and o.STATUS_ID = :4
             and o.SERVICE_TYPE = :5 ) where rownum
<= 36000
Plan hash value: 2979024279
---------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name                     | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
---------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |                          |       |       |    13 (100)|          |       ||
|   1 |  COUNT STOPKEY                       |                          |       |       |            |          |       ||
|   2 |   FILTER                             |                          |       |       |            |          |       ||
|   3 |    PARTITION RANGE ITERATOR          |                          |     1 |   455 |    13   (0)| 00:00:01 |   KEY |   KEY |
|   4 |     TABLE ACCESS BY LOCAL INDEX ROWID| GE_TAOBAO_BILL           |     1 |   455 |    13   (0)| 00:00:01 |   KEY |   KEY |
|   5 |      INDEX RANGE SCAN                | IDX_TAOBAO_BILL_CR_REC_S |     4 |       |     9   (0)| 00:00:01 |   KEY |   KEY |
---------------------------------------------------------------------------------------------------------------------------------

这里可以发现cdwjdd67x27mh在数据库中有两个执行计划

通过awr数据,看执行计划的变化情况

10:50:29 sys@Q9DB>select a.INSTANCE_NUMBER,a.snap_id,a.sql_id,a.plan_hash_value,b.begin_interval_time
10:50:30   2  from dba_hist_sqlstat a, dba_hist_snapshot b
10:50:30   3  where sql_id ='cdwjdd67x27mh'
10:50:30   4  and a.snap_id = b.snap_id
10:50:30   5  order by instance_number,begin_interval_time  desc;
INSTANCE_NUMBER    SNAP_ID SQL_ID        PLAN_HASH_VALUE BEGIN_INTERVAL_TIME
--------------- ---------- ------------- --------------- ----------------------------------------
              1      20719 cdwjdd67x27mh      2979024279 12-MAY-14 05.00.12.753 PM
              1      20719 cdwjdd67x27mh       647855111 12-MAY-14 05.00.12.753 PM
              1      20719 cdwjdd67x27mh      2979024279 12-MAY-14 05.00.12.702 PM
              1      20719 cdwjdd67x27mh       647855111 12-MAY-14 05.00.12.702 PM
              1      20718 cdwjdd67x27mh       647855111 12-MAY-14 04.00.24.197 PM
              1      20718 cdwjdd67x27mh      2979024279 12-MAY-14 04.00.24.197 PM
              1      20718 cdwjdd67x27mh       647855111 12-MAY-14 04.00.24.172 PM
              1      20718 cdwjdd67x27mh      2979024279 12-MAY-14 04.00.24.172 PM
              1      20717 cdwjdd67x27mh       647855111 12-MAY-14 03.11.22.251 PM
              1      20717 cdwjdd67x27mh      2979024279 12-MAY-14 03.11.22.251 PM
              1      20717 cdwjdd67x27mh       647855111 12-MAY-14 03.11.22.188 PM
              1      20717 cdwjdd67x27mh      2979024279 12-MAY-14 03.11.22.188 PM
              ………………
              1      20696 cdwjdd67x27mh      2979024279 11-MAY-14 07.00.07.142 PM
              1      20696 cdwjdd67x27mh      2979024279 11-MAY-14 07.00.07.105 PM
              1      20695 cdwjdd67x27mh      2979024279 11-MAY-14 06.00.12.771 PM
              1      20695 cdwjdd67x27mh      2979024279 11-MAY-14 06.00.12.707 PM
              1      20694 cdwjdd67x27mh      2979024279 11-MAY-14 05.00.48.249 PM
              1      20694 cdwjdd67x27mh      2979024279 11-MAY-14 05.00.48.170 PM
              1      20693 cdwjdd67x27mh      2979024279 11-MAY-14 04.00.37.841 PM
              …………
              2      20719 cdwjdd67x27mh      2979024279 12-MAY-14 05.00.12.753 PM
              2      20719 cdwjdd67x27mh      2979024279 12-MAY-14 05.00.12.702 PM
              2      20718 cdwjdd67x27mh      2979024279 12-MAY-14 04.00.24.197 PM
              2      20718 cdwjdd67x27mh      2979024279 12-MAY-14 04.00.24.172 PM
              2      20717 cdwjdd67x27mh      2979024279 12-MAY-14 03.11.22.251 PM
              2      20717 cdwjdd67x27mh      2979024279 12-MAY-14 03.11.22.188 PM

这里可以清晰看到,执行计划在1节点中发生震荡(两个执行计划都有选择),仔细看两个执行计划,会发现就是查询的时候所使用的index不同而已,继续分析两个index

IDX_TAOBAO_BILL_CR_ISDP_S      CREATED_TIME                            1
IDX_TAOBAO_BILL_CR_ISDP_S      IS_DISPART                              2
IDX_TAOBAO_BILL_CR_ISDP_S      STATUS_ID                               3
IDX_TAOBAO_BILL_CR_REC_S       REC_SITE_ID                             1
IDX_TAOBAO_BILL_CR_REC_S       CREATED_TIME                            2
IDX_TAOBAO_BILL_CR_REC_S       STATUS_ID                               3

分析数据分布情况(其他列省略)

TABLE_NAME                     COLUMN_NAME                    NUM_DISTINCT LAST_ANAL
------------------------------ ------------------------------ ------------ ---------
GE_TAOBAO_BILL                 CREATED_TIME                      287080448 05-MAY-14
GE_TAOBAO_BILL                 REC_SITE_ID                           13176 05-MAY-14
GE_TAOBAO_BILL                 IS_DISPART                                2 05-MAY-14
GE_TAOBAO_BILL                 STATUS_ID                                 7 05-MAY-14

这里可以看到数据库在正常的时候使用IDX_TAOBAO_BILL_CR_REC_S没有问题,但是在某些情况下选择使用IDX_TAOBAO_BILL_CR_ISDP_S这个就有问题,导致逻辑读过高,其实对于该sql语句,是因为IDX_TAOBAO_BILL_CR_REC_S 索引不合理导致,如果创建CREATED_TIME,REC_SITE_ID,STATUS_ID,就不会因为传输的值范围不同而使用IDX_TAOBAO_BILL_CR_ISDP_S的情况。针对该情况,因为表非常大,短时间内无法修改index,只能考虑使用 sql profile固定执行计划

sql profile固定计划

SQL>@coe_xfr_sql_profile.sql cdwjdd67x27mh
Parameter 1:
SQL_ID (required)
PLAN_HASH_VALUE AVG_ET_SECS
--------------- -----------
     2979024279        .011
      647855111       5.164
Parameter 2:
PLAN_HASH_VALUE (required)
Enter value for 2: 2979024279
Values passed:
~~~~~~~~~~~~~
SQL_ID         : "cdwjdd67x27mh"
PLAN_HASH_VALUE: "2979024279"
Execute coe_xfr_sql_profile_cdwjdd67x27mh_2979024279.sql
on TARGET system in order to create a custom SQL Profile
with plan 2979024279 linked to adjusted sql_text.
COE_XFR_SQL_PROFILE completed.
SQL>@coe_xfr_sql_profile_cdwjdd67x27mh_2979024279.sql
SQL>DECLARE
  2  sql_txt CLOB;
  3  h       SYS.SQLPROF_ATTR;
  4  BEGIN
  5  sql_txt := q'[
  6  select * from ( select
  7                  *
  8
  9                  from
 10
 11                  GE_TAOBAO_BILL
 12
 13                  o
 14
 15                   WHERE 1=1
 16
 17
 18
 19
 20
 21
 22
 23
 24
 25
 26
 27                                  and o.CREATED_TIME >= :1
 28
 29
 30                                  and o.CREATED_TIME < :2
 31
 32
 33                                  and o.REC_SITE_ID = :3
 34
 35
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49                                  and o.STATUS_ID = :4
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64                                  and o.SERVICE_TYPE = :5 ) where rownum <= 36000
 65  ]';
 66  h := SYS.SQLPROF_ATTR(
 67  q'[BEGIN_OUTLINE_DATA]',
 68  q'[IGNORE_OPTIM_EMBEDDED_HINTS]',
 69  q'[OPTIMIZER_FEATURES_ENABLE('11.2.0.3')]',
 70  q'[DB_VERSION('11.2.0.3')]',
 71  q'[FIRST_ROWS]',
 72  q'[OUTLINE_LEAF(@"SEL$F5BB74E1")]',
 73  q'[MERGE(@"SEL$2")]',
 74  q'[OUTLINE(@"SEL$1")]',
 75  q'[OUTLINE(@"SEL$2")]',
 76  q'[INDEX_RS_ASC(@"SEL$F5BB74E1" "O"@"SEL$2" ("GE_TAOBAO_BILL"."REC_SITE_ID" "GE_TAOBAO_BILL"."CREATED_TIME" "GE_TAOBAO_BILL"."STATUS_ID"))]',
 77  q'[END_OUTLINE_DATA]');
 78  :signature := DBMS_SQLTUNE.SQLTEXT_TO_SIGNATURE(sql_txt);
 79  DBMS_SQLTUNE.IMPORT_SQL_PROFILE (
 80  sql_text    => sql_txt,
 81  profile     => h,
 82  name        => 'coe_cdwjdd67x27mh_2979024279',
 83  description => 'coe cdwjdd67x27mh 2979024279 '||:signature||'',
 84  category    => 'DEFAULT',
 85  validate    => TRUE,
 86  replace     => TRUE,
 87  force_match => FALSE /* TRUE:FORCE (match even when different literals in SQL). FALSE:EXACT (similar to CURSOR_SHARING) */ );
 88  END;
 89  /
PL/SQL procedure successfully completed.
SQL>WHENEVER SQLERROR CONTINUE
SQL>SET ECHO OFF;
            SIGNATURE
---------------------
 18414135509058398362
... manual custom SQL Profile has been created
COE_XFR_SQL_PROFILE_cdwjdd67x27mh_2979024279 completed

固定执行计划后,系统负载恢复正常
load恢复正常,单个进程占用cpu 也正常

top - 18:25:29 up 123 days,  4:25,  3 users,  load average: 17.59, 16.72, 16.10
Tasks: 1559 total,   6 running, 1551 sleeping,   0 stopped,   2 zombie
Cpu(s):  6.2%us,  1.2%sy,  0.0%ni, 84.3%id,  7.4%wa,  0.1%hi,  0.9%si,  0.0%st
Mem:  264253752k total, 262395300k used,  1858452k free,   305928k buffers
Swap: 33554424k total,   467420k used, 33087004k free, 66811412k cached
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 3010 oracle    15   0  150g  26m  21m S 23.6  0.0   0:08.60 oracleq9db2 (LOCAL=NO)
26936 oracle    15   0  150g  33m  28m S 22.9  0.0   0:12.53 oracleq9db2 (LOCAL=NO)
 9691 oracle    15   0  150g  34m  28m D 21.0  0.0   0:48.73 oracleq9db2 (LOCAL=NO)
 9694 oracle    15   0  150g  34m  28m S 17.4  0.0   0:45.74 oracleq9db2 (LOCAL=NO)
14366 oracle    15   0  150g  33m  28m R 17.4  0.0   0:25.11 oracleq9db2 (LOCAL=NO)
 2471 oracle    -2   0  150g 180m  37m S 16.7  0.1  10795:55 ora_lms3_q9db2
 2463 oracle    -2   0  150g 180m  37m S 15.7  0.1  10648:38 ora_lms1_q9db2
 2459 oracle    -2   0  150g 180m  37m S 14.8  0.1  10726:42 ora_lms0_q9db2
 2467 oracle    -2   0  150g 180m  36m S 14.8  0.1  10980:33 ora_lms2_q9db2
13324 oracle    15   0  150g  31m  26m S 14.1  0.0   0:22.75 oracleq9db2 (LOCAL=NO)
16740 oracle    15   0  150g  34m  28m R 13.4  0.0   0:43.85 oracleq9db2 (LOCAL=NO)
 1908 oracle    15   0  150g  27m  22m S 11.8  0.0   0:03.95 oracleq9db2 (LOCAL=NO)
 9689 oracle    15   0  150g  33m  27m S 11.8  0.0   0:46.01 oracleq9db2 (LOCAL=NO)
19410 oracle    15   0  150g  31m  26m S 11.8  0.0   0:43.18 oracleq9db2 (LOCAL=NO)
14102 oracle    15   0  150g  31m  25m S 11.5  0.0   2:17.23 oracleq9db2 (LOCAL=NO)
 1914 oracle    15   0  150g  29m  24m S 11.1  0.0   0:04.77 oracleq9db2 (LOCAL=NO)
31106 oracle    15   0  150g  34m  28m S  9.8  0.0   1:24.85 oracleq9db2 (LOCAL=NO)
31139 oracle    15   0  150g  30m  24m S  9.8  0.0   1:21.75 oracleq9db2 (LOCAL=NO)
 2498 oracle    15   0  150g  42m  35m S  7.9  0.0   3838:43 ora_lgwr_q9db2
28108 oracle    15   0  150g  36m  29m S  7.9  0.0   0:18.19 oracleq9db2 (LOCAL=NO)
 2392 oracle    15   0  150g  35m  17m S  7.5  0.0   5304:57 ora_lmd0_q9db2

exp dmp文件损坏恢复

在有些时候,exp的dmp文件因为某种原因损坏(比如磁盘异常,exp过程损坏等),导致imp导入无法继续,下面的处理方法(直接读取dmp文件)来对dmp文件进行抢救性恢复,最大程度减少数据丢失损失
创建exp dmp文件并使用dd破坏

SQL> create table t_xifenfei as select * from dba_objects;
Table created.
SQL> select count(*) from t_xifenfei;
  COUNT(*)
----------
     90915
[oracle@localhost ~]$ exp chf/xifenfei@pdb1 file=/tmp/t_xifenfei.dmp tables=t_xifenfei
Export: Release 12.1.0.2.0  on Sun Apr 27 21:39:26 2014
Copyright (c) 1982, 2014, Oracle and/or its affiliates.  All rights reserved.
Connected to: Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options
Export done in US7ASCII character set and AL16UTF16 NCHAR character set
server uses WE8MSWIN1252 character set (possible charset conversion)
About to export specified tables via Conventional Path ...
. . exporting table                     T_XIFENFEI      90915 rows exported
EXP-00091: Exporting questionable statistics.
Export terminated successfully with warnings.
[oracle@localhost ~]$ od -x /tmp/t_xifenfei.dmp |head -10
0000000 0003 4501 5058 524f 3a54 3156 2e32 3130
0000020 302e 0a30 4344 4648 520a 4154 4c42 5345
0000040 380a 3931 0a32 0a30 3237 300a 000a 0001
0000060 07b2 00d0 0001 0000 0000 0000 0000 0013
0000100 2020 2020 2020 2020 2020 2020 2020 2020
*
0000140 2020 2020 2020 2020 7553 206e 7041 2072
0000160 3732 3220 3a31 3933 323a 2036 3032 3431
0000200 742f 706d 742f 785f 6669 6e65 6566 2e69
0000220 6d64 0070 0000 0000 0000 0000 0000 0000
--strings命令看dmp文件
[oracle@localhost ~]$ strings /tmp/t_xifenfei.dmp |head -50
EXPORT:V12.01.00
DCHF
RTABLES
8192
                                         Tue Apr 29 0:39:49 2014/tmp/t_xifenfei.dmp
#G#G
#G#G
+08:00
BYTE
UNUSED
INTERPRETED
DISABLE:ALL
METRICST
TABLE "T_XIFENFEI"
CREATE TABLE "T_XIFENFEI" ("OWNER" VARCHAR2(128), "OBJECT_NAME" VARCHAR2(128), "SUBOBJECT_NAME" VARCHAR2(128), "OBJECT_ID" NUMBER, "DATA_OBJECT_ID" NUMBER, "OBJECT_TYPE" VARCHAR2(23), "CREATED" DATE, "LAST_DDL_TIME" DATE, "TIMESTAMP" VARCHAR2(19), "STATUS" VARCHAR2(7), "TEMPORARY" VARCHAR2(1), "GENERATED" VARCHAR2(1), "SECONDARY" VARCHAR2(1), "NAMESPACE" NUMBER, "EDITION_NAME" VARCHAR2(128), "SHARING" VARCHAR2(13), "EDITIONABLE" VARCHAR2(1), "ORACLE_MAINTAINED" VARCHAR2(1))  PCTFREE 10 PCTUSED 40 INITRANS 1 MAXTRANS 255 STORAGE(INITIAL 13631488 NEXT 1048576 MINEXTENTS 1 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT) TABLESPACE "USERS" LOGGING NOCOMPRESS
INSERT INTO "T_XIFENFEI" ("OWNER", "OBJECT_NAME", "SUBOBJECT_NAME", "OBJECT_ID", "DATA_OBJECT_ID", "OBJECT_TYPE", "CREATED", "LAST_DDL_TIME", "TIMESTAMP", "STATUS", "TEMPORARY", "GENERATED", "SECONDARY", "NAMESPACE", "EDITION_NAME", "SHARING", "EDITIONABLE", "ORACLE_MAINTAINED") VALUES (:1, :2, :3, :4, :5, :6, :7, :8, :9, :10, :11, :12, :13, :14, :15, :16, :17, :18)
PUBLIC
V$DATAGUARD_CONFIG
SYNONYM
2014-04-22:17:51:05
VALID
METADATA LINK
V_$DATAGUARD_STATS
VIEW
2014-04-22:17:51:05
--破坏exp dmp文件
[oracle@localhost ~]$ dd if=/dev/zero of=/tmp/t_xifenfei.dmp bs=1024 count=1 conv=notrunc
1+0 records in
1+0 records out
1024 bytes (1.0 kB) copied, 6.0291e-05 seconds, 17.0 MB/s
[oracle@localhost ~]$ od -x /tmp/t_xifenfei.dmp |head -10
0000000 0000 0000 0000 0000 0000 0000 0000 0000
*
0020000 0064 0000 6000 2401 050f 0c0b 0c03 050c
0020020 0504 060d 0709 0508 0505 0505 050f 0505
0020040 0505 050a 0505 0505 0504 0706 0808 4723
0020060 4723 1108 0823 4111 47b0 8300 b200 d007
0020100 0003 0000 0000 0000 0000 0000 0000 0000
0020120 0000 0000 0000 0000 0000 0000 0000 0000
0020140 0000 0000 0000 0064 0000 6000 2401 050f
0020160 0c0b 0c03 050c 0504 060d 0709 0508 0505
--损坏后的dmp文件使用strings命令看
[oracle@localhost ~]$ strings /tmp/t_xifenfei.dmp |head -50
#G#G
#G#G
+08:00
BYTE
UNUSED
INTERPRETED
DISABLE:ALL
METRICST
TABLE "T_XIFENFEI"
CREATE TABLE "T_XIFENFEI" ("OWNER" VARCHAR2(128), "OBJECT_NAME" VARCHAR2(128), "SUBOBJECT_NAME" VARCHAR2(128), "OBJECT_ID" NUMBER, "DATA_OBJECT_ID" NUMBER, "OBJECT_TYPE" VARCHAR2(23), "CREATED" DATE, "LAST_DDL_TIME" DATE, "TIMESTAMP" VARCHAR2(19), "STATUS" VARCHAR2(7), "TEMPORARY" VARCHAR2(1), "GENERATED" VARCHAR2(1), "SECONDARY" VARCHAR2(1), "NAMESPACE" NUMBER, "EDITION_NAME" VARCHAR2(128), "SHARING" VARCHAR2(13), "EDITIONABLE" VARCHAR2(1), "ORACLE_MAINTAINED" VARCHAR2(1))  PCTFREE 10 PCTUSED 40 INITRANS 1 MAXTRANS 255 STORAGE(INITIAL 13631488 NEXT 1048576 MINEXTENTS 1 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT) TABLESPACE "USERS" LOGGING NOCOMPRESS
INSERT INTO "T_XIFENFEI" ("OWNER", "OBJECT_NAME", "SUBOBJECT_NAME", "OBJECT_ID", "DATA_OBJECT_ID", "OBJECT_TYPE", "CREATED", "LAST_DDL_TIME", "TIMESTAMP", "STATUS", "TEMPORARY", "GENERATED", "SECONDARY", "NAMESPACE", "EDITION_NAME", "SHARING", "EDITIONABLE", "ORACLE_MAINTAINED") VALUES (:1, :2, :3, :4, :5, :6, :7, :8, :9, :10, :11, :12, :13, :14, :15, :16, :17, :18)
PUBLIC
V$DATAGUARD_CONFIG
SYNONYM
2014-04-22:17:51:05
VALID
METADATA LINK
V_$DATAGUARD_STATS
--imp 导入dmp文件失败
[oracle@localhost ~]$ imp chf/xifenfei@pdb1 file=/tmp/t_xifenfei.dmp full=y
Import: Release 12.1.0.2.0 -      on Sun Apr 27 22:02:40 2014
Copyright (c) 1982, 2014, Oracle and/or its affiliates.  All rights reserved.
Connected to: Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options
IMP-00037: Character set marker unknown
IMP-00000: Import terminated unsuccessfully

这里通过分析可以知道,exp dmp文件虽然损坏了一点,但是通过strings命令看,相关记录依然存在,因此可以通过工具去读exp dmp文件,然后分析得出相关数据

恢复损坏exp dmp文件数据

CPFL> SEARCH TABLE T_XIFENFEI FROM EXPFILE  /tmp/t_xifenfei.dmp
8461: TABLE "T_XIFENFEI"
8480: CREATE TABLE "T_XIFENFEI" ("OWNER" VARCHAR2(128), "OBJECT_NAME" VARCHAR2(128), "SUBOBJECT_NAME" VARCHAR2(128), "OBJECT_ID" NUMBER, "DATA_OBJECT_ID" NUMBER, "OBJECT_TYPE" VARCHAR2(23), "CREATED" DATE, "LAST_DDL_TIME" DATE, "TIMESTAMP" VARCHAR2(19), "STATUS" VARCHAR2(7), "TEMPORARY" VARCHAR2(1), "GENERATED" VARCHAR2(1), "SECONDARY" VARCHAR2(1), "NAMESPACE" NUMBER, "EDITION_NAME" VARCHAR2(128), "SHARING" VARCHAR2(13), "EDITIONABLE" VARCHAR2(1), "ORACLE_MAINTAINED" VARCHAR2(1))  PCTFREE 10 PCTUSED 40 INITRANS 1 MAXTRANS 255 STORAGE(INITIAL 8388608 NEXT 1048576 MINEXTENTS 1 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT) TABLESPACE "USERS" LOGGING NOCOMPRESS
9145: INSERT INTO "T_XIFENFEI" ("OWNER", "OBJECT_NAME", "SUBOBJECT_NAME", "OBJECT_ID", "DATA_OBJECT_ID", "OBJECT_TYPE", "CREATED", "LAST_DDL_TIME", "TIMESTAMP", "STATUS", "TEMPORARY", "GENERATED", "SECONDARY", "NAMESPACE", "EDITION_NAME", "SHARING", "EDITIONABLE", "ORACLE_MAINTAINED") VALUES (:1, :2, :3, :4, :5, :6, :7, :8, :9, :10, :11, :12, :13, :14, :15, :16, :17, :18)
Conventional export
9644: start of table data
12331252: TABLE "T_XIFENFEI"
12331349: ENDTABLE
CPFL> UNLOAD  TABLE "T_XIFENFEI" ("OWNER" VARCHAR2(128), "OBJECT_NAME" VARCHAR2(128), "SUBOBJECT_NAME" VARCHAR2(128), "OBJECT_ID" NUMBER, "DATA_OBJECT_ID" NUMBER, "OBJECT_TYPE" VARCHAR2(23), "CREATED" DATE, "LAST_DDL_TIME" DATE, "TIMESTAMP" VARCHAR2(19), "STATUS" VARCHAR2(7), "TEMPORARY" VARCHAR2(1), "GENERATED" VARCHAR2(1), "SECONDARY" VARCHAR2(1), "NAMESPACE" NUMBER, "EDITION_NAME" VARCHAR2(128), "SHARING" VARCHAR2(13), "EDITIONABLE" VARCHAR2(1), "ORACLE_MAINTAINED" VARCHAR2(1))  expfile  /tmp/t_xifenfei.dmp from 8480 until 12331349
--因为exp dmp文件损坏记录
CPFL: Error: column 1 length 21059 exceeds max bind size 128
0000000000 45415445 20544142 4c452022 545f5849 EATE  TAB LE " T_XI
0000000016 46454e46 45492220 28224f57 4e455222 FENF EI"  ("OW NER"
0000000032 20564152 43484152                    VAR CHAR
8480: column 1 type VARCHAR2 size 21059 failed
8480: row 1 failed
row conversion failure, retrying from offset 8481
CPFL: Error: Zero (illegal) length column number 2
…………
CPFL: Error: Zero (illegal) length column number 1
9644: succesful conversion      1164 bytes skipped due to conversion problems
131877: row 1000 ok
253310: row 2000 ok
…………
12200617: row 90000 ok
Unloaded 90915 rows, end of table marker at 12322835
[oracle@localhost CPFL]$ ls -ltr T_XIFENFEI.*
-rw-r--r-- 1 oracle oinstall 17230747 Apr 27 22:12 T_XIFENFEI.dat
-rw-r--r-- 1 oracle oinstall     1489 Apr 27 22:17 T_XIFENFEI.ctl

导入数据并对比

SQL> create table t_xifenfei_exp as select * from t_xifenfei where 1=0;
Table created.
[oracle@localhost CPFL]$ more T_XIFENFEI.ctl
load data
CHARACTERSET UTF8
infile 'T_XIFENFEI.dat'
insert
into table "T_XIFENFEI_EXP"  ---修改为T_XIFENFEI_EXP表
fields terminated by whitespace
(
  "OWNER"                            CHAR(128) enclosed by X'7C'
 ,"OBJECT_NAME"                      CHAR(128) enclosed by X'7C'
 ,"SUBOBJECT_NAME"                   CHAR(29) enclosed by X'7C'
 ,"OBJECT_ID"                        CHAR(5) enclosed by X'7C'
 ,"DATA_OBJECT_ID"                   CHAR(5) enclosed by X'7C'
 ,"OBJECT_TYPE"                      CHAR(20) enclosed by X'7C'
 ,"CREATED"                          DATE "DD-MON-YYYY AD HH24:MI:SS" enclosed by X'7C'
 ,"LAST_DDL_TIME"                    DATE "DD-MON-YYYY AD HH24:MI:SS" enclosed by X'7C'
 ,"TIMESTAMP"                        CHAR(19) enclosed by X'7C'
 ,"STATUS"                           CHAR(5) enclosed by X'7C'
 ,"TEMPORARY"                        CHAR(1) enclosed by X'7C'
 ,"GENERATED"                        CHAR(1) enclosed by X'7C'
 ,"SECONDARY"                        CHAR(1) enclosed by X'7C'
 ,"NAMESPACE"                        CHAR(2) enclosed by X'7C'
 ,"EDITION_NAME"                     CHAR(1) enclosed by X'7C'
 ,"SHARING"                          CHAR(13) enclosed by X'7C'
 ,"EDITIONABLE"                      CHAR(1) enclosed by X'7C'
 ,"ORACLE_MAINTAINED"                CHAR(1) enclosed by X'7C'
 ,"UNEXP_STATUS"                     FILLER CHAR(3) enclosed by X'7C'
)
[oracle@localhost CPFL]$ sqlldr chf/xifenfei@pdb1 control=T_XIFENFEI.ctl
SQL*Loader: Release 12.1.0.1.0       on Sun Apr 27 22:17:54 2014
Copyright (c) 1982, 2014, Oracle and/or its affiliates.  All rights reserved.
Path used:      Conventional
Commit point reached - logical record count 64
Commit point reached - logical record count 128
Commit point reached - logical record count 192
…………
Commit point reached - logical record count 90887
Commit point reached - logical record count 90915
Table "T_XIFENFEI_EXP":
  90915 Rows successfully loaded.
Check the log file:
  T_XIFENFEI.log
for more information about the load.
[oracle@localhost CPFL]$ sqlplus chf/xifenfei@pdb1
SQL*Plus: Release 12.1.0.2.0 Beta on Sun Apr 27 22:18:08 2014
Copyright (c) 1982, 2014, Oracle.  All rights reserved.
Last Successful login time: Sun Apr 27 2014 22:17:54 +08:00
Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options
SQL> select count(*) from t_xifenfei_exp;
  COUNT(*)
----------
     90915
SQL> select * from t_xifenfei
  2  minus
  3  select * from t_xifenfei_exp;
no rows selected

通过这里可以看出来,在exp dmp文件有部分损坏的情况下,还是可以通过直接读取dmp文件的方式恢复全部或者部分exp dmp文件中内容(具体恢复量取决于dmp文件损坏程度)

如果你在使用这些思路进行恢复遇到突发情况不能自行解决,请联系我们(ORACLE数据库恢复技术支持),将为您提供专业数据库技术支持:
Phone:17813235971    Q Q:107644445    E-Mail:dba@xifenfei.com

ORACLE 12C In-Memory功能性能测试

启用In-Memory功能
数据库版本12.1.0.2及其以上版本,inmemory_size参数设置为合适值

SQL> SELECT * FROM V$VERSION;
BANNER                                                                               CON_ID
-------------------------------------------------------------------------------- ----------
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit                         0
PL/SQL Release 12.1.0.2.0 -                                                               0
CORE    12.1.0.2.0                                                                        0
TNS for Linux: Version 12.1.0.2.0 -                                                       0
NLSRTL Version 12.1.0.2.0 -                                                               0
SQL> SHOW PARAMETER inmemory;
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
inmemory_clause_default              string
inmemory_force                       string      DEFAULT
inmemory_query                       string      ENABLE
inmemory_size                        big integer 200M

创建表
这里可以知道,创建表大小为13631488,但是未使用In-Memory功能

SQL> create table t_xifenfei_in_memory as select * from dba_objects;
Table created.
SQL> SELECT BYTES FROM USER_SEGMENTS WHERE SEGMENT_NAME='T_XIFENFEI_IN_MEMORY';
     BYTES
----------
  13631488
SQL> select TABLE_NAME,INMEMORY_PRIORITY,INMEMORY_DISTRIBUTE,INMEMORY_COMPRESSION from user_tables;
TABLE_NAME                     INMEMORY INMEMORY_DISTRI INMEMORY_COMPRESS
------------------------------ -------- --------------- -----------------
T_XIFENFEI_IN_MEMORY
SQL>  SELECT * FROM V$INMEMORY_AREA;
POOL                       ALLOC_BYTES USED_BYTES POPULATE_STATUS                CON_ID
-------------------------- ----------- ---------- -------------------------- ----------
1MB POOL                     166723584          0 DONE                                3
64KB POOL                     33554432          0 DONE                                3

未使用In-Memory功能测试

SQL> SET AUTOT TRACE
SQL> SELECT * FROM T_XIFENFEI_IN_MEMORY;
90902 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 3598036702
------------------------------------------------------------------------------------------
| Id  | Operation         | Name                 | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |                      | 90902 |     9M|   427   (1)| 00:00:01 |
|   1 |  TABLE ACCESS FULL| T_XIFENFEI_IN_MEMORY | 90902 |     9M|   427   (1)| 00:00:01 |
------------------------------------------------------------------------------------------
Statistics
----------------------------------------------------------
          5  recursive calls
          0  db block gets
       7505  consistent gets
       1527  physical reads
          0  redo size
   12125231  bytes sent via SQL*Net to client
      67212  bytes received via SQL*Net from client
       6062  SQL*Net roundtrips to/from client
          2  sorts (memory)
          0  sorts (disk)
      90902  rows processed

这里可以看到未使用In-Memory功能,数据库查询执行计划使用TABLE ACCESS FULL,consistent gets为7505

使用In-Memory功能测试

SQL>  alter table  T_XIFENFEI_IN_MEMORY inmemory;
Table altered.
SQL> select TABLE_NAME,INMEMORY_PRIORITY,INMEMORY_DISTRIBUTE,INMEMORY_COMPRESSION from user_tables;
TABLE_NAME                     INMEMORY INMEMORY_DISTRI INMEMORY_COMPRESS
------------------------------ -------- --------------- -----------------
T_XIFENFEI_IN_MEMORY           NONE     AUTO DISTRIBUTE FOR QUERY
--因为只是把该表设置了INMEMORY,但是未查询过,所以查询V$INMEMORY_AREA中未使用相关内存
SQL> SELECT * FROM V$INMEMORY_AREA;
POOL                       ALLOC_BYTES USED_BYTES POPULATE_STATUS                CON_ID
-------------------------- ----------- ---------- -------------------------- ----------
1MB POOL                     166723584          0 DONE                                3
64KB POOL                     33554432          0 DONE                                3
--进行一次全表扫描
SQL> SELECT COUNT(*) FROM T_XIFENFEI_IN_MEMORY;
  COUNT(*)
----------
     90902
--再次查看,已经使用了分配的In-Memory中内存
SQL> SELECT * FROM V$INMEMORY_AREA;
POOL                       ALLOC_BYTES USED_BYTES POPULATE_STATUS                CON_ID
-------------------------- ----------- ---------- -------------------------- ----------
1MB POOL                     166723584    4194304 DONE                                3
64KB POOL                     33554432     131072 DONE                                3
SQL> SET AUTOT TRACE
SQL> SELECT * FROM T_XIFENFEI_IN_MEMORY;
90902 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 3598036702
---------------------------------------------------------------------------------------------------
| Id  | Operation                  | Name                 | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |                      | 90902 |     9M|    20  (45)| 00:00:01 |
|   1 |  TABLE ACCESS INMEMORY FULL| T_XIFENFEI_IN_MEMORY | 90902 |     9M|    20  (45)| 00:00:01 |
---------------------------------------------------------------------------------------------------
Statistics
----------------------------------------------------------
          3  recursive calls
          0  db block gets
          4  consistent gets
          0  physical reads
          0  redo size
    4946298  bytes sent via SQL*Net to client
      67212  bytes received via SQL*Net from client
       6062  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
      90902  rows processed

这里我们可以发现,使用了In-Memory功能之后,数据库consistent gets为4,相比未使用In-Memory之前的7505,性能最少提高近2000倍.

ORACLE 12C In-Memory组件初试

根据Oracle官方的宣传12.1.0.2的In-Memory组件实现内存列存储提高Oracle性能而且弥补在列存储中的不足。感谢Lunar的文档支持
12.1.0.2版本

[oracle@localhost ~]$ sqlplus chf/xifenfei@pdb1
SQL*Plus: Release 12.1.0.2.0 Beta on Thu Apr 24 21:39:43 2014
Copyright (c) 1982, 2014, Oracle.  All rights reserved.
Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options
SQL> select * from v$version;
BANNER                                                                               CON_ID
-------------------------------------------------------------------------------- ----------
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit                         0
PL/SQL Release 12.1.0.2.0 -                                                               0
CORE    12.1.0.2.0                                                                        0
TNS for Linux: Version 12.1.0.2.0 -                                                       0
NLSRTL Version 12.1.0.2.0 -                                                               0

关于In-Memory组件

SQL>  select parameter,value from v$option where parameter like 'In-Memory%';
PARAMETER                                                        VALUE
---------------------------------------------------------------- -------------
In-Memory Column Store                                           TRUE
In-Memory Aggregation                                            TRUE

关于inmemory参数

SQL> select NAME,value,DESCRIPTION from v$parameter where NAME like 'inmemory%';
NAME                 VALUE
-------------------- --------------------
DESCRIPTION
---------------------------------------------------
inmemory_size        0
size in bytes of in-memory area
inmemory_clause_defa
ult
Default in-memory clause for new tables
inmemory_force       DEFAULT
Force tables to be in-memory or not
inmemory_query       ENABLE
Specifies whether in-memory queries are allowed

启用In-Memory功能

SQL> show parameter inmemory;
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
inmemory_clause_default              string
inmemory_force                       string      DEFAULT
inmemory_query                       string      ENABLE
inmemory_size                        big integer 0
SQL> alter system set inmemory_size=200M scope=spfile;
System altered.
SQL> shutdown immediate
Pluggable Database closed.
SQL> startup
Pluggable Database opened.
SQL> show parameter inmemory;
NAME                                 TYPE        VALUE
------------------------------------ ----------- ----------------------
inmemory_clause_default              string
inmemory_force                       string      DEFAULT
inmemory_query                       string      ENABLE
inmemory_size                        big integer 200M

创建测试表

SQL> create table t_xifenfei_in_memory as select * from dba_objects;
Table created.
SQL> select TABLE_NAME,INMEMORY_PRIORITY,INMEMORY_DISTRIBUTE,INMEMORY_COMPRESSION from user_tables;
TABLE_NAME                     INMEMORY INMEMORY_DISTRI INMEMORY_COMPRESS
------------------------------ -------- --------------- -----------------
T_XIFENFEI_IN_MEMORY
SQL> alter table  T_XIFENFEI_IN_MEMORY inmemory;
Table altered.
SQL> select TABLE_NAME,INMEMORY_PRIORITY,INMEMORY_DISTRIBUTE,INMEMORY_COMPRESSION from user_tables;
TABLE_NAME                INMEMORY INMEMORY_DISTRI INMEMORY_COMPRESS
------------------------- -------- --------------- -----------------
T_XIFENFEI_IN_MEMORY      NONE     AUTO DISTRIBUTE FOR QUERY

asm disk header 彻底损坏恢复

在asm 磁盘组不能mount的情况下,如果是磁盘头的少数部分损坏,或者是asm disk header存在,可以通过kfed修复,或者使用备份的磁盘头信息去恢复从而实现磁盘组mount来恢复数据库.如果没有备份也无法修复可以尝试使用amdu,dul来实现对不能mount的磁盘组进行恢复.在极端情况下(比如磁盘组完全丢失),amdu/dul都无论为力的情况下,可以考虑使用扫描磁盘找出来datafile 的方法求救数据的最后稻草.本实验大概的模拟了asm disk 前10M完全损坏的情况下数据库恢复
测试准备
创建新表空间,创建T_XIFENFEI测试表

SQL> create tablespace xifenfei datafile '+XIFENFEI' SIZE 50m;
Tablespace created.
SQL> CREATE TABLE T_XIFENFEI TABLESPACE XIFENFEI
  2  AS SELECT * FROM DBA_OBJECTS;
Table created.
SQL> SELECT COUNT(*) FROM T_XIFENFEI;
  COUNT(*)
----------
     50031
SQL> select ts#,rfile#,bytes/1024/1024,blocks,name from v$datafile;
       TS#     RFILE# BYTES/1024/1024     BLOCKS NAME
---------- ---------- --------------- ---------- --------------------------------------------------
         0          1             480      61440 +XIFENFEI/asm10g/datafile/system.256.845260203
         1          2              25       3200 +XIFENFEI/asm10g/datafile/undotbs1.258.845260205
         2          3             250      32000 +XIFENFEI/asm10g/datafile/sysaux.257.845260203
         4          4               5        640 +XIFENFEI/asm10g/datafile/users.259.845260205
         6          5              50       6400 +XIFENFEI/asm10g/datafile/xifenfei.266.845262139
SQL> select GROUP_NUMBER,DISK_NUMBER,STATE,TOTAL_MB,FREE_MB,NAME,path from  v$asm_disk;
GROUP_NUMBER DISK_NUMBER STATE      TOTAL_MB    FREE_MB NAME                 PATH
------------ ----------- -------- ---------- ---------- -------------------- ------------------
           1           0 NORMAL         2048          0 XIFENFEI_0000        /dev/raw/raw1
           1           1 NORMAL          784          0 XIFENFEI_0001        /dev/raw/raw2
           1           2 NORMAL         7059          0 XIFENFEI_0002        /dev/raw/raw3
--关闭数据库
SQL> shutdown immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.
--关闭ASM
SQL> shutdown immediate
ASM diskgroups dismounted
ASM instance shutdown

查看裸设备对应磁盘

[oracle@xifenfei dul]$ more /etc/sysconfig/rawdevices
/dev/raw/raw1   /dev/sdc
/dev/raw/raw2   /dev/sdd1
/dev/raw/raw3   /dev/sdd2

dd磁盘头
dd asm disk 前面10M,彻底破坏asm disk

[oracle@xifenfei ~]$ dd if=/dev/zero of=/dev/raw/raw1 bs=1M count=10 conv=notrunc
10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 0.175424 seconds, 59.8 MB/s
[oracle@xifenfei ~]$ dd if=/dev/zero of=/dev/raw/raw2 bs=1M count=10 conv=notrunc
10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 0.11584 seconds, 90.5 MB/s
[oracle@xifenfei ~]$ dd if=/dev/zero of=/dev/raw/raw3 bs=1M count=10 conv=notrunc
10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 0.353435 seconds, 29.7 MB/s

kfed查看磁盘
确定所有asm disk header完全被破坏

[oracle@xifenfei dul]$ kfed read /dev/raw/raw1
kfbh.endian:                          0 ; 0x000: 0x00
kfbh.hard:                            0 ; 0x001: 0x00
kfbh.type:                            0 ; 0x002: KFBTYP_INVALID
kfbh.datfmt:                          0 ; 0x003: 0x00
kfbh.block.blk:                       0 ; 0x004: T=0 NUMB=0x0
kfbh.block.obj:                       0 ; 0x008: TYPE=0x0 NUMB=0x0
kfbh.check:                           0 ; 0x00c: 0x00000000
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
[oracle@xifenfei dul]$ kfed read /dev/raw/raw2
kfbh.endian:                          0 ; 0x000: 0x00
kfbh.hard:                            0 ; 0x001: 0x00
kfbh.type:                            0 ; 0x002: KFBTYP_INVALID
kfbh.datfmt:                          0 ; 0x003: 0x00
kfbh.block.blk:                       0 ; 0x004: T=0 NUMB=0x0
kfbh.block.obj:                       0 ; 0x008: TYPE=0x0 NUMB=0x0
kfbh.check:                           0 ; 0x00c: 0x00000000
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
[oracle@xifenfei dul]$ kfed read /dev/raw/raw3
kfbh.endian:                          0 ; 0x000: 0x00
kfbh.hard:                            0 ; 0x001: 0x00
kfbh.type:                            0 ; 0x002: KFBTYP_INVALID
kfbh.datfmt:                          0 ; 0x003: 0x00
kfbh.block.blk:                       0 ; 0x004: T=0 NUMB=0x0
kfbh.block.obj:                       0 ; 0x008: TYPE=0x0 NUMB=0x0
kfbh.check:                           0 ; 0x00c: 0x00000000
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000

amdu查看asm 磁盘

[oracle@xifenfei ~]$ amdu -diskstring '/dev/raw/raw*'
amdu_2014_04_18_23_17_17/
[oracle@xifenfei ~]$ cd amdu_2014_04_18_23_17_17
[oracle@xifenfei amdu_2014_04_18_23_17_17]$ ls
report.txt
[oracle@xifenfei amdu_2014_04_18_23_17_17]$ more report.txt
-*-amdu-*-
…………
--------------------------------- Operations ---------------------------------
------------------------------- Disk Selection -------------------------------
 -diskstring '/dev/raw/raw*'
------------------------------ Reading Control -------------------------------
------------------------------- Output Control -------------------------------
********************************* DISCOVERY **********************************
----------------------------- DISK REPORT N0001 ------------------------------
                Disk Path: /dev/raw/raw1
           Unique Disk ID:
               Disk Label:
     Physical Sector Size: 512 bytes
                Disk Size: 65536 megabytes
** NOT A VALID ASM DISK HEADER. BAD VALUE IN FIELD blksize_kfdhdb **
----------------------------- DISK REPORT N0002 ------------------------------
                Disk Path: /dev/raw/raw2
           Unique Disk ID:
               Disk Label:
     Physical Sector Size: 512 bytes
                Disk Size: 65536 megabytes
** NOT A VALID ASM DISK HEADER. BAD VALUE IN FIELD blksize_kfdhdb **
----------------------------- DISK REPORT N0003 ------------------------------
                Disk Path: /dev/raw/raw3
           Unique Disk ID:
               Disk Label:
     Physical Sector Size: 512 bytes
                Disk Size: 65536 megabytes
** NOT A VALID ASM DISK HEADER. BAD VALUE IN FIELD blksize_kfdhdb **
******************************* END OF REPORT ********************************

通过这里证明,当asm disk header 损坏严重之时,amdu无法识别,更加无法恢复相关数据库

dul查看完全损坏asm disk header
测试在asm disk header完全损坏情况下,dul是否还能够实现asm磁盘组中抽取数据,同理amdu也无法正常工作.

[oracle@xifenfei dul]$ ./dul
Data UnLoader: 10.2.0.5.28 - Internal Only - on Sat Apr 19 04:02:02 2014
with 64-bit io functions
Copyright (c) 1994 2014 Bernard van Duijnen All rights reserved.
 Strictly Oracle Internal Use Only
DUL: Warning: block 0 is not a disk header block
DUL: Error: Block is not in use
DUL: Error: Block type mismatch ( seen 0 expect 1) when parsing block 0 of disk /dev/raw/raw1
DUL: Warning: block 0 is not a disk header block
DUL: Error: Block is not in use
DUL: Error: Block type mismatch ( seen 0 expect 1) when parsing block 0 of disk /dev/raw/raw2
DUL: Warning: block 0 is not a disk header block
DUL: Error: Block is not in use
DUL: Error: Block type mismatch ( seen 0 expect 1) when parsing block 0 of disk /dev/raw/raw3

这里可以看出来,当asm disk header完全异常,dul也无法识别出来asm磁盘组(该情况下dul无法正常操作)

通过工具扫描磁盘抽取数据块

CPFL> scan disk  /dev/raw/raw1
Scanning  disk /dev/raw/raw1, at 2014-04-19 04:05:11
Completed  disk /dev/raw/raw1, at 2014-04-19 04:05:56
CPFL> scan  disk  /dev/raw/raw2
Scanning  disk /dev/raw/raw2, at 2014-04-19 04:05:56
Completed  disk /dev/raw/raw2, at 2014-04-19 04:06:15
CPFL> scan  disk  /dev/raw/raw3
Scanning  disk /dev/raw/raw3, at 2014-04-19 04:06:15
Completed  disk /dev/raw/raw3, at 2014-04-19 04:07:44
CPFL> list datafiles
 Tablespace: SYSTEM    File:    1  Blocks:      61440
 Tablespace: UNDOTBS1  File:    2  Blocks:       3200
 Tablespace: SYSAUX    File:    3  Blocks:      32000
 Tablespace: USERS     File:    4  Blocks:        640
 Tablespace: XIFENFEI  File:    5  Blocks:       6400
CPFL> copy datafile 1 to /u01/oracle/oradata/datafile/1.dbf
copy datafile start: 2014-04-19 04:10:35
copy datafile 1 have blocks 61440
copy datafile completed: 2014-04-19 04:11:18
CPFL> copy datafile 2  to /u01/oracle/oradata/datafile/2.dbf
copy datafile start: 2014-04-19 04:11:52
copy datafile 2 have blocks 3200
copy datafile completed: 2014-04-19 04:11:54
CPFL>  copy datafile 3  to /u01/oracle/oradata/datafile/3.dbf
copy datafile start: 2014-04-19 04:12:03
copy datafile 3 have blocks 32000
copy datafile completed: 2014-04-19 04:12:27
CPFL>  copy datafile 4  to /u01/oracle/oradata/datafile/4.dbf
copy datafile start: 2014-04-19 04:13:07
copy datafile 4 have blocks 640
copy datafile completed: 2014-04-19 04:13:08
CPFL> copy datafile 5 to /u01/oracle/oradata/datafile/5.dbf
copy datafile start: 2014-04-19 04:13:18
copy datafile 5 have blocks 6400
copy datafile completed: 2014-04-19 04:13:19

查看使用工具抽取数据文件

[oracle@xifenfei datafile]$ ls -l
total 830320
-rw-r--r-- 1 oracle oinstall 503324672 Apr 19 04:34 1.dbf
-rw-r--r-- 1 oracle oinstall  26222592 Apr 19 04:34 2.dbf
-rw-r--r-- 1 oracle oinstall 262152192 Apr 19 04:34 3.dbf
-rw-r--r-- 1 oracle oinstall   5251072 Apr 19 04:34 4.dbf
-rw-r--r-- 1 oracle oinstall  52436992 Apr 19 04:34 5.dbf

dul验证抽取文件

[oracle@xifenfei dul]$ ./dul
Data UnLoader: 10.2.0.5.28 - Internal Only - on Sat Apr 19 06:56:09 2014
with 64-bit io functions
Copyright (c) 1994 2014 Bernard van Duijnen All rights reserved.
 Strictly Oracle Internal Use Only
DUL: Warning: Recreating file "dul.log"
Found db_id = 181793355
Found db_name = ASM10G
DUL> show datafiles;
ts# rf# start   blocks offs open  err file name
  0   1     0    61440    0    1    0 /u01/oracle/oradata/datafile/1.dbf
  1   2     0     3200    0    1    0 /u01/oracle/oradata/datafile/2.dbf
  2   3     0    32000    0    1    0 /u01/oracle/oradata/datafile/3.dbf
  4   4     0      640    0    1    0 /u01/oracle/oradata/datafile/4.dbf
  6   5     0     6400    0    1    0 /u01/oracle/oradata/datafile/5.dbf
DUL> bootstrap;
Probing file = 1, block = 377
. unloading table                BOOTSTRAP$
DUL: Warning: block number is non zero but marked deferred trying to process it anyhow
      57 rows unloaded
DUL: Warning: Dictionary cache DC_BOOTSTRAP is empty
Reading BOOTSTRAP.dat 57 entries loaded
Parsing Bootstrap$ contents
DUL: Warning: Recreating file "dict.ddl"
Generating dict.ddl for version 10
 OBJ$: segobjno 18, file 1 block 121
 TAB$: segobjno 2, tabno 1, file 1  block 25
 COL$: segobjno 2, tabno 5, file 1  block 25
 USER$: segobjno 10, tabno 1, file 1  block 89
Running generated file "@dict.ddl" to unload the dictionary tables
. unloading table                      OBJ$   51171 rows unloaded
. unloading table                      TAB$    1576 rows unloaded
. unloading table                      COL$   55264 rows unloaded
. unloading table                     USER$      59 rows unloaded
Reading USER.dat 59 entries loaded
Reading OBJ.dat 51171 entries loaded and sorted 51171 entries
Reading TAB.dat 1576 entries loaded
Reading COL.dat 55264 entries loaded and sorted 55264 entries
Reading BOOTSTRAP.dat 57 entries loaded
DUL: Warning: Recreating file "dict.ddl"
Generating dict.ddl for version 10
 OBJ$: segobjno 18, file 1 block 121
 TAB$: segobjno 2, tabno 1, file 1  block 25
 COL$: segobjno 2, tabno 5, file 1  block 25
 USER$: segobjno 10, tabno 1, file 1  block 89
 TABPART$: segobjno 266, file 1 block 2121
 INDPART$: segobjno 271, file 1 block 2161
 TABCOMPART$: segobjno 288, file 1 block 2297
 INDCOMPART$: segobjno 293, file 1 block 2345
 TABSUBPART$: segobjno 278, file 1 block 2217
 INDSUBPART$: segobjno 283, file 1 block 2257
 IND$: segobjno 2, tabno 3, file 1  block 25
 ICOL$: segobjno 2, tabno 4, file 1  block 25
 LOB$: segobjno 2, tabno 6, file 1  block 25
 COLTYPE$: segobjno 2, tabno 7, file 1  block 25
 TYPE$: segobjno 181, tabno 1, file 1  block 1297
 COLLECTION$: segobjno 181, tabno 2, file 1  block 1297
 ATTRIBUTE$: segobjno 181, tabno 3, file 1  block 1297
 LOBFRAG$: segobjno 299, file 1 block 2393
 LOBCOMPPART$: segobjno 302, file 1 block 2425
 UNDO$: segobjno 15, file 1 block 105
 TS$: segobjno 6, tabno 2, file 1  block 57
 PROPS$: segobjno 96, file 1 block 721
Running generated file "@dict.ddl" to unload the dictionary tables
. unloading table                      OBJ$
DUL: Warning: Recreating file "OBJ.ctl"
   51171 rows unloaded
. unloading table                      TAB$
DUL: Warning: Recreating file "TAB.ctl"
    1576 rows unloaded
. unloading table                      COL$
DUL: Warning: Recreating file "COL.ctl"
   55264 rows unloaded
. unloading table                     USER$
DUL: Warning: Recreating file "USER.ctl"
      59 rows unloaded
. unloading table                  TABPART$      72 rows unloaded
. unloading table                  INDPART$      80 rows unloaded
. unloading table               TABCOMPART$       0 rows unloaded
. unloading table               INDCOMPART$       0 rows unloaded
. unloading table               TABSUBPART$       0 rows unloaded
. unloading table               INDSUBPART$       0 rows unloaded
. unloading table                      IND$    2231 rows unloaded
. unloading table                     ICOL$    3650 rows unloaded
. unloading table                      LOB$     530 rows unloaded
. unloading table                  COLTYPE$    1701 rows unloaded
. unloading table                     TYPE$    1945 rows unloaded
. unloading table               COLLECTION$     555 rows unloaded
. unloading table                ATTRIBUTE$    7275 rows unloaded
. unloading table                  LOBFRAG$       1 row  unloaded
. unloading table              LOBCOMPPART$       0 rows unloaded
. unloading table                     UNDO$      21 rows unloaded
. unloading table                       TS$       7 rows unloaded
. unloading table                    PROPS$      28 rows unloaded
Reading USER.dat 59 entries loaded
Reading OBJ.dat 51171 entries loaded and sorted 51171 entries
Reading TAB.dat 1576 entries loaded
Reading COL.dat 55264 entries loaded and sorted 55264 entries
Reading TABPART.dat 72 entries loaded and sorted 72 entries
Reading TABCOMPART.dat 0 entries loaded and sorted 0 entries
Reading TABSUBPART.dat 0 entries loaded and sorted 0 entries
Reading INDPART.dat 80 entries loaded and sorted 80 entries
Reading INDCOMPART.dat 0 entries loaded and sorted 0 entries
Reading INDSUBPART.dat 0 entries loaded and sorted 0 entries
Reading IND.dat 2231 entries loaded
Reading LOB.dat 530 entries loaded
Reading ICOL.dat 3650 entries loaded
Reading COLTYPE.dat 1701 entries loaded
Reading TYPE.dat 1945 entries loaded
Reading ATTRIBUTE.dat 7275 entries loaded
Reading COLLECTION.dat 555 entries loaded
Reading BOOTSTRAP.dat 57 entries loaded
Reading LOBFRAG.dat 1 entries loaded and sorted 1 entries
Reading LOBCOMPPART.dat 0 entries loaded and sorted 0 entries
Reading UNDO.dat 21 entries loaded
Reading TS.dat 7 entries loaded
Reading PROPS.dat 28 entries loaded
Database character set is ZHS16GBK
Database national character set is AL16UTF16
DUL> unload table sys.t_xifenfei;
. unloading table                T_XIFENFEI   50031 rows unloaded

通过这里可以发现,我们创建测试数据为50031条,dul读取抽取出来数据文件中对应表数据条数也为50031条;证明:在asm disk header完全损坏情况下,amdu,dul无法直接恢复asm里面数据库,但是可以通过工具扫描数据文件,找出来磁盘中的datafile block实现完整恢复数据[只要你的asm中的数据没有覆盖,都可以通过该方法恢复]

如果你在使用这些思路进行恢复遇到突发情况不能自行解决,请联系我们(ORACLE数据库恢复技术支持),将为您提供专业数据库技术支持:
Phone:17813235971    Q Q:107644445    E-Mail:dba@xifenfei.com

通过修改控制文件scn推进数据库scn

在数据库遇到ora-600[2662],scn不一致(又没有日志)的时候,我们首先想到的就是去推进数据库的scn,让数据库能够open起来,抢救其中的数据,但是由于各种乱用的情况,oracle scn的pach出来后(11.2.0.4,12.0.1.0默认就屏蔽),屏蔽了以前大部分传统的推进scn的方法(adjust_scn, _minimum_giga_scn),现在能够推进scn的有oradebug,bbed,修改控制文件.本文就列举通过ue修改控制文件scn来推进数据库scn的方法.
数据库当前scn

idle> select   checkpoint_change# from v$database;
CHECKPOINT_CHANGE#
------------------
         271743118
idle> shutdown abort
ORACLE 例程已经关闭。

分析控制文件中scn
ctl-scn


这里我们可以看到加粗部分为数据库scn

SQL>select to_number('10327a59','xxxxxxxxx') from dual;
TO_NUMBER('10327A59','XXXXXXXXX')
---------------------------------
                        271743577

这里的scn值和在数据库中查询的值有小差别,因为查询时间点和我完全关闭数据库有个时间差,而这个时间差有scn变化.细红框部分为控制文件对块的验证信息

修改控制文件scn和验证信息
验证信息修改为全部0,scn信息你可以根据你的需求去修改,这里把数据库的scn修改为57253932971026,按照数据库的原理,启动后的scn应该稍微大于该scn值.
ctl-scn-new


SQL>select to_number('341278563412','xxxxxxxxxxxxxxxxx') from dual;
TO_NUMBER('341278563412','XXXXXXXXXXXXXXXXX')
---------------------------------------------
                               57253932971026

启动数据库

idle> startup mount
ORACLE 例程已经启动。
Total System Global Area  400846848 bytes
Fixed Size                  2440024 bytes
Variable Size             289408168 bytes
Database Buffers          100663296 bytes
Redo Buffers                8335360 bytes
数据库装载完毕。
idle> recover database;
完成介质恢复。
idle> alter database open;
数据库已更改。
idle> select   checkpoint_change# from v$database;
CHECKPOINT_CHANGE#
------------------
    57253932991028

数据库启动后查询scn为57253932991028(数据库当前scn)果然微大于57253932971026(修改控制文件scn),证明我们通过修改控制文件scn,实现数据库scn推近完全OK.不实验风险较大,请勿在生产环境上测试,负载后果自负

ORACLE 12C ORA-07445[ktuHistRecUsegCrtMain()+1173]恢复

在ORACLE 12C的使用中,遇到ORA-07445: 出现异常错误: 核心转储 [ktuHistRecUsegCrtMain()+1173]的错误,通过10046分析是在查询cdef$表的时候异常,导致数据库无法正常open,通过recover database,然后open成功,怀疑是oracle 12C bug,但是mos中未查询到相关记录,后续不知道是否会出现相关文档说明该问题.
数据库版本

BANNER                                                                               CON_ID
-------------------------------------------------------------------------------- ----------
Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production              0
PL/SQL Release 12.1.0.1.0 - Production                                                    0
CORE    12.1.0.1.0      Production                                                        0
TNS for 64-bit Windows: Version 12.1.0.1.0 - Production                                   0
NLSRTL Version 12.1.0.1.0 - Production                                                    0

oracle 12c启动报错

idle> startup
ORACLE 例程已经启动。
Total System Global Area  400846848 bytes
Fixed Size                  2440024 bytes
Variable Size             281019560 bytes
Database Buffers          109051904 bytes
Redo Buffers                8335360 bytes
数据库装载完毕。
ORA-03113: 通信通道的文件结尾
进程 ID: 5972
会话 ID: 242 序列号: 3

alert日志报ORA-07445: 出现异常错误: 核心转储 [ktuHistRecUsegCrtMain()+1173]错

Sun Mar 30 22:35:22 2014
SMON: enabling cache recovery
Sun Mar 30 22:35:23 2014
Exception [type: ACCESS_VIOLATION, UNABLE_TO_READ] [ADDR:0x6060] [PC:0x140808585, ktuHistRecUsegCrtMain()+1173]
Errors in file E:\APP\XIFENFEI\diag\rdbms\xifenfei\xifenfei\trace\xifenfei_ora_5192.trc  (incident=45790):
ORA-07445: 出现异常错误: 核心转储 [ktuHistRecUsegCrtMain()+1173] [ACCESS_VIOLATION] [ADDR:0x6060] [PC:0x140808585] [UNABLE_TO_READ] []
Incident details in: E:\APP\XIFENFEI\diag\rdbms\xifenfei\xifenfei\incident\incdir_45790\xifenfei_ora_5192_i45790.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Sun Mar 30 22:35:28 2014
Instance Critical Process (pid: 22, ospid: 5192) died unexpectedly
PMON (ospid: 8420): terminating the instance due to error 12752

恢复数据库

idle> recover database;
完成介质恢复。
idle> alter database open;
数据库已更改。

ORACLE 12C redo异常恢复测试—打上patch恢复完全

在上一篇中,我们删除redo,然后通过非常规方法使得cdb open成功,但是pdb无法正常mount,通过查询mos确定是bug 16784143,具体见:ORACLE 12C redo异常恢复测试—部分pdb未正常open,通过给数据库打上12.1.0.3的patch之后,发现已经损坏的数据库的pdb无法正常open,但是在损坏之前如果是打上补丁的数据库其pdb可以正常open
patch信息

C:\Users\XIFENFEI>E:\oracle\product\11.2.0\dbhome_1\OPatch\opatch lspatches
17977915;WINDOWS DB BUNDLE PATCH 12.1.0.1.3 (64bit): (17977915)

session 1

XIFENFEI_CDB$ROOT@SYS> show con_name;
CON_NAME
------------------------------
CDB$ROOT
XIFENFEI_CDB$ROOT@SYS> create table t_xifenfei as select * from dba_users;
表已创建。
XIFENFEI_CDB$ROOT@SYS> delete from t_xifenfei;
已删除 35 行。
XIFENFEI_CDB$ROOT@SYS>

session 2

XIFENFEI_CDB$ROOT@SYS> show con_name;
CON_NAME
------------------------------
PDB1
XIFENFEI_CDB$ROOT@SYS> create table t_xifenfei as select * from dba_users;
表已创建。
XIFENFEI_CDB$ROOT@SYS> delete from t_xifenfei;
已删除 36 行。
XIFENFEI_CDB$ROOT@SYS>

session 3

XIFENFEI_CDB$ROOT@SYS> alter session set container=pdb2;
会话已更改。
XIFENFEI_CDB$ROOT@SYS> alter database open;
数据库已更改。
XIFENFEI_CDB$ROOT@SYS> show con_name;
CON_NAME
------------------------------
PDB2
XIFENFEI_CDB$ROOT@SYS>

session 4

XIFENFEI_CDB$ROOT@SYS> shutdown abort;
ORACLE 例程已经关闭。

删除所有redo,并启动数据库

C:\Users\XIFENFEI>sqlplus / as sysdba
SQL*Plus: Release 12.1.0.1.0 Production on 星期日 3月 30 21:07:55 2014
Copyright (c) 1982, 2013, Oracle.  All rights reserved.
已连接到空闲例程。
idle> startup mount;
ORACLE 例程已经启动。
Total System Global Area  400846848 bytes
Fixed Size                  2440024 bytes
Variable Size             289408168 bytes
Database Buffers          100663296 bytes
Redo Buffers                8335360 bytes
数据库装载完毕。
idle> alter database open;
alter database open
*
第 1 行出现错误:
ORA-00313: 无法打开日志组 3 (用于线程 1) 的成员
ORA-00312: 联机日志 3 线程 1: 'E:\APP\XIFENFEI\ORADATA\XIFENFEI\REDO03.LOG'
ORA-27041: 无法打开文件
OSD-04002: 无法打开文件
O/S-Error: (OS 2) 系统找不到指定的文件。
尝试恢复数据库

idle> select status,group# from v$log;
STATUS               GROUP#
---------------- ----------
INACTIVE                  1
CURRENT                   3
INACTIVE                  2
idle> recover database until cancel;
ORA-00279: 更改 2821739 (在 03/30/2014 20:58:39 生成) 对于线程 1 是必需的
ORA-00289: 建议:
E:\APP\XIFENFEI\FAST_RECOVERY_AREA\XIFENFEI\ARCHIVELOG\2014_03_30\O1_MF_1_36_%U_
.ARC
ORA-00280: 更改 2821739 (用于线程 1) 在序列 #36 中
指定日志: {<RET>=suggested | filename | AUTO | CANCEL}
auto
ORA-00308: 无法打开归档日志
'E:\APP\XIFENFEI\FAST_RECOVERY_AREA\XIFENFEI\ARCHIVELOG\2014_03_30\O1_MF_1_36_%U
_.ARC'
ORA-27041: 无法打开文件
OSD-04002: 无法打开文件
O/S-Error: (OS 2) 系统找不到指定的文件。
ORA-00308: 无法打开归档日志
'E:\APP\XIFENFEI\FAST_RECOVERY_AREA\XIFENFEI\ARCHIVELOG\2014_03_30\O1_MF_1_36_%U
_.ARC'
ORA-27041: 无法打开文件
OSD-04002: 无法打开文件
O/S-Error: (OS 2) 系统找不到指定的文件。
ORA-01547: 警告: RECOVER 成功但 OPEN RESETLOGS 将出现如下错误
ORA-01194: 文件 1 需要更多的恢复来保持一致性
ORA-01110: 数据文件 1: 'E:\APP\XIFENFEI\ORADATA\XIFENFEI\SYSTEM01.DBF'
idle> alter database open resetlogs;
alter database open resetlogs
*
第 1 行出现错误:
ORA-01194: 文件 1 需要更多的恢复来保持一致性
ORA-01110: 数据文件 1: 'E:\APP\XIFENFEI\ORADATA\XIFENFEI\SYSTEM01.DBF'
idle> alter system set "_allow_resetlogs_corruption"=true scope=spfile;
系统已更改。
idle> shutdown immediate
ORA-01109: 数据库未打开
已经卸载数据库。
ORACLE 例程已经关闭。
idle> startup mount;
ORACLE 例程已经启动。
Total System Global Area  400846848 bytes
Fixed Size                  2440024 bytes
Variable Size             289408168 bytes
Database Buffers          100663296 bytes
Redo Buffers                8335360 bytes
数据库装载完毕。
idle> recover database until cancel;
ORA-00279: 更改 2821739 (在 03/30/2014 20:58:39 生成) 对于线程 1 是必需的
ORA-00289: 建议:
E:\APP\XIFENFEI\FAST_RECOVERY_AREA\XIFENFEI\ARCHIVELOG\2014_03_30\O1_MF_1_36_%U_
.ARC
ORA-00280: 更改 2821739 (用于线程 1) 在序列 #36 中
指定日志: {<RET>=suggested | filename | AUTO | CANCEL}
cancel
ORA-01547: 警告: RECOVER 成功但 OPEN RESETLOGS 将出现如下错误
ORA-01194: 文件 1 需要更多的恢复来保持一致性
ORA-01110: 数据文件 1: 'E:\APP\XIFENFEI\ORADATA\XIFENFEI\SYSTEM01.DBF'
ORA-01112: 未启动介质恢复
idle> alter database open resetlogs;
alter database open resetlogs
*
第 1 行出现错误:
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00600: internal error code, arguments: [2662], [0], [2821746], [0],
[2822649], [4194545], [], [], [], [], [], []
进程 ID: 6880
会话 ID: 242 序列号: 3
idle> startup mount;
SP2-0642: SQL*Plus 内部错误状态 2133, 上下文 3114:0:0
继续执行将不安全
ORA-03114: 未连接到 ORACLE
--使用bbed解决dul问题,出现新问题
idle> alter database open;
alter database open
*
第 1 行出现错误:
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [4194], [61], [4], [], [], [], [],
[], [], [], [], []
进程 ID: 5932
会话 ID: 242 序列号: 3
idle> ALTER SYSTEM SET UNDO_MANAGEMENT='MANAUL' SCOPE=SPFILE;
系统已更改。
idle> SHUTDOWN IMMEDIATE;
ORA-01109: 数据库未打开
已经卸载数据库。
ORACLE 例程已经关闭。
idle> startup mount;
ORACLE 例程已经启动。
Total System Global Area  400846848 bytes
Fixed Size                  2440024 bytes
Variable Size             281019560 bytes
Database Buffers          109051904 bytes
Redo Buffers                8335360 bytes
数据库装载完毕。
idle> alter database open;
数据库已更改。
[/sql
<strong>open pdb</strong>

idle> alter session set container=pdb1;
会话已更改。
idle> alter database open;
数据库已更改。
idle> alter session set container=pdb2;
会话已更改。
idle> alter database open;
数据库已更改。
idle> alter session set container=pdb1;
会话已更改。

ORACLE 12C的bug估计不会太少,急于上12C的朋友,强烈建议打上最先patch,少踩一些雷

ORACLE 12C redo异常恢复测试—部分pdb未正常open

为了验证当前redo丢失的情况下ORACLE 12C CDB数据库恢复的情况,做了一个小实验,三个会话,分别操作为在root pdb中执行一个delete 不提交;另外一个会话在user pdb中delete记录不提交;最后一个会话中直接abort数据库,然后进行数据库恢复,验证数据库是否可以都正常open(所有pdb)
会话1(root pdb中操作)

CDB_CDB$ROOT@SYS> show pdbs;
    CON_ID CON_NAME                       OPEN MODE  RESTRICTED
---------- ------------------------------ ---------- ----------
         2 PDB$SEED                       READ ONLY  NO
         3 PDB                            READ WRITE NO
CDB_CDB$ROOT@SYS> show con_name
CON_NAME
------------------------------
CDB$ROOT
CDB_CDB$ROOT@SYS> create table t_xifenfei as select * from dba_objects;
表已创建。
CDB_CDB$ROOT@SYS> delete from t_xifenfei;
已删除 91032 行。
CDB_CDB$ROOT@SYS>

会话2(user pdb中操作)

CDB_CDB$ROOT@SYS> show con_name
CON_NAME
------------------------------
PDB
CDB_CDB$ROOT@SYS> drop table t_xifenfei purge;
表已删除。
CDB_CDB$ROOT@SYS> create table t_xifenfei as select * from dba_objects;
表已创建。
CDB_CDB$ROOT@SYS> delete from t_xifenfei;
已删除 91144 行。

会话3(直接abort数据库)

CDB_CDB$ROOT@SYS> shutdown abort;
ORACLE 例程已经关闭。

删除数据库联机日志

E:\app\XIFENFEI\oradata\cdb>dir redo*.log
 驱动器 E 中的卷是 本地磁盘
 卷的序列号是 000C-3B41
 E:\app\XIFENFEI\oradata\cdb 的目录
2013-08-07  01:41        52,429,312 REDO01.LOG
2013-08-07  01:40        52,429,312 REDO02.LOG
2013-08-07  01:40        52,429,312 REDO03.LOG
2014-03-20  22:47        52,432,896 REDO04.LOG
2014-03-20  22:47        52,432,896 REDO05.LOG
2014-03-20  22:47        52,432,896 REDO06.LOG
               6 个文件    314,586,624 字节
               0 个目录 21,359,374,336 可用字节
E:\app\XIFENFEI\oradata\cdb>del redo*.log
E:\app\XIFENFEI\oradata\cdb>dir redo*.log
 驱动器 E 中的卷是 本地磁盘
 卷的序列号是 000C-3B41
 E:\app\XIFENFEI\oradata\cdb 的目录
找不到文件

启动数据库报ORA-00313错误

idle> startup
ORACLE 例程已经启动。
Total System Global Area  521936896 bytes
Fixed Size                  2404552 bytes
Variable Size             205524792 bytes
Database Buffers          306184192 bytes
Redo Buffers                7823360 bytes
数据库装载完毕。
ORA-00313: 无法打开日志组 6 (用于线程 1) 的成员
ORA-00312: 联机日志 6 线程 1: 'E:\APP\XIFENFEI\ORADATA\CDB\REDO06.LOG'
ORA-27041: 无法打开文件
OSD-04002: 无法打开文件
O/S-Error: (OS 2) 系统找不到指定的文件。

使用_allow_resetlogs_corruption参数尝试open数据库

idle> shutdown abort
ORACLE 例程已经关闭。
idle> startup pfile='d:/pfile.txt' mount
ORACLE 例程已经启动。
Total System Global Area  521936896 bytes
Fixed Size                  2404552 bytes
Variable Size             205524792 bytes
Database Buffers          306184192 bytes
Redo Buffers                7823360 bytes
数据库装载完毕。
idle> show parameter resetlogs;
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
_allow_resetlogs_corruption          boolean     TRUE
idle> recover database until cancel;
ORA-00279: 更改 12696935641735 (在 03/20/2014 22:38:52 生成) 对于线程 1
是必需的
ORA-00289: 建议:
E:\APP\XIFENFEI\FAST_RECOVERY_AREA\CDB\ARCHIVELOG\2014_03_20\O1_MF_1_872_9LOZSK9
X_.ARC
ORA-00280: 更改 12696935641735 (用于线程 1) 在序列 #872 中
指定日志: {<RET>=suggested | filename | AUTO | CANCEL}
cancel
ORA-01547: 警告: RECOVER 成功但 OPEN RESETLOGS 将出现如下错误
ORA-01194: 文件 1 需要更多的恢复来保持一致性
ORA-01110: 数据文件 1: 'E:\APP\XIFENFEI\ORADATA\CDB\SYSTEM01.DBF'
ORA-01112: 未启动介质恢复
idle> alter database open resetlogs;
alter database open resetlogs
*
第 1 行出现错误:
ORA-00603: ORACLE server session terminated by fatal error
ORA-00600: internal error code, arguments: [2662], [2956], [1012314778],
[2956], [1012314903], [268435600], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [2662], [2956], [1012314777],
[2956], [1012314903], [268435600], [], [], [], [], [], []
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [2662], [2956], [1012314768],
[2956], [1012314903], [268435600], [], [], [], [], [], []
进程 ID: 9268
会话 ID: 92 序列号: 3

ora-600[2662]很熟悉,文件头的scn小于文件中某个block的csn,通过bbed等工具修改文件scn,尝试启动数据库

CDB_CDB$ROOT@SYS> SHUTDOWN ABORT
ORACLE 例程已经关闭。
CDB_CDB$ROOT@SYS> STARTUP
ORACLE 例程已经启动。
Total System Global Area  521936896 bytes
Fixed Size                  2404552 bytes
Variable Size             205524792 bytes
Database Buffers          306184192 bytes
Redo Buffers                7823360 bytes
数据库装载完毕。
数据库已经打开。

尝试open user pdb

CDB_CDB$ROOT@SYS> alter session set container=pdb;
会话已更改。
CDB_CDB$ROOT@SYS> alter database open;
alter database open
*
第 1 行出现错误:
ORA-00600: 内部错误代码, 参数: [kcvfdb_pdb_set_clean_scn: cleanckpt], [3],
[2956], [1012312995], [2956], [1012334778], [2], [], [], [], [], []

查询mos得到结论:
Bug 16784143 ORA-600 [kcvfdb_pdb_set_clean_scn: cleanckpt] with PDBs
在12.2/12.1.0.2/12.1.0.1.1中修复,后续尝试打patch,来进一步恢复.
1


测试结果证明
1.root pdb的open过程和以前版本数据库无大差异,但是scn推进比较费劲,以前的event和隐含参数方法无效
2.user pdb再是看是因为bug无法open,后续继续验证
3.12.1的base 版本可能确实问题很多,生产使用慎重