含is null sql语句优化

原sql语句与执行计划

SQL> set autot trace
SQL> WITH AL AS (SELECT * FROM XIFENFEI_LOG WHERE CLEAR_TIME IS NULL)
   2 SELECT SWP.ID SWP_ID, AL.* FROM AL FULL OUTER JOIN XIFENFEI_LOG_SWAP SWP ON SWP.ID = AL.ID;
54 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 888046630
----------------------------------------------------------------------------------------------------------
| Id  | Operation                  | Name                        | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |                             |    24 | 11064 | 24658   (2)| 00:04:56 |
|   1 |  TEMP TABLE TRANSFORMATION |                             |       |       |            |     |
|   2 |   LOAD AS SELECT           |                             |       |       |            |     |
|*  3 |    TABLE ACCESS FULL       | XIFENFEI_LOG                |    23 |  2576 | 24652   (2)| 00:04:56 |
|   4 |   VIEW                     |                             |    24 | 11064 |     6  (17)| 00:00:01 |
|   5 |    UNION-ALL               |                             |       |       |            |     |
|   6 |     NESTED LOOPS OUTER     |                             |    23 | 10465 |     2   (0)| 00:00:01 |
|   7 |      VIEW                  |                             |    23 | 10304 |     2   (0)| 00:00:01 |
|   8 |       TABLE ACCESS FULL    | SYS_TEMP_0FD9D6605_51B4E691 |    23 |  2576 |     2   (0)| 00:00:01 |
|*  9 |      INDEX UNIQUE SCAN     | XIFENFEI_LOG_SWP_PK         |     1 |     7 |     0   (0)| 00:00:01 |
|* 10 |     HASH JOIN ANTI         |                             |     1 |    20 |     4  (25)| 00:00:01 |
|  11 |      INDEX FULL SCAN       | XIFENFEI_LOG_SWP_PK         |    20 |   140 |     1   (0)| 00:00:01 |
|  12 |      VIEW                  |                             |    23 |   299 |     2   (0)| 00:00:01 |
|  13 |       TABLE ACCESS FULL    | SYS_TEMP_0FD9D6605_51B4E691 |    23 |  2576 |     2   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter("CLEAR_TIME" IS NULL)
   9 - access("SWP"."ID"(+)="AL"."ID")
  10 - access("SWP"."ID"="AL"."ID")
Statistics
----------------------------------------------------------
          2  recursive calls
          8  db block gets
     111504  consistent gets
          1  physical reads
        692  redo size
       8075  bytes sent via SQL*Net to client
        502  bytes received via SQL*Net from client
          5  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         54  rows processed

这里很明显占用资源多,执行时间长的都在XIFENFEI_LOG表的全表扫描上,而该表的where 条件是CLEAR_TIME is null.

分析CLEAR_TIME 列null值的分布

SQL> SELECT count(*) FROM XIFENFEI_LOG WHERE CLEAR_TIME IS NULL;
  COUNT(*)
----------
        48
SQL> SELECT count(*) FROM XIFENFEI_LOG WHERE CLEAR_TIME IS not NULL;
  COUNT(*)
----------
   6899211

通过这里分析可以知道,CLEAR_TIME is null的值非常少,如果能够创建一个index,取到CLEAR_TIME 列null的值,那效率将非常搞.但是有oracle index知识的人都知道,B树index是不包含null列,因此一般性index无法满足该需求.这里思考创建含常数的复合index,而且把CLEAR_TIME放在前面,因为后面的常数一定存在,因此CLEAR_TIME中含有null的记录也就包含在该复合index中.

创建含常数复合index

SQL> create index ind_XIFENFEI_LOG_null on XIFENFEI_LOG (CLEAR_TIME,0) online;
Index created.

再次查看执行计划

SQL> WITH AL AS (SELECT * FROM XIFENFEI_LOG WHERE CLEAR_TIME IS NULL)
  2  SELECT SWP.ID SWP_ID, AL.* FROM AL FULL OUTER JOIN XIFENFEI_LOG_SWAP SWP ON SWP.ID = AL.ID;
50 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 2359331571
-------------------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name                        | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |                             |    24 | 11064 |    25   (4)| 00:00:01 |
|   1 |  TEMP TABLE TRANSFORMATION    |                             |       |       |            |          |
|   2 |   LOAD AS SELECT              |                             |       |       |            |          |
|   3 |    TABLE ACCESS BY INDEX ROWID| XIFENFEI_LOG                |    23 |  2576 |    19   (0)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN          | IND_XIFENFEI_LOG_NULL       |    23 |       |     3   (0)| 00:00:01 |
|   5 |   VIEW                        |                             |    24 | 11064 |     6  (17)| 00:00:01 |
|   6 |    UNION-ALL                  |                             |       |       |            |          |
|   7 |     NESTED LOOPS OUTER        |                             |    23 | 10465 |     2   (0)| 00:00:01 |
|   8 |      VIEW                     |                             |    23 | 10304 |     2   (0)| 00:00:01 |
|   9 |       TABLE ACCESS FULL       | SYS_TEMP_0FD9D660D_51B4E691 |    23 |  2576 |     2   (0)| 00:00:01 |
|* 10 |      INDEX UNIQUE SCAN        | XIFENFEI_LOG_SWP_PK         |     1 |     7 |     0   (0)| 00:00:01 |
|* 11 |     HASH JOIN ANTI            |                             |     1 |    20 |     4  (25)| 00:00:01 |
|  12 |      INDEX FULL SCAN          | XIFENFEI_LOG_SWP_PK         |    20 |   140 |     1   (0)| 00:00:01 |
|  13 |      VIEW                     |                             |    23 |   299 |     2   (0)| 00:00:01 |
|  14 |       TABLE ACCESS FULL       | SYS_TEMP_0FD9D660D_51B4E691 |    23 |  2576 |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   4 - access("CLEAR_TIME" IS NULL)
  10 - access("SWP"."ID"(+)="AL"."ID")
  11 - access("SWP"."ID"="AL"."ID")
Statistics
----------------------------------------------------------
          2  recursive calls
          8  db block gets
         33  consistent gets
          1  physical reads
        648  redo size
       7688  bytes sent via SQL*Net to client
        502  bytes received via SQL*Net from client
          5  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         50  rows processed

这里可以发现,该sql使用了创建的含常数的复合index,sql执行时间从4分56秒,提高到现在的1秒钟,逻辑读从当初的111504减小到现在的33,巧用含常数的复合索引使得sql执行效率极大提高.

ORACLE_HOME不一致导致实例无法通过本地认证登录

遇到网友咨询,类似故障重现:Liunx平台ORACLE数据库运行正常,ORACLE_SID正确,但是本地登录提示idle instance,tns方式可以正常登录现象

[oracle@xifenfei ~]$ ps -ef|grep pmon
oracle   26295     1  0 04:11 ?        00:00:01 ora_pmon_XFF
oracle   27997 27966  0 05:48 pts/0    00:00:00 grep pmon
[oracle@xifenfei ~]$ env|grep ORA
ORACLE_SID=XFF
ORACLE_BASE=/u01/oracle
ORACLE_HOME=/u01/oracle/oracle/product/10.2.0/db_1
[oracle@xifenfei ~]$ sqlplus / as sysdba
SQL*Plus: Release 10.2.0.4.0 - Production on Sat Apr 20 05:48:30 2013
Copyright (c) 1982, 2007, Oracle.  All Rights Reserved.
Connected to an idle instance.
SQL>
[oracle@xifenfei ~]$ sqlplus sys/xifenfei@XFF as sysdba
SQL*Plus: Release 10.2.0.4.0 - Production on Sat Apr 20 05:54:49 2013
Copyright (c) 1982, 2007, Oracle.  All Rights Reserved.
Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
SQL>

对于这样的现象,通过试验重现,并且通过oradebug ipc进行说明。
补充知识点:oracle本地认证是通过ipc进行的,而ipc是直接访问共享内存段的

系统当前状态
系统未启动然后数据库情况

[oracle@xifenfei ~]$ ipcs -m
------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status
[oracle@xifenfei ~]$ ps -ef|grep pmon|grep -v grep
[oracle@xifenfei ~]$

启动数据库
确定环境变量ORACLE_SID,ORACLE_HOME

[oracle@xifenfei ~]$ env|grep ORA
ORACLE_SID=XFF
ORACLE_BASE=/u01/oracle
ORACLE_HOME=/u01/oracle/oracle/product/10.2.0/db_1
[oracle@xifenfei ~]$ sqlplus / as sysdba
SQL*Plus: Release 10.2.0.4.0 - Production on Sat Apr 20 04:10:22 2013
Copyright (c) 1982, 2007, Oracle.  All Rights Reserved.
Connected to an idle instance.
SQL> startup
ORACLE instance started.
Total System Global Area  260046848 bytes
Fixed Size                  1266896 bytes
Variable Size              83888944 bytes
Database Buffers          167772160 bytes
Redo Buffers                7118848 bytes
Database mounted.
Database opened.
--做debug ipc
SQL> oradebug setmypid
Statement processed.
SQL> oradebug ipc
Information written to trace file.
SQL> oradebug tracefile_name
/u01/oracle/admin/XFF/udump/xff_ora_26852.trc
*** SESSION ID:(152.25) 2013-04-20 04:43:00.983
Dump of unix-generic skgm context
areaflags            000000e7
realmflags           0000000f
mapsize              00000800
protectsize          00001000
lcmsize              00001000
seglen               00200000
largestsize  00000000ffffffff
smallestsize 0000000000400000
stacklimit         0xbdb87e6c
stackdir                   -1
mode                      640
magic                acc01ade
Handle:             0xe781de0 `/u01/oracle/oracle/product/10.2.0/db_1XFF' --->注意($ORACLE_HOME$ORACLE_SID)
Dump of unix-generic realm handle `/u01/oracle/oracle/product/10.2.0/db_1XFF', flags = 00000000
 Area #0 `Fixed Size' containing Subareas 0-0
  Total size 00000000001354d0 Minimum Subarea size 00000000
   Area  Subarea    Shmid      Stable Addr      Actual Addr   --->主要Shmid
      0        0  3080192 0x00000020000000 0x00000020000000
                              Subarea size     Segment size
                          0000000000136000 000000000fa00000
 Area #1 `Variable Size' containing Subareas 2-2
  Total size 000000000f000000 Minimum Subarea size 00400000
   Area  Subarea    Shmid      Stable Addr      Actual Addr
      1        2  3080192 0x00000020800000 0x00000020800000
                              Subarea size     Segment size
                          000000000f000000 000000000fa00000
 Area #2 `Redo Buffers' containing Subareas 1-1
  Total size 00000000006ca000 Minimum Subarea size 00000000
   Area  Subarea    Shmid      Stable Addr      Actual Addr
   Area  Subarea    Shmid      Stable Addr      Actual Addr
[oracle@xifenfei ~]$ ipcs -m
------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status
0x6fd58208 3080192    oracle    640        262144000  17

这里证明ipc的处理是通过$ORACLE_HOME$ORACLE_SID进行的,修改ORACLE_HOME,进一步验证

修改ORACLE_HOME

[oracle@xifenfei ~]$ export ORACLE_HOME=/u01/oracle/oracle/product/10.2.0/db_1/
[oracle@xifenfei ~]$ env|grep ORA
ORACLE_SID=XFF
ORACLE_BASE=/u01/oracle
ORACLE_HOME=/u01/oracle/oracle/product/10.2.0/db_1/
[oracle@xifenfei ~]$ sqlplus / as sysdba
SQL*Plus: Release 10.2.0.4.0 - Production on Sat Apr 20 04:11:46 2013
Copyright (c) 1982, 2007, Oracle.  All Rights Reserved.
Connected to an idle instance.
SQL> startup
ORACLE instance started.
Total System Global Area  260046848 bytes
Fixed Size                  1266896 bytes
Variable Size              83888944 bytes
Database Buffers          167772160 bytes
Redo Buffers                7118848 bytes
ORA-01102: cannot mount database in EXCLUSIVE mode
[oracle@xifenfei ~]$ ipcs -m
------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status
0x6fd58208 3080192    oracle    640        262144000  18
0x079d1b38 3112964    oracle    640        262144000  13
--启动两个同样的sid实例
[oracle@xifenfei ~]$ ps -ef|grep pmon|grep -v grep
oracle   26211     1  0 04:10 ?        00:00:00 ora_pmon_XFF
oracle   26295     1  0 04:11 ?        00:00:00 ora_pmon_XFF
SQL> oradebug setmypid
Statement processed.
SQL> oradebug ipc
Information written to trace file.
SQL> oradebug tracefile_name
/u01/oracle/admin/XFF/udump/xff_ora_27708.trc
*** SESSION ID:(159.5) 2013-04-20 05:32:00.969
Dump of unix-generic skgm context
areaflags            000000e7
realmflags           0000000f
mapsize              00000800
protectsize          00001000
lcmsize              00001000
seglen               00200000
largestsize  00000000ffffffff
smallestsize 0000000000400000
stacklimit         0xbdb5979c
stackdir                   -1
mode                      640
magic                acc01ade
Handle:             0xd99ede0 `/u01/oracle/oracle/product/10.2.0/db_1/XFF'--->注意
Dump of unix-generic realm handle `/u01/oracle/oracle/product/10.2.0/db_1/XFF', flags = 00000000

进一步证明在linux/unix系统,oracle数据库的内存段是通过ORACLE_HOME和ORACLE_SID结合起来识别的,只要ORACLE_HOME或者ORACLE_SID不一样就不能通过IPC访问实例的内存段,也就不能登录数据库

undo异常总结和恢复思路

UNDO异常报错千奇百怪,针对本人遇到的比较常见的undo异常报错进行汇总,仅供参考,数据库恢复过程是千奇百怪的,不能照搬硬套.
ORA-00704/ORA-00376
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 2
ORA-00376: file 3 cannot be read at this time
ORA-01110: data file 3: ‘/u01/oracle/oradata/ora11g/undotbs01.dbf’
Error 704 happened during db open, shutting down database
USER (ospid: 17864): terminating the instance due to error 704
Instance terminated by USER, pid = 17864
ORA-1092 signalled during: alter database open…
opiodr aborting process unknown ospid (17864) as a result of ORA-1092

ORA-00600[4097]
Fri Aug 31 23:14:10 2012
Errors in file /u01/oradata/orcl/bdump/orcl_smon_15589.trc:
ORA-00600: internal error code, arguments: [4097], [], [], [], [], [], [], []
Fri Aug 31 23:14:12 2012
Non-fatal internal error happenned while SMON was doing logging scn->time mapping.
SMON encountered 1 out of maximum 100 non-fatal internal errors.

ORA-01595/ORA-00600[4194]
Fri Aug 31 23:14:14 2012
Errors in file /u01/oradata/orcl/bdump/orcl_smon_15589.trc:
ORA-01595: error freeing extent (2) of rollback segment (4))
ORA-00607: Internal error occurred while making a change to a data block
ORA-00600: internal error code, arguments: [4194], [48], [34], [], [], [], [], []

0RA-00600[4193]
Tue Feb 14 09:35:34 2012
Errors in file d:\oracle\product\10.2.0\admin\interlib\udump\interlib_ora_2824.trc:
ORA-00603: ORACLE server session terminated by fatal error
ORA-00600: internal error code, arguments: [4193], [2005], [2008], [], [], [], [], []

ORA-00600[kcfrbd_3]
Wed Dec 05 10:26:35 2012
SMON: enabling tx recovery
Wed Dec 05 10:26:35 2012
Database Characterset is ZHS16GBK
Wed Dec 05 10:26:35 2012
Errors in file d:\oracle\product\10.2.0\admin\orcl\bdump\orcl_smon_548.trc:
ORA-00600: internal error code, arguments: [kcfrbd_3], [2], [2279045], [1], [2277120], [2277120], [], []
SMON: terminating instance due to error 474

ORA-00600[4137]
Fri Jul 6 18:00:40 2012
SMON: ignoring slave err,downgrading to serial rollback
Fri Jul 6 18:00:41 2012
Errors in file /usr/local/oracle/admin/techdb/bdump/techdb_smon_16636.trc:
ORA-00600: internal error code, arguments: [4137], [], [], [], [], [], [], []
ORACLE Instance techdb (pid = 8) – Error 600 encountered while recovering transaction (3, 17).

ORA-01595/ORA-01594
Sat May 12 21:54:17 2012
Errors in file /oracle/app/admin/prmdb/bdump/prmdb2_smon_483522.trc:
ORA-01595: error freeing extent (2) of rollback segment (19))
ORA-01594: attempt to wrap into rollback segment (19) extent (2) which is being freed

ORA-00704/ORA-01555
Fri May 4 21:04:21 2012
select ctime, mtime, stime from obj$ where obj# = :1
Fri May 4 21:04:21 2012
Errors in file /oracle/admin/standdb/udump/perfdb_ora_1286288.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 1
ORA-01555: snapshot too old: rollback segment number 40 with name “_SYSSMU40$” too small
Error 704 happened during db open, shutting down database
USER: terminating instance due to error 704
Instance terminated by USER, pid = 1286288
ORA-1092 signalled during: alter database open resetlogs…

ORA-00607/ORA-00600[4194]
Block recovery completed at rba 3994.5.16, scn 0.89979533
Thu Jul 26 13:21:11 2012
Errors in file /orasvr/admin/mispdata/udump/mispdata_ora_2865.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-00607: Internal error occurred while making a change to a data block
ORA-00600: internal error code, arguments: [4194], [31], [2], [], [], [], [], []
Error 604 happened during db open, shutting down database
USER: terminating instance due to error 604
Instance terminated by USER, pid = 2865
ORA-1092 signalled during: ALTER DATABASE OPEN…

ORA-00704/ORA-00600[4000]
Thu Feb 28 19:29:13 2013
Errors in file /u1/PROD/prodora/db/tech_st/10.2.0/admin/PROD_oracle/udump/prod_ora_20989.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00600: internal error code, arguments: [4000], [50], [], [], [], [], [], []
Thu Feb 28 19:29:13 2013
Error 704 happened during db open, shutting down database
USER: terminating instance due to error 704
Instance terminated by USER, pid = 20989
ORA-1092 signalled during: ALTER DATABASE OPEN RESETLOGS…

undo异常恢复处理思路
除了极少数undo坏块,undo文件丢失外,大部分undo异常是因为redo未被正常进行前滚,从而导致undo回滚异常数据库无法open,解决此类问题,需要结合一般需要结合redo异常处理技巧在其中,一般undo异常处理思路
1.切换undo_management= MANUAL尝试启动数据库,如果不成功进入2
2.设置10513 等event尝试启动数据库,如果不成功进入3
3.使用_offline_rollback_segments/_corrupted_rollback_segments屏蔽回滚段
4.如果依然不能open数据库,考虑使用bbed工具提交事务,修改回滚段状态等操作
5.如果依然还不能open数据库,考虑使用dul

如果您按照上述步骤还不能解决,请联系我们,将为您提供专业数据库技术支持
Phone:17813235971    Q Q:107644445    E-Mail:dba@xifenfei.com

姊妹篇
ORACLE REDO各种异常恢复
ORACLE丢失各种文件导致数据库不能OPEN恢复

ORACLE 12C RAC主要gc等待事件

ORACLE 12C RAC中有很多GC等待事件,这里重点介绍一些常见的GC等待事件,以便在以后遇到类似问题方便分析

RAC等待事件

在本节中我将讨论重要的RAC等待事件。这还不是全部等待事件的完整列表,只是一个最常见的等待事件的列表。

GC当前块2-Way/3-Way

当前块的等待事件意味着被传输的块的版本是块的最新版本。这个等待事件读取和写入活动中都可能遇到。如果该块被访问是以读取活动进行,那么该资源上锁以KJUSERPR(PR)模式获得。刚才我在“资源和锁定”一节中所讨论的示例展示了KJUSERPR模式的锁。
在下面的例子中,我从表t_one中查询一行,造成连接到节点2的磁盘读。查看SQL跟踪文件,没有全局缓存等待事件。原因是该块被本地掌控(本地实例是master),所以FG进程中可直接获取该资源上锁,而不会产生任何全局缓存等待。这种类型的锁也被称为亲和力锁定(相似度锁定)方案。动态掌握资源(DRM,Dynamic Resource Mastering)的部分将详细讨论相似度锁定(亲和力锁定)。
RS@ORCL2:2> @tc_one_row.sql
N1 FNO BLOCK OBJ V1
———- ———- ———- ———- ———-
100 4 180 75742 250
Trace file:
nam=’db file sequential read’ ela= 563 file#=4 block#=180 blocks=1 obj#=75742
我连接到实例1(注释5)查询同一行数据。由于该块已经缓存在实例2,因此该块从实例2传输到实例1。跟踪显示,等待事件gc current block 2-way由于file_id=4,BLOCK_ID=180块而被遇到了。这个块传输是一个两路的块传输,因为该资源的拥有实例和资源主实例(实例2)是相同的。
SYS@ORCL1:1> @tc_one_row.sql
N1 FNO BLOCK OBJ V1
———- ———- ———- ———- ———-
100 4 180 75742 250
Trace file:
nam=’gc current block 2-way’ ela= 629 p1=4 p2=180 p3=1 obj#=75742 tim=1350440823837572
接下来,我将创建一个3路等待事件的条件,但首先让我在所有三个实例上刷新缓冲区,开始用干净的缓冲区。在以下示例中:
1.我将连接到实例1查询(这将加载块到实例1的缓冲区高速缓存中)这行数据。
2.我将连接到一个实例,并从实例3查询同一行。我会话的FG进程将为这一数据块请求运行在实例2上的LMS进程(因为实例2是该资源的掌控者)。
3.实例2的LMS进程将请求转发给实例1的LMS进程。
4.实例1的LMS进程将发送块给运行在实例3上的FG进程。
从本质上讲,三个实例参与一个数据块的传输,因此,这是一个三路块传输。
–alter system flush buffer_cache; –在所有实例上
RS@ORCL1:1> @tc_one_row.sql
N1 FNO BLOCK OBJ V1
———- ———- ———- ———- ———-
100 4 180 75742 250
RS@ORCL3:3> @tc_one_row.sql
N1 FNO BLOCK OBJ V1
———- ———- ———- ———- ———-
100 4 180 75742 250
Trace file:
nam=’gc current block 3-way’ ela= 798 p1=4 p2=180 p3=1 obj#=75742
连接到实例2查看资源锁,我们看到在该资源上有两个锁被持,分别是实例1和实例3(owner_node等于0和2)。
RS@ORCL2:2> SELECT resource_name1, grant_level, state, owner_node
FROM v$ges_enqueue
WHERE resource_name1 LIKE ‘[0xb4][0x4],[BL]%’;
RESOURCE_NAME1 GRANT_LEV STATE OWNER_NODE
—————————— ——— ————- ———-
[0xb4][0x4],[BL][ext 0x0,0x0] KJUSERPR GRANTED 2
[0xb4][0x4],[BL][ext 0x0,0x0] KJUSERPR GRANTED 0
在gc current block 2-way or gc current block 3-way等待事件上的过多等待,通常要么是由于(a)一种低效的执行计划,导致了大量的块访问,或者(b)应用数据相似度(应用亲和力)没有被实施。如果对象访问本地化,考虑实施应用亲和力(应用数据的相似度)。此外,使用前面一节中讨论的技术“所有等待事件的通用分析”。

GC CR Block 2-Way/3-Way

CR模式的块传输发生在只读访问的请求中。考虑这样一个场景:一个块以CURRENT模式驻留在实例2上,实例2以独占模式保持了该资源的BL锁。另一个会话连接到实例1来请求该块,阻止。由于Oracle数据库中“其他人看不到未提交的更改”,SELECT语句请求一个查询开始时间的块的特定版本。SCN被用于标识块版本,本质上,SELECT语句请求的版本与块的SCN一致。LMS进程维护实例2的请求,以CURRENT的模式克隆该块到缓冲区,验证SCN版本与请求是一致的,然后发送该块的CR拷贝给FG进程。
这些CR模式传输和CURRENT模式传输之间的主要区别在于,在CR模式传输的情况下,在GRD中没有资源或锁来维护CR缓冲区。从本质上讲,CR模式块不需要全局缓存资源或锁。接收到的CR副本只能由提出请求的会话使用,并且只适用于这个特定的SQL执行中。这就是为什么Oracle数据库不对CR传输获取BL资源的任何锁。
由于没有全局缓存锁保护缓冲区,连接到实例1再次执行这个SQL语句访问那个块时将遭遇gc cr block 2-way 或者 gc cr block 3-way等待事件。
因此,每次从实例1访问该块都将触发新CR缓冲区的构造。即使在实例2的缓冲区没有发生该块修改,在实例1中的FG进程仍然会遭遇到CR等待事件。驻留在实例1的CR缓冲区是不能重复使用的,因为每SQL执行请求查询时的SCN会有所不同。
下面的跟踪显示了一个块从资源主实例以0.6毫秒的延迟被传输到请求实例。另外,file_id,BLOCK_ID,和跟踪文件中的object_id信息,可以被用来识别正在遭遇这两个等待事件的对象。当然,也可以通过查询ASH数据来识别该对象。
nam=’gc cr block 2-way’ ela= 627 p1=7 p2=6852 p3=1 obj#=76483 tim=37221074057
执行tc_one_row.sql五次,然后查询缓冲区头信息后,你可以看到在两个实例1和2有五个该块的CR缓冲区。请注意,CR_SCN_BAS和CR_SCN_WRP 列(注释6)显示了每个CR缓冲副本不同的值。查询GV$ ges_resource和GV$ ges_enqueue,你也可以看到有没有GC锁保护这些缓冲区。
清单10-10 缓冲区状态

SELECT

DECODE(state,0,’free’,1,’xcur’,2,’scur’,3,’cr’, 4,’read’,5,’mrec’,

6,’irec’,7,’write’,8,’pi’, 9,’memory’,10,’mwrite’,

11,’donated’, 12,’protected’, 13,’securefile’, 14,’siop’,

15,’recckpt’, 16, ‘flashfree’, 17, ‘flashcur’, 18, ‘flashna’) state,

mode_held, le_addr, dbarfil, dbablk, cr_scn_bas, cr_scn_wrp , class

FROM sys.x$bh

WHERE obj= &&obj

AND dbablk= &&block

AND state!=0 ;

Enter value for obj: 75742

Enter value for block: 180

STATE MODE_HELD LE DBARFIL DBABLK CR_SCN_BAS CR_SCN_WRP CLASS

———- ———- — ————– ———- ———- ———- ———-

cr 0 00 1 75742 649314930 3015 1

cr 0 00 1 75742 648947873 3015 1

cr 0 00 1 75742 648926281 3015 1

cr 0 00 1 75742 648810300 3015 1

cr 0 00 1 75742 1177328436 3013 1
CR缓冲区的创建是一个特例,它并不需要获取全局缓存锁来保护CR缓冲区。如果频繁访问的对象上有长时间未提交的事物就可能会出现CR风暴。因此,明智的做法是把更新大量表数据的批处理安排在一个不太繁忙的时间段。

GC CR Grant 2-Way/Gc Current Grant 2-Way

如果被请求的块没有驻留在任何缓冲区中,就会遭遇gc cr grant 2-way 和 gc current grant 2-way等待事件。FG进程向LMS进程请求一个块,但块没有驻留在任何缓冲区。因此,LMS进程回复一条授权FG进程从磁盘读取的块的消息。FG进程从磁盘读取该块并继续后续的处理。
下面一行显示了为了访问file_id=4和lock_id=180数据块,FG进程接收到来自LMS进程的授权响应。下一行显示了,执行了一个从磁盘读取块的物理读。
nam=’gc cr grant 2-way’ ela= 402 p1=4 p2=180 p3=1 obj#=75742
nam=’db file sequential read’ ela= 553 file#=4 block#=180 blocks=1 obj#=75742
过多的此类等待意味着,要么缓冲区高速缓存太小,要么SQL语句的执行过分的刷新了缓冲区高速缓存。识别正在遭遇此类等待事件的SQL语句和对象,并优化这些SQL语句。
DRM功能的设计就是为了减少发生这类授权相关的等待事件。

GC CR Block Busy/GC Current Block Busy

繁忙事件(Busy events)表明,LMS执行了额外的工作去处理并发相关的问题。例如,要建立一个CR块,LMS进程可能要应用撤消记录(undo records),重构一个与查询的SCN一致的块。在把该块回传给FG过程时,LMS将标记块传输是否遇到gc cr block busy或gc current block busy等待事件,这取决于块传输的类型。

GC CR Block Congested/GC Current Block Congested

如果LMS进程在接收到请求后没有在1毫秒内处理该请求,那么LMS进程标记这个响应为:该块正遭遇拥堵相关的等待事件。堵塞相关的等待事件有很多原因,比如说,LMS进程被大量全局高速缓存的请求所淹没。LMS进程正遭遇CPU的调度延迟,LMS进程已经遇到了另一种资源耗尽(如内存)等。
通常情况下,LMS进程运行在实时CPU调度优先级,因此,CPU调度的延迟将是最小的。大量这类的等待此事件表明出现了全局缓存请求的突然飙升,且LMS进程无法快速处理这些请求。服务器内存匮乏也可能导致LMS进程的分页,影响全局缓存的性能。
您可以去检查为什么LMS进程不能够有效地处理请求。

ORA-00600[kcrf_resilver_log_1]异常恢复

朋友在win x64位上的ORACLE 11.2.0.1启动出现ORA-00600[kcrf_resilver_log_1],让我帮忙看看,通过分析主要是因为Unpblished Bug 9056657导致
数据库启动报错
数据库在open的时候报ORA-00600[kcrf_resilver_log_1]

SQL> alter database open;
alter database open
*
第 1 行出现错误:
ORA-00600: 内部错误代码, 参数: [kcrf_resilver_log_1], [0x7FF61C56E30], [2], [],
[], [], [], [], [], [], [], []

alert日志报错

Sat Mar 01 18:40:44 2014
alter database open
Beginning crash recovery of 1 threads
 parallel recovery started with 3 processes
Started redo scan
Errors in file f:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_ora_6432.trc  (incident=61360):
ORA-00600: 内部错误代码, 参数: [kcrf_resilver_log_1], [0x7FF61C56E30], [2], [], [], [], [], [], [], [], [], []
Incident details in: f:\app\administrator\diag\rdbms\orcl\orcl\incident\incdir_61360\orcl_ora_6432_i61360.trc
Aborting crash recovery due to error 600
Errors in file f:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_ora_6432.trc:
ORA-00600: 内部错误代码, 参数: [kcrf_resilver_log_1], [0x7FF61C56E30], [2], [], [], [], [], [], [], [], [], []
Errors in file f:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_ora_6432.trc:
ORA-00600: 内部错误代码, 参数: [kcrf_resilver_log_1], [0x7FF61C56E30], [2], [], [], [], [], [], [], [], [], []
ORA-600 signalled during: alter database open...

分先相关SCN
控制文件scn
1
控制文件中数据文件scn
2
数据文件头scn3
通过这里可以知道,数据文件头的scn,控制文件中关于数据文件的scn都表明数据库为正常关闭,且scn值为16574746,但是控制文件中记录数据库SCN的值为16551515,可以判断数据库因为某种原因导致控制文件中的部分scn记录异常.

处理方法
因为控制文件SCN异常,考虑直接重建控制文件或者using backup controlfile方式恢复

SQL> select group#,status,sequence# from v$log;
    GROUP# STATUS            SEQUENCE#
---------- ---------------- ----------
         1 CURRENT                1510
         3 ACTIVE                 1509
         2 ACTIVE                 1508
    GROUP# MEMBER
---------- --------------------------------------------------
         3 F:\APP\ADMINISTRATOR\ORADATA\ORCL\REDO03.LOG
         2 F:\APP\ADMINISTRATOR\ORADATA\ORCL\REDO02.LOG
         1 F:\APP\ADMINISTRATOR\ORADATA\ORCL\REDO01.LOG
SQL> recover database using backup controlfile until cancel;
ORA-00279: 更改 16574746 (在 03/01/2014 13:10:11 生成) 对于线程 1 是必需的
ORA-00289: 建议: F:\APP\ADMINISTRATOR\FLASH_RECOVERY_AREA\ORCL\ARCHIVELOG\2014_0
3_01\O1_MF_1_1510_%U_.ARC
ORA-00280: 更改 16574746 (用于线程 1) 在序列 #1510 中
指定日志: {<RET>=suggested | filename | AUTO | CANCEL}
F:\APP\ADMINISTRATOR\ORADATA\ORCL\REDO01.LOG
已应用的日志。
完成介质恢复。
SQL> alter database open resetlogs;
数据库已更改。

在最近的同样的错误,但是没有如此的幸运具体参考:记录一次ORA-00600 [kcrf_resilver_log_1] 恢复过程

ORACLE REDO各种异常恢复

redo是oracle数据库比较核心文件,当redo异常之后,数据库无法正常启动,而且有丢失数据的风险,强烈建议条件允许redo多路镜像.redo文件异常的故障可以说是千奇百怪,但是总体上可以分为几类:
数据库归档/非归档模式下inactive redo异常
ORA-00316 ORA-00327

SQL> startup mount
ORACLE instance started.
Total System Global Area  260046848 bytes
Fixed Size                  1266896 bytes
Variable Size              83888944 bytes
Database Buffers          167772160 bytes
Redo Buffers                7118848 bytes
Database mounted.
SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-00316: log 2 of thread 1, type  in header is not log file
ORA-00312: online log 2 thread 1: '/u01/oracle/oradata/XFF/redo02.log'
SQL> col member for a40
SQL> set lines 120
SQL>   SELECT thread#,
  2           a.sequence#,
  3           a.group#,
  4           TO_CHAR (first_change#, '9999999999999999') "SCN",
  5           a.status,
  6           MEMBER
  7      FROM v$log a, v$logfile b
  8     WHERE a.group# = B.GROUP#
  9  ORDER BY a.sequence# DESC;
   THREAD#  SEQUENCE#     GROUP# SCN               STATUS           MEMBER
---------- ---------- ---------- ----------------- ---------------- -----------------------------------
         1         15          3            665697 CURRENT          /u01/oracle/oradata/XFF/redo03.log
         1         14          2            645619 INACTIVE         /u01/oracle/oradata/XFF/redo02.log
         1         13          1            625540 INACTIVE         /u01/oracle/oradata/XFF/redo01.log
SQL> ALTER DATABASE CLEAR LOGFILE GROUP 2;
Database altered.
SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-00327: log 2 of thread 1, physical size  less than needed
ORA-00312: online log 2 thread 1: '/u01/oracle/oradata/XFF/redo02.log'
SQL> alter database drop logfile group 2;
Database altered.
SQL> alter database open;
Database altered.
SQL> alter database add logfile group 2 ('/u01/oracle/oradata/XFF/redo02.log') size 50M reuse;
Database altered.

正常关闭数据库current redo异常
ORA-00316 ORA-01623

SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-00316: log 1 of thread 1, type  in header is not log file
ORA-00312: online log 1 thread 1: '/u01/oracle/oradata/XFF/redo01.log'
SQL>   SELECT thread#,
  2           a.sequence#,
  3           a.group#,
  4           TO_CHAR (first_change#, '9999999999999999') "SCN",
  5           a.status,
  6           MEMBER
  7      FROM v$log a, v$logfile b
  8     WHERE a.group# = B.GROUP#
  9  ORDER BY a.sequence# DESC;
   THREAD#  SEQUENCE#     GROUP# SCN               STATUS           MEMBER
---------- ---------- ---------- ----------------- ---------------- ----------------------------------
         1         16          1            685918 CURRENT          /u01/oracle/oradata/XFF/redo01.log
         1         15          3            665697 INACTIVE         /u01/oracle/oradata/XFF/redo03.log
         1          0          2                 0 UNUSED           /u01/oracle/oradata/XFF/redo02.log
SQL> ALTER DATABASE CLEAR LOGFILE GROUP 1;
ALTER DATABASE CLEAR LOGFILE GROUP 1
*
ERROR at line 1:
ORA-00316: log 1 of thread 1, type 0 in header is not log file
ORA-00312: online log 1 thread 1: '/u01/oracle/oradata/XFF/redo01.log'
SQL> ALTER DATABASE drop logfile group 1;
ALTER DATABASE drop logfile group 1
*
ERROR at line 1:
ORA-01623: log 1 is current log for instance XFF (thread 1) - cannot drop
ORA-00312: online log 1 thread 1: '/u01/oracle/oradata/XFF/redo01.log'
SQL> recover database until cancel;
Media recovery complete.
SQL> alter database open resetlogs;
Database altered.

数据库异常关闭current/active redo异常
ORA-00316 ORA-01624 ORA-01194

SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-00316: log 1 of thread 1, type 0 in header is not log file
ORA-00312: online log 1 thread 1: '/u01/oracle/oradata/XFF/redo01.log'
SQL>   SELECT thread#,
  2           a.sequence#,
  3           a.group#,
  4           TO_CHAR (first_change#, '9999999999999999') "SCN",
  5           a.status,
  6           MEMBER
  7      FROM v$log a, v$logfile b
  8     WHERE a.group# = B.GROUP#
  9  ORDER BY a.sequence# DESC;
   THREAD#  SEQUENCE#     GROUP# SCN               STATUS           MEMBER
---------- ---------- ---------- ----------------- ---------------- -----------------------------------
         1          8          2            686310 CURRENT          /u01/oracle/oradata/XFF/redo02.log
         1          7          1            686294 ACTIVE           /u01/oracle/oradata/XFF/redo01.log
         1          6          3            686289 INACTIVE         /u01/oracle/oradata/XFF/redo03.log
SQL> ALTER DATABASE CLEAR LOGFILE GROUP 1;
ALTER DATABASE CLEAR LOGFILE GROUP 1
*
ERROR at line 1:
ORA-01624: log 1 needed for crash recovery of instance XFF (thread 1)
ORA-00312: online log 1 thread 1: '/u01/oracle/oradata/XFF/redo01.log'
SQL> ALTER DATABASE drop logfile group 1;
ALTER DATABASE drop logfile group 1
*
ERROR at line 1:
ORA-01624: log 1 needed for crash recovery of instance XFF (thread 1)
ORA-00312: online log 1 thread 1: '/u01/oracle/oradata/XFF/redo01.log'
SQL> recover database until cancel
ORA-00279: change 686294 generated at 04/20/2013 01:37:16 needed for thread 1
ORA-00289: suggestion : /u01/oracle/oracle/product/10.2.0/db_1/dbs/arch1_7_813202529.dbf
ORA-00280: change 686294 for thread 1 is in sequence #7
Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
/u01/oracle/oradata/XFF/redo01.log
ORA-00308: cannot open archived log '/u01/oracle/oradata/XFF/redo01.log'
ORA-27047: unable to read the header block of file
Additional information: 2
Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
cancel
ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1: '/u01/oracle/oradata/XFF/system01.dbf'
ORA-01112: media recovery not started
SQL> alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1: '/u01/oracle/oradata/XFF/system01.dbf'
SQL> alter system set "_allow_resetlogs_corruption"=true scope=spfile;
System altered.
SQL> shutdown immediate;
ORA-01109: database not open
Database dismounted.
ORACLE instance shut down.
SQL> startup mount;
ORACLE instance started.
Total System Global Area  260046848 bytes
Fixed Size                  1266896 bytes
Variable Size              83888944 bytes
Database Buffers          167772160 bytes
Redo Buffers                7118848 bytes
Database mounted.
SQL> recover database until cancel
ORA-00279: change 686294 generated at 04/20/2013 01:37:16 needed for thread 1
ORA-00289: suggestion : /u01/oracle/oracle/product/10.2.0/db_1/dbs/arch1_7_813202529.dbf
ORA-00280: change 686294 for thread 1 is in sequence #7
Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
cancel
ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1: '/u01/oracle/oradata/XFF/system01.dbf'
ORA-01112: media recovery not started
SQL> alter database open resetlogs;
Database altered.

在这样的情况下,数据库异常关闭,current/active redo异常,通过使用隐含参数可能可以侥幸的恢复数据库,但是也可能导致数据丢失.这里因为是模拟情况,无业务所以在很多较为繁忙的业务系统中,使用隐含参数resetlogs过程中可能还会遇到如下很多常见的错误,进一步增加了恢复难度

current/active redo异常后附带其他错误
ORA-600[2662]

Wed Dec 07 13:02:49 2011
SMON: enabling cache recovery
Errors in file d:\app\administrator\diag\rdbms\hzyl\hzyl\trace\hzyl_ora_3388.trc  (incident=216664):
ORA-00600: 内部错误代码, 参数: [2662], [2], [1153862134], [2], [1153864845], [12582921], [], []
Incident details in: d:\app\administrator\diag\rdbms\hzyl\hzyl\incident\incdir_216664\hzyl_ora_3388_i216664.trc
Errors in file d:\app\administrator\diag\rdbms\hzyl\hzyl\trace\hzyl_ora_3388.trc:
ORA-00600: 内部错误代码, 参数: [2662], [2], [1153862134], [2], [1153864845], [12582921], [], []
Error 600 happened during db open, shutting down database
USER (ospid: 3388): terminating the instance due to error 600

ORA-00600[4000]

Thu Feb 28 19:29:10 2013
SMON: enabling cache recovery
Thu Feb 28 19:29:11 2013
Errors in file /u1/PROD/prodora/db/tech_st/10.2.0/admin/PROD_oracle/udump/prod_ora_20989.trc:
ORA-00600: internal error code, arguments: [4000], [50], [], [], [], [], [], []
Thu Feb 28 19:29:13 2013
Incremental checkpoint up to RBA [0x1.3.0], current log tail at RBA [0x1.3.0]
Thu Feb 28 19:29:13 2013
Errors in file /u1/PROD/prodora/db/tech_st/10.2.0/admin/PROD_oracle/udump/prod_ora_20989.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00600: internal error code, arguments: [4000], [50], [], [], [], [], [], []

ORA-00704 ORA-00604 ORA-01555

Fri May  4 21:04:21 2012
select ctime, mtime, stime from obj$ where obj# = :1
Fri May  4 21:04:21 2012
Errors in file /oracle/admin/standdb/udump/perfdb_ora_1286288.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 1
ORA-01555: snapshot too old: rollback segment number 40 with name "_SYSSMU40$" too small
Error 704 happened during db open, shutting down database
USER: terminating instance due to error 704
Instance terminated by USER, pid = 1286288
ORA-1092 signalled during: alter database open resetlogs...

current/active redo异常还可能报如下错误
redo文件损坏报错

Started redo scan
Errors in file d:\app\administrator\diag\rdbms\hzyl\hzyl\trace\hzyl_ora_2960.trc  (incident=214262):
ORA-00353: 日志损坏接近块 12014 更改 9743799889 时间 12/05/2011 09:21:11
ORA-00312: 联机日志 3 线程 1: 'R:\ORADATA\HZYL\REDO03.LOG'
Incident details in: d:\app\administrator\diag\rdbms\hzyl\hzyl\incident\incdir_214262\hzyl_ora_2960_i214262.trc
Aborting crash recovery due to error 368
Errors in file d:\app\administrator\diag\rdbms\hzyl\hzyl\trace\hzyl_ora_2960.trc:
ORA-00368: 重做日志块中的校验和错误
ORA-00353: 日志损坏接近块 12014 更改 9743799889 时间 12/05/2011 09:21:11
ORA-00312: 联机日志 3 线程 1: 'R:\ORADATA\HZYL\REDO03.LOG'
ORA-368 signalled during: ALTER DATABASE OPEN...

redo文件被其他实例占用报错

Wed May 16 17:03:11 2012
Started redo scan
Wed May 16 17:03:11 2012
Errors in file /oracle/admin/odsdb/udump/odsdb1_ora_2040024.trc:
ORA-00305: log 14 of thread 1 inconsistent; belongs to another database
ORA-00312: online log 14 thread 1: '/dev/rods_redo1_2_2'
ORA-00305: log 14 of thread 1 inconsistent; belongs to another database
ORA-00312: online log 14 thread 1: '/dev/rods_redo1_2_1'
ORA-305 signalled during: ALTER DATABASE OPEN...

存储整体异常

Mon Oct 17 09:35:09 2011
Errors in file /oracle/app/admin/orcl/bdump/orcl2_lgwr_348814.trc:
ORA-00340: IO error processing online log 4 of thread 2
ORA-00345: redo log write error block 6732 count 2
ORA-00312: online log 4 thread 2: '/dev/rredo21'
ORA-27063: number of bytes read/written is incorrect
IBM AIX RISC System/6000 Error: 6: No such device or address
Additional information: -1
Additional information: 1024
Mon Oct 17 09:35:09 2011
LGWR: terminating instance due to error 340

存储IO异常

Fri Feb 21 08:44:42 2014
Thread 1 advanced to log sequence 591 (LGWR switch)
  Current log# 1 seq# 591 mem# 0: J:\ORADATA\ORCL\REDO01.LOG
Fri Feb 21 15:31:20 2014
Errors in file c:\oracle\product\10.2.0\admin\orcl\bdump\orcl_lgwr_10312.trc:
ORA-00316: log 1 of thread 1, type 286 in header is not log file
ORA-00312: online log 1 thread 1: 'J:\ORADATA\ORCL\REDO01.LOG'

使用_disable_logging参数

Sat May 14 23:16:49 2005
Errors in file d:\oracle\admin\rman\bdump\rman_arc0_736.trc:
ORA-16038: log 3 sequence# 72 cannot be archived
ORA-00354: corrupt redo log block header
ORA-00312: online log 3 thread 1: 'D:\ORACLE\ORADATA\RMAN\REDO03.LOG'

如果你在使用这些思路进行恢复遇到突发情况不能自行解决,请联系我们,将为您提供专业数据库技术支持
Phone:17813235971    Q Q:107644445    E-Mail:dba@xifenfei.com

姊妹篇
undo异常总结和恢复思路
ORACLE丢失各种文件导致数据库不能OPEN恢复

记录一次ORA-00316 ORA-00312 redo异常恢复

正常运行的数据库报突然报ORA-00316: log 1 of thread 1, type 286 in header is not log file,异常终止

Fri Feb 21 08:44:42 2014
Thread 1 advanced to log sequence 591 (LGWR switch)
  Current log# 1 seq# 591 mem# 0: J:\ORADATA\ORCL\REDO01.LOG
Fri Feb 21 15:31:20 2014
Errors in file c:\oracle\product\10.2.0\admin\orcl\bdump\orcl_lgwr_10312.trc:
ORA-00316: log 1 of thread 1, type 286 in header is not log file
ORA-00312: online log 1 thread 1: 'J:\ORADATA\ORCL\REDO01.LOG'
Fri Feb 21 15:31:20 2014
Errors in file c:\oracle\product\10.2.0\admin\orcl\bdump\orcl_lgwr_10312.trc:
ORA-00316: log 1 of thread 1, type 286 in header is not log file
ORA-00312: online log 1 thread 1: 'J:\ORADATA\ORCL\REDO01.LOG'
Fri Feb 21 15:31:20 2014
LGWR: terminating instance due to error 316
Fri Feb 21 15:31:20 2014
Errors in file c:\oracle\product\10.2.0\admin\orcl\bdump\orcl_j001_11328.trc:
ORA-00316: log  of thread , type  in header is not log file
Fri Feb 21 15:31:20 2014
Errors in file c:\oracle\product\10.2.0\admin\orcl\bdump\orcl_j000_14116.trc:
ORA-00316: 日志  (用于线程 ) 标头中的类型  不是日志文件
Fri Feb 21 15:31:20 2014
Errors in file c:\oracle\product\10.2.0\admin\orcl\bdump\orcl_dbw1_8964.trc:
ORA-00316: log  of thread , type  in header is not log file
Fri Feb 21 15:31:22 2014
Errors in file c:\oracle\product\10.2.0\admin\orcl\bdump\orcl_dbw0_10592.trc:
ORA-00316: log  of thread , type  in header is not log file
ORA-27041: unable to open file
OSD-04002: unable to open file
O/S-Error: (OS 2) 系统找不到指定的文件。
ORA-27041: unable to open file
OSD-04002: unable to open file
O/S-Error: (OS 2) 系统找不到指定的文件。
Fri Feb 21 15:31:41 2014
Errors in file c:\oracle\product\10.2.0\admin\orcl\bdump\orcl_smon_11112.trc:
ORA-00316: log  of thread , type  in header is not log file
Fri Feb 21 15:31:41 2014
Instance terminated by LGWR, pid = 10312

数据库启动报ORA-00316错误

SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-00316: log 1 of thread 1, type 286 in header is not log file
ORA-00312: online log 1 thread 1: 'J:\ORADATA\ORCL\REDO01.LOG'

alert日志信息,报ORA-00316 ORA-00312

Sun Feb 23 13:54:08 2014
Started redo scan
Sun Feb 23 13:54:08 2014
Errors in file e:\oracle\product\10.2.0\admin\ora10g\udump\orcl_ora_5544.trc:
ORA-00316: log 1 of thread 1, type 286 in header is not log file
ORA-00312: online log 1 thread 1: 'J:\ORADATA\ORCL\REDO01.LOG'
Sun Feb 23 13:54:08 2014
Aborting crash recovery due to error 316
Sun Feb 23 13:54:08 2014
Errors in file e:\oracle\product\10.2.0\admin\ora10g\udump\orcl_ora_5544.trc:
ORA-00316: log 1 of thread 1, type 286 in header is not log file
ORA-00312: online log 1 thread 1: 'J:\ORADATA\ORCL\REDO01.LOG'
ORA-316 signalled during: alter database open...

通过dump redo header可以看出来redo header完全混乱了,里面很多数据文件内容在里面,初步估计系统或者硬件有问题(不稳定)导致该问题

LOG FILE #1:
  (name #3) J:\ORADATA\ORCL\REDO01.LOG
 Thread 1 redo log links: forward: 2 backward: 0
 siz: 0x7d000 seq: 0x0000024f hws: 0x1 bsz: 512 nab: 0xffffffff flg: 0xa dup: 1
 Archive links: fwrd: 0 back: 0 Prev scn: 0x0000.4f4a1951
 Low scn: 0x0000.4f4e5400 02/21/2014 08:44:42
 Next scn: 0xffff.ffffffff 01/01/1988 00:00:00
 FILE HEADER:
	Software vsn=4280360459=0xff211e0b, Compatibility Vsn=103886850=0x6313002
	Db ID=842019892=0x32303434, Db Name='01a0001'
	Activation ID=1107558657=0x42040101
	Control Seq=842019892=0x32303434, File size=808464688=0x30303130
	File Number=2417, Blksiz=2013738803, File Type=286 UNKNOWN
 descrip:"00-440201|广东省xxxxxx研究院|00014402010037"
 thread: 767 nab: 0x534e4906 seq: 0xbcfebcbd hws: 0xffffffff eot: 55 dis: 66
 reset logs count: 0x7545245 scn: 0x0101.1e097178
 Low scn: 0x4537.2022002c 01/15/2022 20:59:37
 Next scn: 0x3734.46463135 04/30/2017 17:34:49
 Enabled scn: 0x4345.37333038 10/22/2015 02:04:16
 Thread closed scn: 0x3939.43303143 08/25/2024 00:11:31
 Log format vsn: 0x46343835 Disk cksum: 0x1d09 Calc cksum: 0x70ed
 Terminal Recovery Stop scn: 0x3734.35304141
 Terminal Recovery Stamp  03/17/2014 19:59:48
 Most recent redo scn: 0x3636.31303134
 Largest LWN: 758788710 blocks
 Miscellaneous flags: 0x41444534
 Thread internal enable indicator: thr: 263902399, seq: 808464496 scn: 0x3032.30343431

因为当前redo完全损坏,尝试不完全恢复并结合隐含参数(_allow_resetlogs_corruption)拉库,出现错误ORA-00704 ORA-00604 ORA-01555

Sun Feb 23 14:03:37 2014
SMON: enabling cache recovery
Sun Feb 23 14:03:39 2014
ORA-01555 caused by SQL statement below (SQL ID: 4krwuz0ctqxdt, SCN: 0x0000.4f4e5405):
Sun Feb 23 14:03:39 2014
select ctime, mtime, stime from obj$ where obj# = :1
Sun Feb 23 14:03:39 2014
Errors in file e:\oracle\product\10.2.0\admin\ora10g\udump\orcl_ora_5504.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 1
ORA-01555: snapshot too old: rollback segment number 11 with name "_SYSSMU11$" too small
Error 704 happened during db open, shutting down database
USER: terminating instance due to error 704

通过修改scn,让数据库顺利open

ORACLE丢失各种文件导致数据库不能OPEN恢复

在ORACLE的运行过程中,总会遇到这样那样的故障,本篇主要大概介绍关于因硬件,系统,误删除等各种原因导致数据库的部分文件丢失,这里列出来由于文件丢失而出现的常见错误和基本处理思路

1.丢失数据文件(ORA-01157)
SQL> startup
ORACLE instance started.

Total System Global Area 260046848 bytes
Fixed Size 1266896 bytes
Variable Size 83888944 bytes
Database Buffers 167772160 bytes
Redo Buffers 7118848 bytes
Database mounted.
ORA-01157: cannot identify/lock data file 4 – see DBWR trace file
ORA-01110: data file 4: ‘/u01/oracle/oradata/XFF/users01.dbf’
数据文件丢失,处理方法:
1).使用备份还原丢失数据然后
2).非undo,system可以offline 掉该文件继续打开数据库
3).如果是undo需要谨慎,可能导致ORA-00376错误
4).如果是system offline可能导致ORA-01147

2. 丢失redo(ORA-00313)
SQL> startup
ORACLE instance started.

Total System Global Area 260046848 bytes
Fixed Size 1266896 bytes
Variable Size 83888944 bytes
Database Buffers 167772160 bytes
Redo Buffers 7118848 bytes
Database mounted.
ORA-00313: open failed for members of log group 1 of thread 1
ORA-00312: online log 1 thread 1: ‘/u01/oracle/oradata/XFF/redo01.log’
ORA-27037: unable to obtain file status
Linux Error: 2: No such file or directory
Additional information: 3
redo文件丢失,处理步骤:
1).查询v$log确认该redo是否是current或者active
2).确定该redo是否被归档
3).如果是inactive使用clear 或者 clear unarchived
4).如果是active或者current,需要通过不完全恢复,甚至隐含参数等方法解决

3. 丢失undo(ORA-01092 ORA-00376)
SQL> startup
ORACLE instance started.

Total System Global Area 260046848 bytes
Fixed Size 1266896 bytes
Variable Size 83888944 bytes
Database Buffers 167772160 bytes
Redo Buffers 7118848 bytes
Database mounted.
ORA-01157: cannot identify/lock data file 2 – see DBWR trace file
ORA-01110: data file 2: ‘/u01/oracle/oradata/XFF/undotbs01.dbf’

SQL> alter database datafile 2 offline drop;

Database altered.

SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-01092: ORACLE instance terminated. Disconnection forced

ORA-01092是前台错误,通过查询alert日志发现后台错误主要是:
Fri Oct 25 08:16:36 2013
Errors in file /u01/oracle/admin/XFF/bdump/xff_smon_7437.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-00376: file 2 cannot be read at this time
ORA-01110: data file 2: ‘/u01/oracle/oradata/XFF/undotbs01.dbf’
因为undo文件丢失,有事务无法正常回滚,从而出现该错误,需要通过使用隐含参数屏蔽事务来解决

4. 丢失system(ORA-01147)
SQL> startup
ORACLE instance started.

Total System Global Area 260046848 bytes
Fixed Size 1266896 bytes
Variable Size 83888944 bytes
Database Buffers 167772160 bytes
Redo Buffers 7118848 bytes
Database mounted.
ORA-01157: cannot identify/lock data file 1 – see DBWR trace file
ORA-01110: data file 1: ‘/u01/oracle/oradata/XFF/system01.dbf’

SQL> alter database datafile 1 offline drop;

Database altered.

SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-01147: SYSTEM tablespace file 1 is offline
ORA-01110: data file 1: ‘/u01/oracle/oradata/XFF/system01.dbf’
system表空间是系统表空间,该表空间中的数据文件不能被offline,如果该表空间数据文件丢失,数据库无法正常方法,可以考虑使用bbed模拟system文件欺骗数据库(非file# 1)或者使用dul抽取数据

5. 丢失控制文件(ORA-00205 ORA-00202)
SQL> startup
ORACLE instance started.

Total System Global Area 260046848 bytes
Fixed Size 1266896 bytes
Variable Size 83888944 bytes
Database Buffers 167772160 bytes
Redo Buffers 7118848 bytes
ORA-00205: error in identifying control file, check alert log for more info

ORA-00205是前台错误,具体需要结合日志分析:
Fri Oct 25 08:35:40 2013
ALTER DATABASE MOUNT
Fri Oct 25 08:35:40 2013
ORA-00202: control file: ‘/u01/oracle/oradata/XFF/control01.ctl’
ORA-27037: unable to obtain file status
Linux Error: 2: No such file or directory
Additional information: 3
这里可以看出来,是因为控制问文件丢失该值该错误,处理办法:
1).使用备份控制文件还原
2).查找是否还有其他控制文件,拷贝一份
3).列举数据文件重建控制文件

如果你在使用这些思路进行恢复遇到突发情况不能自行解决,请联系我们,将为您提供专业数据库技术支持
Phone:17813235971    Q Q:107644445    E-Mail:dba@xifenfei.com

姊妹篇
undo异常总结和恢复思路
ORACLE REDO各种异常恢复

DUL10直接支持ORACLE 8.0

在以前的文章中,写过DUL挖ORACLE 8.0数据库,使用的是dul 8的版本,现在测试dul 10直接支持ORACLE 8.0数据库
数据库版本 ORACLE 8

SVRMGR> select * from v$version;
BANNER
----------------------------------------------------------------
Oracle8 Release 8.0.5.0.0 - Production
PL/SQL Release 8.0.5.0.0 - Production
CORE Version 4.0.5.0.0 - Production
TNS for 32-bit Windows: Version 8.0.5.0.0 - Production
NLSRTL Version 3.3.2.0.0 - Production
5 rows selected.

dul版本 DUL 10

e:\dul10>dul
Data UnLoader: 10.2.0.5.26 - Internal Only - on Sat Feb 15 15:54:15 2014
with 64-bit io functions
Copyright (c) 1994 2014 Bernard van Duijnen All rights reserved.
 Strictly Oracle Internal Use Only
Found db_id = 1207542366
Found db_name = ORCL

DUL读取数据文件

DUL> show datafiles;
ts# rf# start   blocks offs open  err file name
UNK   1     0   102401    0    1    0 C:\Users\XIFENFEI\Desktop\temp\SYS1ORCL.ORA

DUL10参数配置

DUL> show parameter;
_SLPE_DEBUG               = FALSE
ALLOW_CHECKSUM_MISMATCH   = TRUE
ALLOW_DBA_MISMATCH        = TRUE
ALLOW_OTHER_OBJNO         = TRUE
ALLOW_TRAILER_MISMATCH    = TRUE
ASM_DO_HARD_CHECKS        = TRUE
AUTO_UPDATE_CHECKSUM      = TRUE
AUTO_UPDATE_TRAILER       = TRUE
BUFFER                    = 10485760
CF_FILES                  = 1022
CF_TABLESPACES            = 64
COMPATIBLE                = 8
CONTROL_FILE              = control.txt
DB_BLOCK_SIZE             = 2048
DB_NAME                   =
DB_ID                     = 0
DC_COLUMNS                = 100000
DC_OBJECTS                = 128k
DC_TABLES                 = 10000
DC_USERS                  = 1000
DC_SEGMENTS               = 10000
DC_EXTENTS                = 10000
DEFAULT_CHARACTER_SET     =
DEFAULT_NATIONAL_CHARACTER_SET =
EXPORT_MODE               = false
FEEDBACK                  = 1000
FILE                      =
FILE_SIZE_IN_MB           = 0
LDR_ENCLOSE_CHAR          = |
LDR_OUTPUT_IN_UTF8        = FALSE
LDR_PHYS_REC_SIZE         = 0
LOGFILE                   = dul.log
MAX_OPEN_FILES            = 8
OSD_MAX_THREADS           = 1055
OSD_BIG_ENDIAN_FLAG       = false
OSD_DBA_FILE_BITS         = 10
OSD_FILE_LEADER_SIZE      = 1
OSD_C_STRUCT_ALIGNMENT    = 32
OSD_WORD_SIZE             = 32
PARSE_HEX_ESCAPES         = FALSE
RESET_LOGFILE             = FALSE
SCAN_DATABASE_SCANS_LOB_SEGMENTS = TRUE
SCAN_STEP_SIZE            = 512
TRACE_FLAGS               = 0
UNEXP_MAX_ERRORS          = 1000
UNEXP_VERBOSE             = FALSE
USE_LOB_FILES             = TRUE
USE_SCANNED_EXTENT_MAP    = FALSE
VERIFY_NUMBER_PRECISION   = TRUE
WARN_RECREATE_FILES       = TRUE
WRITABLE_DATAFILES        = FALSE

DUL10加载ORACLE 8数据字典

DUL> bootstrap;
Probing file = 1, block = 527
  database version 8 bootstrap$ at file 1, block 352
. unloading table                BOOTSTRAP$
DUL: Warning: block number is non zero but marked deferred trying to process it anyhow
      52 rows unloaded
DUL: Warning: Dictionary cache DC_BOOTSTRAP is empty
Reading BOOTSTRAP.dat 52 entries loaded
Parsing Bootstrap$ contents
DUL: Warning: Recreating file "dict.ddl"
Generating dict.ddl for version 8
 OBJ$: segobjno 18, file 1 block 167
 TAB$: segobjno 2, tabno 1, file 1  block 52
 COL$: segobjno 2, tabno 5, file 1  block 52
 USER$: segobjno 10, tabno 1, file 1  block 147
Running generated file "@dict.ddl" to unload the dictionary tables
. unloading table                      OBJ$    3504 rows unloaded
. unloading table                      TAB$     434 rows unloaded
. unloading table                      COL$   16185 rows unloaded
. unloading table                     USER$      24 rows unloaded
Reading USER.dat 24 entries loaded
Reading OBJ.dat 3504 entries loaded and sorted 3504 entries
Reading TAB.dat 434 entries loaded
Reading COL.dat 16185 entries loaded and sorted 16185 entries
Reading BOOTSTRAP.dat 52 entries loaded
DUL: Error: No entry in OBJ$ for "TABCOMPART$" type = 2
DUL: Error: No base dict info for SYS.TABCOMPART$
DUL: Error: No entry in OBJ$ for "INDCOMPART$" type = 2
DUL: Error: No base dict info for SYS.INDCOMPART$
DUL: Error: No entry in OBJ$ for "TABSUBPART$" type = 2
DUL: Error: No base dict info for SYS.TABSUBPART$
DUL: Error: No entry in OBJ$ for "INDSUBPART$" type = 2
DUL: Error: No base dict info for SYS.INDSUBPART$
DUL: Warning: Recreating file "dict.ddl"
Generating dict.ddl for version 8
 OBJ$: segobjno 18, file 1 block 167
 TAB$: segobjno 2, tabno 1, file 1  block 52
 COL$: segobjno 2, tabno 5, file 1  block 52
 USER$: segobjno 10, tabno 1, file 1  block 147
 TABPART$: segobjno 180, file 1 block 1275
 INDPART$: segobjno 182, file 1 block 1285
 IND$: segobjno 2, tabno 3, file 1  block 52
 ICOL$: segobjno 2, tabno 4, file 1  block 52
 LOB$: segobjno 2, tabno 8, file 1  block 52
Running generated file "@dict.ddl" to unload the dictionary tables
. unloading table                      OBJ$
DUL: Warning: Recreating file "OBJ.ctl"
    3504 rows unloaded
. unloading table                      TAB$
DUL: Warning: Recreating file "TAB.ctl"
     434 rows unloaded
. unloading table                      COL$
DUL: Warning: Recreating file "COL.ctl"
   16185 rows unloaded
. unloading table                     USER$
DUL: Warning: Recreating file "USER.ctl"
      24 rows unloaded
. unloading table                  TABPART$       0 rows unloaded
. unloading table                  INDPART$       0 rows unloaded
. unloading table                      IND$     525 rows unloaded
. unloading table                     ICOL$     899 rows unloaded
. unloading table                      LOB$      27 rows unloaded
Reading USER.dat 24 entries loaded
Reading OBJ.dat 3504 entries loaded and sorted 3504 entries
Reading TAB.dat 434 entries loaded
Reading COL.dat 16185 entries loaded and sorted 16185 entries
Reading TABPART.dat 0 entries loaded and sorted 0 entries
Reading INDPART.dat 0 entries loaded and sorted 0 entries
Reading IND.dat 525 entries loaded
Reading LOB.dat 27 entries loaded
Reading ICOL.dat 899 entries loaded
Reading BOOTSTRAP.dat 52 entries loaded

DUL 10 unload ORACLE 8 TABLE

DUL> unload table ts$;
. unloading table                       TS$       4 rows unloaded
DUL>

通过测试,证明DUL10可以完美支持ORACLE 8.0数据库,在以后的低版本数据使用dul unload过程中,可以直接使用最新版本,而不用去到处寻找老版本dul

ORACLE 8.0.5 ORA-01207故障恢复

朋友和我说,他的数据库ORACLE 8.0.5出现ORA-01207,进行了尝试恢复但是别未成功,让我协助其完成恢复
数据库版本

SVRMGR> select * from v$version;
BANNER
----------------------------------------------------------------
Oracle8 Release 8.0.5.0.0 - Production
PL/SQL Release 8.0.5.0.0 - Production
CORE Version 4.0.5.0.0 - Production
TNS for 32-bit Windows: Version 8.0.5.0.0 - Production
NLSRTL Version 3.3.2.0.0 - Production
5 rows selected.

open数据库报ORA-01207错误

SVRMGR> alter database open;
alter database open
*
ORA-01122: database file 1 failed verification check
ORA-01110: data file 1: 'D:\ORANT\DATABASE\SYS1ORCL.ORA'
ORA-01207: file is more recent than controlfile - old controlfile

出现该错误的原因是因为控制文件里面的scn或者checkpoint_time>数据文件中的对应值,从而出现该错误,解决方法重建控制文件或者执行recover using backup controlfile 之类命令

重建控制文件,并open报ORA-600[4147]

SVRMGR> alter database backup controlfile to trace;
Statement processed.
SVRMGR> shutdown immediate
ORA-01109: database not open
Database dismounted.
ORACLE instance shut down.
SVRMGR> STARTUP NOMOUNT
ORACLE instance started.
Total System Global Area                         15077376 bytes
Fixed Size                                          49152 bytes
Variable Size                                    12906496 bytes
Database Buffers                                  2048000 bytes
Redo Buffers                                        73728 bytes
SVRMGR> CREATE CONTROLFILE REUSE DATABASE "ORCL" NORESETLOGS NOARCHIVELOG
     2>     MAXLOGFILES 32
     3>     MAXLOGMEMBERS 2
     4>     MAXDATAFILES 32
     5>     MAXINSTANCES 16
     6>     MAXLOGHISTORY 3260
     7> LOGFILE
     8>   GROUP 1 'D:\ORANT\DATABASE\LOG4ORCL.ORA'  SIZE 1M,
     9>   GROUP 2 'D:\ORANT\DATABASE\LOG3ORCL.ORA'  SIZE 1M,
    10>   GROUP 3 'D:\ORANT\DATABASE\LOG2ORCL.ORA'  SIZE 1M,
    11>   GROUP 4 'D:\ORANT\DATABASE\LOG1ORCL.ORA'  SIZE 1M
    12> DATAFILE
    13>   'D:\ORANT\DATABASE\SYS1ORCL.ORA',
    14>   'D:\ORANT\DATABASE\USR1ORCL.ORA',
    15>   'D:\ORANT\DATABASE\RBS1ORCL.ORA',
    16>   'D:\ORANT\DATABASE\TMP1ORCL.ORA'
    17> ;
Statement processed.
SVRMGR> recover database using backup controlfile;
ORA-00279: change 46960617 generated at 01/31/14 18:51:49 needed for thread 1
ORA-00289: suggestion : D:\ORANT\RDBMS80\ARC12900.1
Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
D:\ORANT\DATABASE\LOG3ORCL.ORA
Log applied.
Media recovery complete.
SVRMGR> alter database open;
alter database open
*
ORA-00600: internal error code, arguments: [4147], [16], [1], [], [], [], [], []

The ORA-600[4147] basically indicates some kind of corruption with the UNDO (rollback segment)block, most probably due to a lost write to the rollback segment.
ORA-600[4147]是因为回滚段坏块导致(具体是因为undoblock的scn无效),解决方法是用dul找出来回滚段,并屏蔽之

继续恢复报ORA-00600[3668]

SVRMGR> shutdown immediate
ORA-01109: database not open
Database dismounted.
ORACLE instance shut down.
SVRMGR> startup
ORACLE instance started.
Total System Global Area                         15077376 bytes
Fixed Size                                          49152 bytes
Variable Size                                    12906496 bytes
Database Buffers                                  2048000 bytes
Redo Buffers                                        73728 bytes
Database mounted.
ORA-00600: internal error code, arguments: [3668], [1], [2], [17232], [17232], [4], [], []

ORA-00600[3668]是因为在ORACLE 7.0到9.2的版本中The FIRST time an attempt has been made to start an instance after a CREATE CONTROLFILE command has been issued.
At least one data file needs MEDIA RECOVERY.在9.2.0.x及其以后版本报:ORA-1113: file needs media recovery.
通过重建控制文件,执行recover database,再open数据库恢复成功