ORACLE丢失各种文件导致数据库不能OPEN恢复

在ORACLE的运行过程中,总会遇到这样那样的故障,本篇主要大概介绍关于因硬件,系统,误删除等各种原因导致数据库的部分文件丢失,这里列出来由于文件丢失而出现的常见错误和基本处理思路

1.丢失数据文件(ORA-01157)
SQL> startup
ORACLE instance started.

Total System Global Area 260046848 bytes
Fixed Size 1266896 bytes
Variable Size 83888944 bytes
Database Buffers 167772160 bytes
Redo Buffers 7118848 bytes
Database mounted.
ORA-01157: cannot identify/lock data file 4 – see DBWR trace file
ORA-01110: data file 4: ‘/u01/oracle/oradata/XFF/users01.dbf’
数据文件丢失,处理方法:
1).使用备份还原丢失数据然后
2).非undo,system可以offline 掉该文件继续打开数据库
3).如果是undo需要谨慎,可能导致ORA-00376错误
4).如果是system offline可能导致ORA-01147

2. 丢失redo(ORA-00313)
SQL> startup
ORACLE instance started.

Total System Global Area 260046848 bytes
Fixed Size 1266896 bytes
Variable Size 83888944 bytes
Database Buffers 167772160 bytes
Redo Buffers 7118848 bytes
Database mounted.
ORA-00313: open failed for members of log group 1 of thread 1
ORA-00312: online log 1 thread 1: ‘/u01/oracle/oradata/XFF/redo01.log’
ORA-27037: unable to obtain file status
Linux Error: 2: No such file or directory
Additional information: 3
redo文件丢失,处理步骤:
1).查询v$log确认该redo是否是current或者active
2).确定该redo是否被归档
3).如果是inactive使用clear 或者 clear unarchived
4).如果是active或者current,需要通过不完全恢复,甚至隐含参数等方法解决

3. 丢失undo(ORA-01092 ORA-00376)
SQL> startup
ORACLE instance started.

Total System Global Area 260046848 bytes
Fixed Size 1266896 bytes
Variable Size 83888944 bytes
Database Buffers 167772160 bytes
Redo Buffers 7118848 bytes
Database mounted.
ORA-01157: cannot identify/lock data file 2 – see DBWR trace file
ORA-01110: data file 2: ‘/u01/oracle/oradata/XFF/undotbs01.dbf’

SQL> alter database datafile 2 offline drop;

Database altered.

SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-01092: ORACLE instance terminated. Disconnection forced

ORA-01092是前台错误,通过查询alert日志发现后台错误主要是:
Fri Oct 25 08:16:36 2013
Errors in file /u01/oracle/admin/XFF/bdump/xff_smon_7437.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-00376: file 2 cannot be read at this time
ORA-01110: data file 2: ‘/u01/oracle/oradata/XFF/undotbs01.dbf’
因为undo文件丢失,有事务无法正常回滚,从而出现该错误,需要通过使用隐含参数屏蔽事务来解决

4. 丢失system(ORA-01147)
SQL> startup
ORACLE instance started.

Total System Global Area 260046848 bytes
Fixed Size 1266896 bytes
Variable Size 83888944 bytes
Database Buffers 167772160 bytes
Redo Buffers 7118848 bytes
Database mounted.
ORA-01157: cannot identify/lock data file 1 – see DBWR trace file
ORA-01110: data file 1: ‘/u01/oracle/oradata/XFF/system01.dbf’

SQL> alter database datafile 1 offline drop;

Database altered.

SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-01147: SYSTEM tablespace file 1 is offline
ORA-01110: data file 1: ‘/u01/oracle/oradata/XFF/system01.dbf’
system表空间是系统表空间,该表空间中的数据文件不能被offline,如果该表空间数据文件丢失,数据库无法正常方法,可以考虑使用bbed模拟system文件欺骗数据库(非file# 1)或者使用dul抽取数据

5. 丢失控制文件(ORA-00205 ORA-00202)
SQL> startup
ORACLE instance started.

Total System Global Area 260046848 bytes
Fixed Size 1266896 bytes
Variable Size 83888944 bytes
Database Buffers 167772160 bytes
Redo Buffers 7118848 bytes
ORA-00205: error in identifying control file, check alert log for more info

ORA-00205是前台错误,具体需要结合日志分析:
Fri Oct 25 08:35:40 2013
ALTER DATABASE MOUNT
Fri Oct 25 08:35:40 2013
ORA-00202: control file: ‘/u01/oracle/oradata/XFF/control01.ctl’
ORA-27037: unable to obtain file status
Linux Error: 2: No such file or directory
Additional information: 3
这里可以看出来,是因为控制问文件丢失该值该错误,处理办法:
1).使用备份控制文件还原
2).查找是否还有其他控制文件,拷贝一份
3).列举数据文件重建控制文件

如果你在使用这些思路进行恢复遇到突发情况不能自行解决,请联系我们,将为您提供专业数据库技术支持
Phone:17813235971    Q Q:107644445    E-Mail:dba@xifenfei.com

姊妹篇
undo异常总结和恢复思路
ORACLE REDO各种异常恢复

DUL10直接支持ORACLE 8.0

在以前的文章中,写过DUL挖ORACLE 8.0数据库,使用的是dul 8的版本,现在测试dul 10直接支持ORACLE 8.0数据库
数据库版本 ORACLE 8

SVRMGR> select * from v$version;
BANNER
----------------------------------------------------------------
Oracle8 Release 8.0.5.0.0 - Production
PL/SQL Release 8.0.5.0.0 - Production
CORE Version 4.0.5.0.0 - Production
TNS for 32-bit Windows: Version 8.0.5.0.0 - Production
NLSRTL Version 3.3.2.0.0 - Production
5 rows selected.

dul版本 DUL 10

e:\dul10>dul
Data UnLoader: 10.2.0.5.26 - Internal Only - on Sat Feb 15 15:54:15 2014
with 64-bit io functions
Copyright (c) 1994 2014 Bernard van Duijnen All rights reserved.
 Strictly Oracle Internal Use Only
Found db_id = 1207542366
Found db_name = ORCL

DUL读取数据文件

DUL> show datafiles;
ts# rf# start   blocks offs open  err file name
UNK   1     0   102401    0    1    0 C:\Users\XIFENFEI\Desktop\temp\SYS1ORCL.ORA

DUL10参数配置

DUL> show parameter;
_SLPE_DEBUG               = FALSE
ALLOW_CHECKSUM_MISMATCH   = TRUE
ALLOW_DBA_MISMATCH        = TRUE
ALLOW_OTHER_OBJNO         = TRUE
ALLOW_TRAILER_MISMATCH    = TRUE
ASM_DO_HARD_CHECKS        = TRUE
AUTO_UPDATE_CHECKSUM      = TRUE
AUTO_UPDATE_TRAILER       = TRUE
BUFFER                    = 10485760
CF_FILES                  = 1022
CF_TABLESPACES            = 64
COMPATIBLE                = 8
CONTROL_FILE              = control.txt
DB_BLOCK_SIZE             = 2048
DB_NAME                   =
DB_ID                     = 0
DC_COLUMNS                = 100000
DC_OBJECTS                = 128k
DC_TABLES                 = 10000
DC_USERS                  = 1000
DC_SEGMENTS               = 10000
DC_EXTENTS                = 10000
DEFAULT_CHARACTER_SET     =
DEFAULT_NATIONAL_CHARACTER_SET =
EXPORT_MODE               = false
FEEDBACK                  = 1000
FILE                      =
FILE_SIZE_IN_MB           = 0
LDR_ENCLOSE_CHAR          = |
LDR_OUTPUT_IN_UTF8        = FALSE
LDR_PHYS_REC_SIZE         = 0
LOGFILE                   = dul.log
MAX_OPEN_FILES            = 8
OSD_MAX_THREADS           = 1055
OSD_BIG_ENDIAN_FLAG       = false
OSD_DBA_FILE_BITS         = 10
OSD_FILE_LEADER_SIZE      = 1
OSD_C_STRUCT_ALIGNMENT    = 32
OSD_WORD_SIZE             = 32
PARSE_HEX_ESCAPES         = FALSE
RESET_LOGFILE             = FALSE
SCAN_DATABASE_SCANS_LOB_SEGMENTS = TRUE
SCAN_STEP_SIZE            = 512
TRACE_FLAGS               = 0
UNEXP_MAX_ERRORS          = 1000
UNEXP_VERBOSE             = FALSE
USE_LOB_FILES             = TRUE
USE_SCANNED_EXTENT_MAP    = FALSE
VERIFY_NUMBER_PRECISION   = TRUE
WARN_RECREATE_FILES       = TRUE
WRITABLE_DATAFILES        = FALSE

DUL10加载ORACLE 8数据字典

DUL> bootstrap;
Probing file = 1, block = 527
  database version 8 bootstrap$ at file 1, block 352
. unloading table                BOOTSTRAP$
DUL: Warning: block number is non zero but marked deferred trying to process it anyhow
      52 rows unloaded
DUL: Warning: Dictionary cache DC_BOOTSTRAP is empty
Reading BOOTSTRAP.dat 52 entries loaded
Parsing Bootstrap$ contents
DUL: Warning: Recreating file "dict.ddl"
Generating dict.ddl for version 8
 OBJ$: segobjno 18, file 1 block 167
 TAB$: segobjno 2, tabno 1, file 1  block 52
 COL$: segobjno 2, tabno 5, file 1  block 52
 USER$: segobjno 10, tabno 1, file 1  block 147
Running generated file "@dict.ddl" to unload the dictionary tables
. unloading table                      OBJ$    3504 rows unloaded
. unloading table                      TAB$     434 rows unloaded
. unloading table                      COL$   16185 rows unloaded
. unloading table                     USER$      24 rows unloaded
Reading USER.dat 24 entries loaded
Reading OBJ.dat 3504 entries loaded and sorted 3504 entries
Reading TAB.dat 434 entries loaded
Reading COL.dat 16185 entries loaded and sorted 16185 entries
Reading BOOTSTRAP.dat 52 entries loaded
DUL: Error: No entry in OBJ$ for "TABCOMPART$" type = 2
DUL: Error: No base dict info for SYS.TABCOMPART$
DUL: Error: No entry in OBJ$ for "INDCOMPART$" type = 2
DUL: Error: No base dict info for SYS.INDCOMPART$
DUL: Error: No entry in OBJ$ for "TABSUBPART$" type = 2
DUL: Error: No base dict info for SYS.TABSUBPART$
DUL: Error: No entry in OBJ$ for "INDSUBPART$" type = 2
DUL: Error: No base dict info for SYS.INDSUBPART$
DUL: Warning: Recreating file "dict.ddl"
Generating dict.ddl for version 8
 OBJ$: segobjno 18, file 1 block 167
 TAB$: segobjno 2, tabno 1, file 1  block 52
 COL$: segobjno 2, tabno 5, file 1  block 52
 USER$: segobjno 10, tabno 1, file 1  block 147
 TABPART$: segobjno 180, file 1 block 1275
 INDPART$: segobjno 182, file 1 block 1285
 IND$: segobjno 2, tabno 3, file 1  block 52
 ICOL$: segobjno 2, tabno 4, file 1  block 52
 LOB$: segobjno 2, tabno 8, file 1  block 52
Running generated file "@dict.ddl" to unload the dictionary tables
. unloading table                      OBJ$
DUL: Warning: Recreating file "OBJ.ctl"
    3504 rows unloaded
. unloading table                      TAB$
DUL: Warning: Recreating file "TAB.ctl"
     434 rows unloaded
. unloading table                      COL$
DUL: Warning: Recreating file "COL.ctl"
   16185 rows unloaded
. unloading table                     USER$
DUL: Warning: Recreating file "USER.ctl"
      24 rows unloaded
. unloading table                  TABPART$       0 rows unloaded
. unloading table                  INDPART$       0 rows unloaded
. unloading table                      IND$     525 rows unloaded
. unloading table                     ICOL$     899 rows unloaded
. unloading table                      LOB$      27 rows unloaded
Reading USER.dat 24 entries loaded
Reading OBJ.dat 3504 entries loaded and sorted 3504 entries
Reading TAB.dat 434 entries loaded
Reading COL.dat 16185 entries loaded and sorted 16185 entries
Reading TABPART.dat 0 entries loaded and sorted 0 entries
Reading INDPART.dat 0 entries loaded and sorted 0 entries
Reading IND.dat 525 entries loaded
Reading LOB.dat 27 entries loaded
Reading ICOL.dat 899 entries loaded
Reading BOOTSTRAP.dat 52 entries loaded

DUL 10 unload ORACLE 8 TABLE

DUL> unload table ts$;
. unloading table                       TS$       4 rows unloaded
DUL>

通过测试,证明DUL10可以完美支持ORACLE 8.0数据库,在以后的低版本数据使用dul unload过程中,可以直接使用最新版本,而不用去到处寻找老版本dul

ORACLE 8.0.5 ORA-01207故障恢复

朋友和我说,他的数据库ORACLE 8.0.5出现ORA-01207,进行了尝试恢复但是别未成功,让我协助其完成恢复
数据库版本

SVRMGR> select * from v$version;
BANNER
----------------------------------------------------------------
Oracle8 Release 8.0.5.0.0 - Production
PL/SQL Release 8.0.5.0.0 - Production
CORE Version 4.0.5.0.0 - Production
TNS for 32-bit Windows: Version 8.0.5.0.0 - Production
NLSRTL Version 3.3.2.0.0 - Production
5 rows selected.

open数据库报ORA-01207错误

SVRMGR> alter database open;
alter database open
*
ORA-01122: database file 1 failed verification check
ORA-01110: data file 1: 'D:\ORANT\DATABASE\SYS1ORCL.ORA'
ORA-01207: file is more recent than controlfile - old controlfile

出现该错误的原因是因为控制文件里面的scn或者checkpoint_time>数据文件中的对应值,从而出现该错误,解决方法重建控制文件或者执行recover using backup controlfile 之类命令

重建控制文件,并open报ORA-600[4147]

SVRMGR> alter database backup controlfile to trace;
Statement processed.
SVRMGR> shutdown immediate
ORA-01109: database not open
Database dismounted.
ORACLE instance shut down.
SVRMGR> STARTUP NOMOUNT
ORACLE instance started.
Total System Global Area                         15077376 bytes
Fixed Size                                          49152 bytes
Variable Size                                    12906496 bytes
Database Buffers                                  2048000 bytes
Redo Buffers                                        73728 bytes
SVRMGR> CREATE CONTROLFILE REUSE DATABASE "ORCL" NORESETLOGS NOARCHIVELOG
     2>     MAXLOGFILES 32
     3>     MAXLOGMEMBERS 2
     4>     MAXDATAFILES 32
     5>     MAXINSTANCES 16
     6>     MAXLOGHISTORY 3260
     7> LOGFILE
     8>   GROUP 1 'D:\ORANT\DATABASE\LOG4ORCL.ORA'  SIZE 1M,
     9>   GROUP 2 'D:\ORANT\DATABASE\LOG3ORCL.ORA'  SIZE 1M,
    10>   GROUP 3 'D:\ORANT\DATABASE\LOG2ORCL.ORA'  SIZE 1M,
    11>   GROUP 4 'D:\ORANT\DATABASE\LOG1ORCL.ORA'  SIZE 1M
    12> DATAFILE
    13>   'D:\ORANT\DATABASE\SYS1ORCL.ORA',
    14>   'D:\ORANT\DATABASE\USR1ORCL.ORA',
    15>   'D:\ORANT\DATABASE\RBS1ORCL.ORA',
    16>   'D:\ORANT\DATABASE\TMP1ORCL.ORA'
    17> ;
Statement processed.
SVRMGR> recover database using backup controlfile;
ORA-00279: change 46960617 generated at 01/31/14 18:51:49 needed for thread 1
ORA-00289: suggestion : D:\ORANT\RDBMS80\ARC12900.1
Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
D:\ORANT\DATABASE\LOG3ORCL.ORA
Log applied.
Media recovery complete.
SVRMGR> alter database open;
alter database open
*
ORA-00600: internal error code, arguments: [4147], [16], [1], [], [], [], [], []

The ORA-600[4147] basically indicates some kind of corruption with the UNDO (rollback segment)block, most probably due to a lost write to the rollback segment.
ORA-600[4147]是因为回滚段坏块导致(具体是因为undoblock的scn无效),解决方法是用dul找出来回滚段,并屏蔽之

继续恢复报ORA-00600[3668]

SVRMGR> shutdown immediate
ORA-01109: database not open
Database dismounted.
ORACLE instance shut down.
SVRMGR> startup
ORACLE instance started.
Total System Global Area                         15077376 bytes
Fixed Size                                          49152 bytes
Variable Size                                    12906496 bytes
Database Buffers                                  2048000 bytes
Redo Buffers                                        73728 bytes
Database mounted.
ORA-00600: internal error code, arguments: [3668], [1], [2], [17232], [17232], [4], [], []

ORA-00600[3668]是因为在ORACLE 7.0到9.2的版本中The FIRST time an attempt has been made to start an instance after a CREATE CONTROLFILE command has been issued.
At least one data file needs MEDIA RECOVERY.在9.2.0.x及其以后版本报:ORA-1113: file needs media recovery.
通过重建控制文件,执行recover database,再open数据库恢复成功

关于dul有效期描述

dul是oracle内部工具,用于在数据库不open启动下挖取数据文件内容,从而最小程度减少数据损失.主要可以恢复如下情况:
1.用于异常断电,强制关闭数据库等故障导致数据库使用隐含参数,bbed等各种非常规数据库恢复方法无法正常open的数据库恢复
2.用于恢复无truncate,drop表恢复
3.用于丢失system表空间文件数据库恢复
4.用于大量坏块数据库恢复
5.用于丢失部分非system数据文件恢复
关于dul的相关使用文档请参考:ORACLE DUL汇总,因为dul是oracle内部工具,不需要数据库open就可以获得数据,为了数据安全,dul软件增加了有效期,本文就是对于dul的有效期进行了描述
关于dul的有效期
熟悉dul的人都清楚,dul从10开始就有有效期之说,一般的有效期是1到2个月,极少数情况会到3个月的有效期.通过试验验证

--使用上一期dul,在未修改系统时间的情况下提示过期
e:\dul10>dul
Data UnLoader: 10.2.0.5.25 - Internal Only - on Sat Jan 25 16:17:41 2014
with 64-bit io functions
Copyright (c) 1994 2013 Bernard van Duijnen All rights reserved.
 Strictly Oracle Internal Use Only
Cannot start now
You need a more recent DUL version for this os
--修改系统时间,提示加载成功
e:\dul10>dul
Data UnLoader: 10.2.0.5.25 - Internal Only - on Sat Dec 21 16:21:16 2013
with 64-bit io functions
Copyright (c) 1994 2013 Bernard van Duijnen All rights reserved.
 Strictly Oracle Internal Use Only
--配置dul挖取数据文件,dul又提示和以前一样错误
e:\dul10>dul
Data UnLoader: 10.2.0.5.25 - Internal Only - on Sat Dec 21 16:19:49 2013
with 64-bit io functions
Copyright (c) 1994 2013 Bernard van Duijnen All rights reserved.
 Strictly Oracle Internal Use Only
Cannot open file now
You need a more recent DUL version for this os
--使用最新版dul,在不修改系统时间和配置挖取文件的情况下均正常使用
e:\dul10>dul
Data UnLoader: 10.2.0.5.26 - Internal Only - on Sat Jan 25 16:22:15 2014
with 64-bit io functions
Copyright (c) 1994 2014 Bernard van Duijnen All rights reserved.
 Strictly Oracle Internal Use Only
e:\dul10>dul
Data UnLoader: 10.2.0.5.26 - Internal Only - on Sat Jan 25 16:22:39 2014
with 64-bit io functions
Copyright (c) 1994 2014 Bernard van Duijnen All rights reserved.
 Strictly Oracle Internal Use Only
Found db_id = 4156511432
Found db_name = ORA10G

这里可以知道dul是通过系统时间和数据文件头时间两重保证该软件安全,不能通过修改系统时间来破解

使用DUL挖数据文件恢复非数据外对象方法

在dul进行数据库挖掘恢复的时候,我们可以通过unload table/user等方式来恢复表数据,但是对于一些view,index,trigger,source,seq,Dblink等不能直接通过unload来实现,但是可以通过挖基表来实现相关操作,这里提供了一些处理思路,在实际操作中根据需求,分析数据字典灵活应用
一.view
导出对象

USER$
OBJ$
COL$
VIEW$

执行sql语句

Set pages 10000
Set long 1000
Spool d:\create_view.sql
select
  'CREATE OR REPLACE VIEW '||O.NAME||' ('||
   replace(c.cols,',',','||chr(10))||')'||CHR(10)||
  'as'||chr(10), v.text
from
user$ u, obj$ o, view$ v,
( SELECT COL.OBJ#, COL.COLS
  FROM
  (SELECT
    OBJ#, COL#, substr(SYS_CONNECT_BY_PATH(NAME,','),2) COLS
  FROM COL$
  WHERE COL# > 0
  START WITH COL# = 1
  CONNECT BY PRIOR OBJ# = OBJ# AND PRIOR COL# = COL# - 1 ) COL,
  (SELECT OBJ#, COUNT(*) COLCNT FROM COL$
  WHERE COL# > 0 GROUP BY OBJ#) CN
  WHERE COL.OBJ# = CN.OBJ# AND COL.COL# = CN.COLCNT
) C
where u.user#=o.owner# and o.obj# = c.obj#
  and v.obj# = o.obj# and u.name=upper('&username');

说明
1) 分布执行,不能放置一个脚本文件中执行
2) 每条as后面的select语句可能需要重新格式化
3) Create view 语句最后需要增加”;”

二.source
导出对象

USER$
SOURCE$
OBJ$

执行sql语句

Set pages 10000
Set long 1000
set linse 1000
Spool d:\create_source.sql
SELECT DECODE(S.LINE,1,'CREATE OR REPLACE ','')||SOURCE SOURCE
FROM
  USER$ U, OBJ$  O, SOURCE$ S
WHERE
  U.USER# = O.OWNER# AND
  O.OBJ# = S.OBJ# AND
  U.NAME = UPPER('&username')
  -- AND O.NAME = UPPER('&SOURCE_NAME')
ORDER BY S.OBJ#, S.LINE;

说明
1) 注意SOURCE中的用户名,如果导入不是相同用户,需要修改该脚本用户名
2) 修改完用户名后,直接执行生成脚本即可

三.Index
导出对象

USER$
OBJ$
COL$
IND$
ICOL$

执行sql

Set pages 10000
Set long 1000
set linse 1000
Spool d:\create_index.sql
SELECT
  'CREATE '||decode(bitand(IDX.property, 1), 1, 'UNIQUE', '')||
  ' INDEX '||I.NAME||' ON '||T.NAME||'('||IDX.PATH||');' INDEX_DDL
FROM
  USER$ U, OBJ$  T, OBJ$ I,
  (
    select I.PROPERTY, I.BO#, I.OBJ#, C.POS#,
            SUBSTR(sys_connect_by_path(CN.NAME,','),2) path
    from IND$ I, ICOL$ C, COL$ CN
    WHERE I.OBJ# = C.OBJ# AND I.BO# = C.BO#
      AND I.BO# = CN.OBJ# AND C.COL# = CN.INTCOL#
    start with C.POS#=1
    connect by PRIOR I.OBJ# = I.OBJ#
            AND prior C.POS# = C.POS# - 1 ) IDX,
  (SELECT I.BO#, I.OBJ#, COUNT(*) COLCNT
    FROM ICOL$ I GROUP BY I.BO#, I.OBJ#) IDXC
WHERE
  U.USER# = T.OWNER# AND
  IDX.BO# = T.OBJ# AND
  IDX.OBJ# = I.OBJ# AND
  IDX.BO# =  IDXC.BO# AND
  IDX.OBJ# = IDXC.OBJ# AND
  IDX.POS# = IDXC.COLCNT AND
  U.NAME = upper('&username')
ORDER BY T.NAME, I.NAME;

说明
1) 因为SYS_CONNECT_BY_PATH所以需要10g及其以上版本
2) SQL中没有分区唯一性索引
3) 注意检查sql是否因为行长度不够导致异常

四.Sequence
导出对象

USER$
OBJ$
SEQ$

执行sql语句

Set pages 10000
Set long 1000
set linse 1000
Spool d:\create_sequence.sql
SELECT
  'CREATE SEQUENCE '|| SEQ_NAME ||
  ' MINVALUE '||minval ||
  ' MAXVALUE '||MAXVAL ||
  ' START WITH '||LASTVAL ||
  ' ' || CYC || ' ' || ORD ||
  DECODE(SIGN(CACHE), 1,' CACHE '|| CACHE, 'NOCACHE') ||
  ';' SEQ_DDL
from
  (select u.name OWNER, o.name SEQ_NAME,
      s.minvalue MINVAL, s.maxvalue MAXVAL,
      s.increment$ INC,
      decode (s.cycle#, 0, 'NOCYCLE', 1, 'CYCLE ') CYC,
      decode (s.order$, 0, 'NOORDER', 1, 'ORDER') ORD,
      s.cache, s.highwater LASTVAL
  from seq$ s, obj$ o, user$ u
  where u.user# = o.owner#
    and o.obj# = s.obj#
and u.name=upper('&username'));

五.TRIGGER
导出对象

OBJ$
USER$
TRIGGER$

执行sql语句

Set pages 10000
Set long 1000
set linse 1000
Spool d:\create_trigger.sql
select
   'CREATE OR REPLACE TRIGGER '|| trigger_name || chr(10)||
   decode( substr( trigger_type, 1, 1 ),
           'A', 'AFTER ', 'B', 'BEFORE ', 'I',
           'INSTEAD OF ' ) ||
   triggering_event || ' ON ' || table_owner || '.' ||
   table_name || chr(10) || REF_CLAUSE || chr(10) ||
   decode( instr( trigger_type, 'EACH ROW' ), 0, null,
           'FOR EACH ROW' ), trigger_body
from  (
   select trigusr.name owner, trigobj.name trigger_name,
      decode(t.type#, 0, 'BEFORE STATEMENT',
           1, 'BEFORE EACH ROW',   2, 'AFTER STATEMENT',
           3, 'AFTER EACH ROW',    4, 'INSTEAD OF',
           'UNDEFINED') trigger_type,
   decode(t.insert$*100 + t.update$*10 + t.delete$,
           100, 'INSERT', 010, 'UPDATE', 001, 'DELETE',
           110, 'INSERT OR UPDATE', 101, 'INSERT OR DELETE',
           011, 'UPDATE OR DELETE',
           111, 'INSERT OR UPDATE OR DELETE',
           'ERROR') triggering_event,
   tabusr.name table_owner, tabobj.name table_name,
   'REFERENCING NEW AS '||t.refnewname||' OLD AS '||t.refoldname REF_CLAUSE,
   t.whenclause,decode(t.enabled, 0, 'DISABLED', 1, 'ENABLED', 'ERROR') STATUS,
   t.definition , t.action# trigger_body
   from obj$ trigobj, obj$ tabobj, trigger$ t,
        user$ tabusr, user$ trigusr
   where (trigobj.obj#   = t.obj# and
       tabobj.obj#    = t.baseobject and
       tabobj.owner#  = tabusr.user# and
       trigobj.owner# = trigusr.user# and
       bitand(t.property, 63)     < 8 ))
where table_owner=upper('&username')
order by owner, trigger_name;

六. Dblink
导出对象

Sys.link$
sys.user$

执行查询sql

SELECT 'CREATE '||DECODE(U.NAME,'PUBLIC','public ')||'DATABASE LINK '||CHR(10)
||DECODE(U.NAME,'PUBLIC',Null, 'SYS','',U.NAME||'.')|| L.NAME||chr(10)
||'CONNECT TO ' || L.USERID || ' IDENTIFIED BY "'||L.PASSWORD||'" USING
'''||L.HOST||''''
||chr(10)||';' TEXT
FROM SYS.LINK$ L, SYS.USER$ U
WHERE L.OWNER# = U.USER#;

11.2.0.3 adg库因 bug 16427872 导致smon占用大量cpu

检查数据库发现客户有一套核心的ADG库smon进程负载异常,单进程一直持有cpu 100%

[oracle@q9adg01 trace]$ top -c
top - 14:00:14 up 83 days, 21:39,  4 users,  load average: 10.34, 11.55, 11.25
Tasks: 1162 total,   3 running, 1157 sleeping,   0 stopped,   2 zombie
Cpu(s):  1.7%us,  1.2%sy,  0.0%ni, 86.2%id, 10.7%wa,  0.0%hi,  0.1%si,  0.0%st
Mem:  264253752k total, 200445076k used, 63808676k free,   757684k buffers
Swap: 33554424k total,        0k used, 33554424k free,  6529220k cached
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 5707 oracle    25   0  150g  20m  16m R 99.9  0.0  14273:00 ora_smon_q9db1
 5285 oracle    16   0 13564 1952  820 R 31.5  0.0   0:02.49 top -c
 5713 oracle    18   0  150g  20m  17m S  5.3  0.0 410:01.33 ora_asmb_q9db1
 5821 oracle    15   0  150g  23m  17m S  5.3  0.0   4883:29 ora_lck0_q9db1
 7596 oracle    15   0  150g  69m  37m S  5.3  0.0   5368:28 ora_pr00_q9db1
[oracle@q9adg02 ~]$ top -c
top - 14:00:03 up 84 days, 19:36,  3 users,  load average: 6.46, 6.96, 6.76
Tasks: 1045 total,   5 running, 1040 sleeping,   0 stopped,   0 zombie
Cpu(s):  1.8%us,  1.0%sy,  0.0%ni, 93.4%id,  3.7%wa,  0.0%hi,  0.1%si,  0.0%st
Mem:  264253752k total, 196879216k used, 67374536k free,   425320k buffers
Swap: 33554424k total,        0k used, 33554424k free,  4727836k cached
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
11615 oracle    25   0  150g  16m  14m R 100.0  0.0  14272:55 ora_smon_q9db2
18173 oracle    16   0  150g  73m  37m D 18.6  0.0  24:33.91 oracleq9db2 (LOCAL=NO)
 6561 oracle    15   0  150g  31m  25m R 12.2  0.0   0:48.50 oracleq9db2 (LOCAL=NO)

数据库版本和patch信息

14:18:05 sys@Q9DB>select * from v$version;
BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
PL/SQL Release 11.2.0.3.0 - Production
CORE    11.2.0.3.0      Production
TNS for Linux: Version 11.2.0.3.0 - Production
NLSRTL Version 11.2.0.3.0 - Production
SQL>  SELECT INST_ID,DATABASE_ROLE,OPEN_MODE FROM GV$DATABASE;
   INST_ID DATABASE_ROLE    OPEN_MODE
---------- ---------------- --------------------
         2 PHYSICAL STANDBY READ ONLY WITH APPLY
         1 PHYSICAL STANDBY READ ONLY WITH APPLY
SQL >select inst_id,STARTUP_TIME from gv$instance;
   INST_ID STARTUP_T
---------- ---------
         2 01-NOV-13
         1 01-NOV-13
[oracle@q9adg01 trace]$ /u01/app/oracle/product/11.2.0/db_1/OPatch/opatch  lspatches
16056266;Database Patch Set Update : 11.2.0.3.6 (16056266)
16315641;Grid Infrastructure Patch Set Update : 11.2.0.3.6 (16083653)

SYSAUX表空间增加数据文件

SQL> select ts# from v$tablespace where name='SYSAUX';
       TS#
----------
         2
SQL> select file#,name,creation_time from v$datafile where ts#=2;
     FILE# NAME                                               CREATION_
---------- -------------------------------------------------- ---------
         3 +DATA/q9db/datafile/sysaux.1412.818566605          12-MAR-08
       151 +DATA/q9db/datafile/sysaux.1431.818566885          26-MAR-12
       221 +DATA/q9db/datafile/sysaux.828.818547945           16-APR-12
      1744 +DATA/q9db_adg/datafile/sysaux.2050.835459505      29-DEC-13

核对数据库确实在2013年12月29日对SYSAUX表空间增加了数据文件而且未重启数据库,触发Bug 16427872 Standby SMON spins on CPU after add/drop SYSAUX datafile on primary
bug-16427872
Bug 16427872 Standby SMON spins on CPU after add/drop SYSAUX datafile on primary
在12.1.0.1中修复,在未修复前增加/删除sysaux的数据文件后,通过重启实例来解决该问题

multipath实现设备用户组设置

现在的Linux系统中,很多都会使用系统自带的multipath多路径软件,在以前的版本中,我们一般通过multipath+udev或者multipath+rc.local来实现多路径和权限设置,而在redhat 5.3-5.11的版本中multipath就直接可以实现多路径聚合、设备持久化、用户组设置
操作系统版本

[root@rac1 dev]# uname -r
2.6.39-300.26.1.el5uek
[root@rac1 dev]# more /etc/issue
Oracle Linux Server release 5.9
Kernel \r on an \m

fdisk记录

[root@rac1 dev]# fdisk -l
…………
Disk /dev/sdh: 134.2 GB, 134217728000 bytes
255 heads, 63 sectors/track, 16317 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
   Device Boot      Start         End      Blocks   Id  System
Disk /dev/sdi: 33.5 GB, 33554432000 bytes
64 heads, 32 sectors/track, 32000 cylinders
Units = cylinders of 2048 * 512 = 1048576 bytes
   Device Boot      Start         End      Blocks   Id  System

multipath包
检查安装multipath相关包(该版本系统默认安装)

[root@rac1 dev]# rpm -aq|grep mapper
device-mapper-multipath-libs-0.4.9-56.0.3.el5
device-mapper-event-1.02.67-2.el5
device-mapper-1.02.67-2.el5
device-mapper-multipath-0.4.9-56.0.3.el5

获取wwid值

[root@rac1 dev]# /sbin/scsi_id -g -u -s /block/sdh
14f504e46494c45527049754962662d395751372d68356743
[root@rac1 dev]# /sbin/scsi_id -g -u -s /block/sdi
14f504e46494c4552484d486249782d464471382d354f4b58

获取uid和gid

[root@rac1 dev]# id grid
uid=1100(grid) gid=54321(oinstall) groups=54321(oinstall),1020(asmadmin),1021(asmdba)

multipath.conf配置

[root@rac1 dev]# vi /etc/multipath.conf
defaults {
user_friendly_names no
queue_without_daemon no
flush_on_last_del yes
max_fds max
}
blacklist {
devnode "^hd[a-z]"
devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
devnode "^cciss.*"
}
devices {
        device {
                vendor                  "OPNFILER "
                product                 "LUN"
                path_grouping_policy    group_by_prio
                features             "3 queue_if_no_path pg_init_retries 50"
                getuid_callout          "/sbin/scsi_id -g -u -s /block/%n"
                path_checker            tur
                path_selector           "round-robin 0"
                hardware_handler        "1 alua"
                failback               immediate
                rr_weight              uniform
                rr_min_io             128
        }
}
multipaths {
        multipath {
                wwid                    14f504e46494c45527049754962662d395751372d68356743  #wwid
                alias                   xifenfei128
                uid                     1100                                               #uid
                gid                     1020                                               #gid
        }
        multipath {
                wwid                    14f504e46494c4552484d486249782d464471382d354f4b58   #wwid
                alias                   xifenfei32
                uid                     1100                                                #uid
                gid                     1020                                                #gid
         }
}

启动multipath

[root@rac1 dev]# modprobe dm-multipath
[root@rac1 dev]# modprobe dm-round-robin
[root@rac1 dev]# chkconfig multipathd on
[root@rac1 dev]# service multipathd start
Starting multipathd daemon: [  OK  ]
[root@rac1 dev]# multipath -F
[root@rac1 dev]# multipath -v2
create: xifenfei128 (14f504e46494c45527049754962662d395751372d68356743) undef OPNFILER,VIRTUAL-DISK
size=125G features='0' hwhandler='0' wp=undef
`-+- policy='round-robin 0' prio=1 status=undef
  `- 3:0:0:9  sdh 8:112 undef ready  running
create: xifenfei32 (14f504e46494c4552484d486249782d464471382d354f4b58) undef OPNFILER,VIRTUAL-DISK
size=31G features='0' hwhandler='0' wp=undef
`-+- policy='round-robin 0' prio=1 status=undef
  `- 3:0:0:10 sdi 8:128 undef ready  running

查看生成多路径设备
注意设备名称、组、用户

[root@rac1 dev]# ls -l /dev/mapper/xifenfei*
brw-rw---- 1 grid asmadmin 252, 2 Jan  7 21:21 /dev/mapper/xifenfei128
brw-rw---- 1 grid asmadmin 252, 3 Jan  7 21:21 /dev/mapper/xifenfei32

补充Linux 6.x中udev设置所属组和权限
对于linux 6.x,multipath不能设置磁盘所属组和权限,可以通过udev进行实现,类似配置如下

[root@bxrac03 mapper]#cat 99-diskownership.rules
SUBSYSTEM!="block", GOTO="quickexit"
KERNEL!="dm-*", GOTO="quickexit"
PROGRAM=="/sbin/dmsetup info -c --noheadings -o name -m %m -j %M"
RESULT=="*ocr*", OWNER="grid", GROUP="oinstall", MODE="0660"
RESULT=="*oradata", OWNER="grid", GROUP="oinstall", MODE="0660"
RESULT=="*backup", OWNER="grid", GROUP="oinstall", MODE="0660"
LABEL="quickexit"

其中RESULT和dm的别名向匹配

数据库恢复检查脚本(Oracle Database Recovery Check)

Oracle Database Recovery Check 介绍
根据多年来的数据库恢复经验,提炼出来数据库恢复关键点信息收集脚本(Oracle Database Recovery Check),该脚本主要是在数据库mount状态情况下查询数据库的一些基础表信息等信息,不对数据库进行任何写操作(只做读和dump操作),不会在坏的数据库基础之上带来任何破坏,不影响任何数据库后续的恢复工作。通过该脚本收集信息能够快速定位数据库异常原因,并初步判断数据库恢复疑难程度,减少数据库异常恢复诊断时间,提供恢复效率和准确性。

Oracle Database Recovery Check下载
Oracle Database Recovery Check FOR ORACE 11G及其以前版本下载
Oracle Database Recovery Check For Linux/Uinx
Oracle Database Recovery Check For Windows
Oracle Database Recovery Check For ALL
Oracle Database Recovery Check FOR ORACE 12C下载
Oracle Database Recovery Check For Linux/Uinx
Oracle Database Recovery Check For Windows
Oracle Database Recovery Check For ALL

Oracle Database Recovery Check使用说明
1. 根据系统平台下载对应的版本,相应平台,使用unzip/winrar软件解压成sql文件
2. 上传到服务器之上(oracle用户需要有读取权限)
3. 启动数据库到mount状态
4. 在sqlplus中使用@path/check_recover_db.sql文件
5. 发送xifenfei_db_recover_YYYYMMDD.html到dba@xifenfei.com
6. 联系我:手机(17813235971)或者qq(107644445)

操作演示(所有平台,版本操作相同)
unix/liunx平台使用方法(for 12c以前版本,12c版本操作步骤完全相同)

[oracle@xifenfei ~]$ ls -l /home/oracle/check_recover_db_linux.sql
-rw-r--r-- 1 oracle oinstall 13008 Mar 30 10:36 check_recover_db_linux.sql
[oracle@xifenfei ~]$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.4.0 Production on Sun Mar 30 10:55:58 2014
Copyright (c) 1982, 2013, Oracle.  All rights reserved.
Connected to an idle instance.
SQL> startup mount
ORACLE instance started.
Total System Global Area  418484224 bytes
Fixed Size                  1385052 bytes
Variable Size             331353508 bytes
Database Buffers           79691776 bytes
Redo Buffers                6053888 bytes
Database mounted.
SQL> @/home/oracle/check_recover_db_linux.sql
+----------------------------------------------------------------------------+
|                   Oracle Database Recovery Check Result                    |
|----------------------------------------------------------------------------+
|  Copyright (c) 2012-2014 xifenfei. All rights reserved. (www.xifenfei.com) |
+----------------------------------------------------------------------------+
Please start the database to mount state.
Note: Do not modify any inspection results
Please refer to the use of the script:http://www.xifenfei.com/5056.html
To send xifenfei_db_recover_YYYYMMDD.html to dba@xifenfei.com or QQ(107644445)
-----Oracle Database Recovery Check STRAT-----
----Starting Collect Data Dictionary Information----
----Starting Collect PATCH Information----
----Starting Collect ALERT LOG Information----
----Starting DUMP file_hdrs Information----
----Starting DUMP controlf Information----
----Starting DUMP redohdr Information----
-----Oracle Database Recovery Check END-----
********************************************************************************
Please check and upload /home/oracle/xifenfei_db_recover_20140330.html
********************************************************************************
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
生成的html文件为:/home/oracle/xifenfei_db_recover_20140330.html

win平台使用方法(for 12c版本,12c以前版本操作步骤完全相同)

C:\Users\XIFENFEI>dir E:\SkyDrive\ORACLE\tools\db_recover\check_recover_db_win_12c.sql
 驱动器 E 中的卷是 本地磁盘
 卷的序列号是 000C-3B41
 E:\SkyDrive\ORACLE\tools\db_recover 的目录
2014-03-29  23:43            11,101 check_recover_db_win_12c.sql
               1 个文件         11,101 字节
               0 个目录 19,557,310,464 可用字节
C:\Users\XIFENFEI>sqlplus / as sysdba
SQL*Plus: Release 12.1.0.1.0 Production on 星期日 3月 30 11:01:37 2014
Copyright (c) 1982, 2013, Oracle.  All rights reserved.
已连接到空闲例程。
SQL> startup mount
ORACLE 例程已经启动。
Total System Global Area  400846848 bytes
Fixed Size                  2440024 bytes
Variable Size             289408168 bytes
Database Buffers          100663296 bytes
Redo Buffers                8335360 bytes
SQL> @E:\SkyDrive\ORACLE\tools\db_recover\check_recover_db_win_12c.sql
+----------------------------------------------------------------------------+
|                   Oracle Database Recovery Check Result                    |
|----------------------------------------------------------------------------+
|  Copyright (c) 2012-2014 xifenfei. All rights reserved. (www.xifenfei.com) |
+----------------------------------------------------------------------------+
Please start the database to mount state.
Note: Do not modify any inspection results
Please refer to the use of the script:http://www.xifenfei.com/5056.html
To send xifenfei_db_recover_YYYYMMDD.html to dba@xifenfei.com or QQ(107644445)
 驱动器 C 中的卷没有标签。
 卷的序列号是 D053-8FE1
 C:\Users\XIFENFEI 的目录
2014-03-30  11:02            32,796 xifenfei_db_recover_20140330.html
               1 个文件         32,796 字节
               0 个目录  8,444,190,720 可用字节
********************************************************************************
Please check and upload "xifenfei_db_recover_YYYYMMDD.html" in current directory
********************************************************************************
从 Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options 断开
生成的html文件为C:\Users\XIFENFEI\xifenfei_db_recover_20140330.html

因asm sga_target设置不当导致11gr2 rac无法正常启动

2014年第一个故障排查和解决:同事反馈给我说solaris 11.2 两节点rac无法启动,让我帮忙看下。通过分析是因为sga_target参数设置不合理导致asm无法正常启动
GI无法正常启动

grid@zwq-rpt1:~$crsctl status resource -t
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4000: Command Status failed, or completed with errors.
grid@zwq-rpt1:~$crsctl status resource -t -init
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
      1        ONLINE  OFFLINE                               Instance Shutdown
ora.cluster_interconnect.haip
      1        ONLINE  ONLINE       zwq-rpt1
ora.crf
      1        ONLINE  ONLINE       zwq-rpt1
ora.crsd
      1        ONLINE  OFFLINE
ora.cssd
      1        ONLINE  ONLINE       zwq-rpt1
ora.cssdmonitor
      1        ONLINE  ONLINE       zwq-rpt1
ora.ctssd
      1        ONLINE  ONLINE       zwq-rpt1                 ACTIVE:0
ora.diskmon
      1        OFFLINE OFFLINE
ora.evmd
      1        ONLINE  INTERMEDIATE zwq-rpt1
ora.gipcd
      1        ONLINE  ONLINE       zwq-rpt1
ora.gpnpd
      1        ONLINE  ONLINE       zwq-rpt1
ora.mdnsd
      1        ONLINE  ONLINE       zwq-rpt1

asm未正常启动

GI日志报错

2014-01-01 00:40:47.708
[cssd(1418)]CRS-1605:CSSD voting file is online: /dev/rdsk/emcpower0a; details in /export/home/app/grid/log/zwq-rpt1/cssd/ocssd.log.
2014-01-01 00:40:53.234
[cssd(1418)]CRS-1601:CSSD Reconfiguration complete. Active nodes are zwq-rpt1 zwq-rpt2 .
2014-01-01 00:40:56.659
[ctssd(1483)]CRS-2407:The new Cluster Time Synchronization Service reference node is host zwq-rpt2.
2014-01-01 00:40:56.661
[ctssd(1483)]CRS-2401:The Cluster Time Synchronization Service started on host zwq-rpt1.
2014-01-01 00:41:02.016
[ctssd(1483)]CRS-2408:The clock on host zwq-rpt1 has been updated by the Cluster Time Synchronization Service to be synchronous with the mean cluster time.
2014-01-01 00:43:23.874
[/export/home/app/grid/bin/oraagent.bin(1348)]CRS-5019:All OCR locations are on ASM disk groups [], and none of these disk groups are mounted. Details are at "(:CLSN00100:)" in "/export/home/app/grid/log/zwq-rpt1/agent/ohasd/oraagent_grid/oraagent_grid.log".
2014-01-01 00:45:42.837
[/export/home/app/grid/bin/oraagent.bin(1348)]CRS-5019:All OCR locations are on ASM disk groups [], and none of these disk groups are mounted. Details are at "(:CLSN00100:)" in "/export/home/app/grid/log/zwq-rpt1/agent/ohasd/oraagent_grid/oraagent_grid.log".
2014-01-01 00:48:02.087
[/export/home/app/grid/bin/oraagent.bin(1348)]CRS-5019:All OCR locations are on ASM disk groups [], and none of these disk groups are mounted. Details are at "(:CLSN00100:)" in "/export/home/app/grid/log/zwq-rpt1/agent/ohasd/oraagent_grid/oraagent_grid.log".
2014-01-01 00:48:18.836
[ohasd(1083)]CRS-2807:Resource 'ora.asm' failed to start automatically.
2014-01-01 00:48:18.837
[ohasd(1083)]CRS-2807:Resource 'ora.crsd' failed to start automatically.
2014-01-01 01:05:15.396
[/export/home/app/grid/bin/oraagent.bin(1348)]CRS-5019:All OCR locations are on ASM disk groups [CRSDG], and none of these disk groups are mounted. Details are at "(:CLSN00100:)" in "/export/home/app/grid/log/zwq-rpt1/agent/ohasd/oraagent_grid/oraagent_grid.log".
2014-01-01 01:05:45.101
[/export/home/app/grid/bin/oraagent.bin(1348)]CRS-5019:All OCR locations are on ASM disk groups [CRSDG], and none of these disk groups are mounted. Details are at "(:CLSN00100:)" in "/export/home/app/grid/log/zwq-rpt1/agent/ohasd/oraagent_grid/oraagent_grid.log".
2014-01-01 01:06:15.104
[/export/home/app/grid/bin/oraagent.bin(1348)]CRS-5019:All OCR locations are on ASM disk groups [CRSDG], and none of these disk groups are mounted. Details are at "(:CLSN00100:)" in "/export/home/app/grid/log/zwq-rpt1/agent/ohasd/oraagent_grid/oraagent_grid.log".

这里较为明显的看到,因为asm磁盘组异常导致ocr无法被访问导致crs无法正常启动

ORAAGENT日志

2014-01-01 00:43:23.870: [ora.asm][9] {0:0:2} [start] InstConnection::connectInt (2) Exception OCIException
2014-01-01 00:43:23.870: [ora.asm][9] {0:0:2} [start] InstConnection:connect:excp OCIException OCI error 604
2014-01-01 00:43:23.870: [ora.asm][9] {0:0:2} [start] DgpAgent::queryDgStatus excp ORA-00604: error occurred at recursive SQL level 1
ORA-04031: unable to allocate 32 bytes of shared memory ("shared pool","unknown object","KGLH0^34f764db","kglHeapInitialize:temp")

报了较为清晰的ORA-4031错误,检查asm日志

ASM日志报错

Wed Jan 01 00:47:33 2014
ORACLE_BASE not set in environment. It is recommended
that ORACLE_BASE be set in the environment
Reusing ORACLE_BASE from an earlier startup = /export/home/app/oracle
Wed Jan 01 00:47:39 2014
Errors in file /export/home/app/oracle/diag/asm/+asm/+ASM1/trace/+ASM1_ora_1728.trc  (incident=291447):
ORA-04031: unable to allocate 32 bytes of shared memory ("shared pool","unknown object","KGLH0^34f764db","kglHeapInitialize:temp")
Incident details in: /export/home/app/oracle/diag/asm/+asm/+ASM1/incident/incdir_291447/+ASM1_ora_1728_i291447.trc
Wed Jan 01 00:47:48 2014
Dumping diagnostic data in directory=[cdmp_20140101004748], requested by (instance=1, osid=1728), summary=[incident=291447].
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Wed Jan 01 00:47:53 2014
Errors in file /export/home/app/oracle/diag/asm/+asm/+ASM1/trace/+ASM1_ora_1730.trc  (incident=291448):
ORA-04031: unable to allocate 32 bytes of shared memory ("shared pool","unknown object","KGLH0^34f764db","kglHeapInitialize:temp")
Incident details in: /export/home/app/oracle/diag/asm/+asm/+ASM1/incident/incdir_291448/+ASM1_ora_1730_i291448.trc
Wed Jan 01 00:48:01 2014
Dumping diagnostic data in directory=[cdmp_20140101004801], requested by (instance=1, osid=1730), summary=[incident=291448].
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Wed Jan 01 00:48:07 2014
Errors in file /export/home/app/oracle/diag/asm/+asm/+ASM1/trace/+ASM1_ora_1732.trc  (incident=291449):
ORA-04031: unable to allocate 32 bytes of shared memory ("shared pool","unknown object","KGLH0^34f764db","kglHeapInitialize:temp")
Incident details in: /export/home/app/oracle/diag/asm/+asm/+ASM1/incident/incdir_291449/+ASM1_ora_1732_i291449.trc
Wed Jan 01 00:48:16 2014
Dumping diagnostic data in directory=[cdmp_20140101004816], requested by (instance=1, osid=1732), summary=[incident=291449].
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Wed Jan 01 00:48:16 2014
License high water mark = 1
USER (ospid: 1736): terminating the instance
Instance terminated by USER, pid = 1736

这里可以清晰的看到,因为shared pool不足,导致asm报ora-4031错误,从而使得asm无法正常启动

分析原因

Starting up:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options.
ORACLE_HOME = /export/home/app/grid
System name:	SunOS
Node name:	zwq-rpt1
Release:	5.11
Version:	11.1
Machine:	sun4v
Using parameter settings in server-side spfile +CRSDG/zwq-rpt-cluster/asmparameterfile/registry.253.823992831
System parameters with non-default values:
  sga_max_size             = 2G
  large_pool_size          = 16M
  instance_type            = "asm"
  sga_target               = 0
  remote_login_passwordfile= "EXCLUSIVE"
  asm_diskstring           = "/dev/rdsk/*"
  asm_diskgroups           = "FRADG"
  asm_diskgroups           = "DATADG"
  asm_power_limit          = 1
  diagnostic_dest          = "/export/home/app/oracle"

这里可以看到sga_target被设置为了0,而shared pool又未被配置,这里因为shared pool不足从而出现了ORA-4031,从而导致crs在启动asm的过程失败,从而使得ocr不能被访问,进而使得crs不能正常启动.

处理方法
1.编辑pfile

grid@zwq-rpt1:/export/home/app/oracle/diag/asm/+asm/+ASM1/trace$vi /tmp/asm.pfile
  memory_target = 2G
  large_pool_size          = 16M
  instance_type            = "asm"
  sga_target               = 0
  remote_login_passwordfile= "EXCLUSIVE"
  asm_diskstring           = "/dev/rdsk/*"
  asm_diskgroups           = "FRADG"
  asm_diskgroups           = "DATADG"
  asm_power_limit          = 1
  diagnostic_dest          = "/export/home/app/oracle"

2.启动asm

grid@zwq-rpt1:/export/home/app/oracle/diag/asm/+asm/+ASM1/trace$sqlplus / as sysasm
SQL*Plus: Release 11.2.0.3.0 Production on Wed Jan 1 01:04:10 2014
Copyright (c) 1982, 2011, Oracle.  All rights reserved.
Connected to an idle instance.
SQL> startup pfile='/tmp/asm.pfile'
ASM instance started
Total System Global Area 2138521600 bytes
Fixed Size                  2161024 bytes
Variable Size            2102806144 bytes
ASM Cache                  33554432 bytes
ASM diskgroups mounted

3. 创建spfile

SQL> create spfile='+CRSDG' FROM PFILE='/tmp/asm.pfile';
File created.
--asm alert日志
Wed Jan 01 01:08:59 2014
NOTE: updated gpnp profile ASM SPFILE to
NOTE: updated gpnp profile ASM diskstring: /dev/rdsk/*
NOTE: updated gpnp profile ASM diskstring: /dev/rdsk/*
NOTE: updated gpnp profile ASM SPFILE to +CRSDG/zwq-rpt-cluster/asmparameterfile/registry.253.835664939

4. 关闭asm

SQL> shutdown immediate
ORA-15097: cannot SHUTDOWN ASM instance with connected client (process 1971)
SQL> shutdown abort
ASM instance shutdown

5. 重启crs

root@zwq-rpt1:~# crsctl stop crs -f
root@zwq-rpt1:~# crsctl start crs

6. 重启其他节点crs

root@zwq-rpt2:~# crsctl stop crs -f
root@zwq-rpt2:~# crsctl start crs

7. 检查结果

root@zwq-rpt1:~# crsctl status res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.CRSDG.dg
               ONLINE  ONLINE       zwq-rpt1
               ONLINE  ONLINE       zwq-rpt2
ora.DATADG.dg
               ONLINE  ONLINE       zwq-rpt1
               ONLINE  ONLINE       zwq-rpt2
ora.FRADG.dg
               ONLINE  ONLINE       zwq-rpt1
               ONLINE  ONLINE       zwq-rpt2
ora.LISTENER.lsnr
               ONLINE  ONLINE       zwq-rpt1
               ONLINE  ONLINE       zwq-rpt2
ora.asm
               ONLINE  ONLINE       zwq-rpt1                 Started
               ONLINE  ONLINE       zwq-rpt2                 Started
ora.gsd
               OFFLINE OFFLINE      zwq-rpt1
               OFFLINE OFFLINE      zwq-rpt2
ora.net1.network
               ONLINE  ONLINE       zwq-rpt1
               ONLINE  ONLINE       zwq-rpt2
ora.ons
               ONLINE  ONLINE       zwq-rpt1
               ONLINE  ONLINE       zwq-rpt2
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       zwq-rpt1
ora.cvu
      1        ONLINE  ONLINE       zwq-rpt1
ora.oc4j
      1        ONLINE  ONLINE       zwq-rpt1
ora.rptdb.db
      1        ONLINE  ONLINE       zwq-rpt1                 Open
      2        ONLINE  ONLINE       zwq-rpt2                 Open
ora.scan1.vip
      1        ONLINE  ONLINE       zwq-rpt1
ora.zwq-rpt1.vip
      1        ONLINE  ONLINE       zwq-rpt1
ora.zwq-rpt2.vip
      1        ONLINE  ONLINE       zwq-rpt2

至此恢复正常,2014年第一个故障顺利解决

ORACLE DUL汇总

oracle数据库恢复三板斧,最大限度减少因为ORACLE不能open导致的数据损失
第一板:HIDE PARAMETER AND EVENT
第二板:BBED
第三板:DUL

当我们使用第一和第二板斧头无法解决问题之时,我们就需要考虑使用ORACLE数据库恢复终极工具DUL,这里对于dul的相关测试进行总结,便于查询
dul处理分区表
Oracle dul支持18c
关于dul有效期描述
dul恢复drop表测试
dul抽取异常asm文件
oracle dul 11 正式发布
dul 10支持oracle 11g r2
使用dul恢复asm中数据
dul恢复truncate表测试
使用 dul 挖数据文件初试
DUL挖ORACLE 8.0数据库
dul实现对数据文件内容更新
DUL10直接支持ORACLE 8.0
Oracle dul支持Oracle 12.2(12c)
最新版Oracle dul支持Oracle 7.2.3
dul 10 export_mode=true功能增强
dul实现exp dump文件转换sqlldr格式
dul支持ORACLE 12C CDB数据库恢复
dul实现expdp dump文件转换sqlldr格式
使用DUL挖数据文件恢复非数据外对象方法
dul无法加载bootstrap实现unload table/user恢复
为推进国内DUL的发展,欢迎在DUL使用过程中的问题探讨.如果在数据库恢复中需要使用dul,或者你们使用dul无法解决问题,欢迎联系我们(专业Oracle数据库恢复技术支持)协助你解决问题