物理备库在read only时报ORA-01552错误处理

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:物理备库在read only时报ORA-01552错误处理

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

物理备库在read only时报ORA-01552错误

Tue Jan 06 11:53:38 中国标准时间 2015
alter database open read only
Tue Jan 06 11:53:38 中国标准时间 2015
SMON: enabling cache recovery
Tue Jan 06 11:53:39 中国标准时间 2015
Database Characterset is ZHS16GBK
Opening with internal Resource Manager plan
replication_dependency_tracking turned off (no async multimaster replication found)
Physical standby database opened for read only access.
Completed: alter database open read only
Tue Jan 06 11:54:04 中国标准时间 2015
Errors in file c:\oracle\product\10.2.0\admin\ntsy\udump\ntsy_ora_9080.trc:
ORA-00604: 递归 SQL 级别 1 出现错误
ORA-01552: 非系统表空间 'MY_SPACE' 不能使用系统回退段
ORA-06512: 在 line 2

分析trace文件

*** ACTION NAME:() 2015-01-06 11:54:04.828
*** MODULE NAME:(sqlplus.exe) 2015-01-06 11:54:04.828
*** SERVICE NAME:(SYS$USERS) 2015-01-06 11:54:04.828
*** SESSION ID:(1284.9) 2015-01-06 11:54:04.828
Error in executing triggers on connect internal
*** 2015-01-06 11:54:04.828
ksedmp: internal or fatal error
ORA-00604: 递归 SQL 级别 1 出现错误
ORA-01552: 非系统表空间 'MY_SPACE' 不能使用系统回退段
ORA-06512: 在 line 2
*** 2015-01-06 11:54:05.843
Process diagnostic dump for ORACLE.EXE (MMNL), OS id=10492,
pid: 13, proc_ser: 1, sid: <no session>

这里可以看出来,是由于执行触发器导致该问题,根据经验第一感觉很可能是logon之类的触发器导致。

查询触发器

SQL> select trigger_name,trigger_type,OWNER from dba_triggers where owner='OP';
TRIGGER_NAME                   TRIGGER_TYPE     OWNER
------------------------------ ---------------- ------------------------------
LOGAD                          AFTER EVENT      OP
TR_TRACE_DDL                   AFTER EVENT      OP

只有这两个触发器是基于事件的,另外从名字和dba_source中确定

SQL> select text from dba_source where name='LOGAD';
TEXT
--------------------------------------------------------------------------------
TRIGGER "OP".logad after logon on database
begin
insert into logad values (SYS_CONTEXT('USERENV', 'SESSION_USER'), SYSDATE,SYS_CO
NTEXT('USERENV','IP_ADDRESS')) ;
end;
已选择6行。
SQL> select text from dba_source where name='TR_TRACE_DDL';
TEXT
--------------------------------------------------------------------------------
TRIGGER "OP".tr_trace_ddl
AFTER ddl
ON database
DECLARE
sql_text ora_name_list_t;
state_sql ddl$trace.ddl_sql%TYPE;
BEGIN
FOR i IN 1..ora_sql_txt(sql_text) LOOP
state_sql := state_sql||sql_text(i);
END LOOP;
TEXT
--------------------------------------------------------------------------------
INSERT INTO ddl$trace(login_user,audsid,machine,ipaddress,
schema_user,schema_object,ddl_time,ddl_sql)
VALUES(ora_login_user,userenv('SESSIONID'),sys_context('userenv','host'),
sys_context('userenv','ip_address'),ora_dict_obj_owner,ora_dict_obj_name,SYSDATE
,state_sql);
EXCEPTION
WHEN OTHERS THEN
-- sp_write_log('捕获DDL语句异常错误:'||SQLERRM);
null;
END tr_trace_ddl;

基本上确定LOGAD是登录触发器,tr_trace_ddl是记录ddl触发器,那现在问题应该出在LOGAD的触发器上.因为该触发器在备库上当有用户登录之时,他也会工作插入记录到logad表中,由于数据库是只读,因此就出现了类似ORA-01552错误

解决方法
在触发器中加判断数据库角色条件,当数据库为物理备库之时才执行dml操作

SQL> CREATE OR REPLACE TRIGGER "OP".logad
  2  AFTER LOGON on database
  3  declare
  4  db_role varchar2(30);
  5  begin
  6  select database_role into db_role from v$database;
  7  If db_role <> 'PHYSICAL STANDBY' then
  8  insert into op.logad values (SYS_CONTEXT('USERENV', 'SESSION_USER'),
  9  SYSDATE,SYS_CONTEXT('USERENV','IP_ADDRESS')) ;
 10  end if;
 11  end;
 12  /
Warning: Trigger created with compilation errors.
SQL> show error;
Errors for TRIGGER "OP".logad:
LINE/COL ERROR
-------- -----------------------------------------------------------------
4/1      PL/SQL: SQL Statement ignored
4/40     PL/SQL: ORA-00942: table or view does not exist
SQL> conn / as sysdba
Connected.
SQL> grant select on v_$database to op;
Grant succeeded.
SQL> CREATE OR REPLACE TRIGGER "OP".logad
  2  AFTER LOGON on database
  3  declare
  4  db_role varchar2(30);
  5  begin
  6  select database_role into db_role from v$database;
  7  If db_role <> 'PHYSICAL STANDBY' then
  8  insert into op.logad values (SYS_CONTEXT('USERENV', 'SESSION_USER'),
  9 SYSDATE,SYS_CONTEXT('USERENV','IP_ADDRESS')) ;
 10  end if;
 12  end;
 12  /
Trigger created.

数据库open正常

Completed: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE cancel
Tue Jan 06 13:51:20 中国标准时间 2015
alter database open read only
Tue Jan 06 13:51:21 中国标准时间 2015
SMON: enabling cache recovery
Tue Jan 06 13:51:21 中国标准时间 2015
Database Characterset is ZHS16GBK
Opening with internal Resource Manager plan
replication_dependency_tracking turned off (no async multimaster replication found)
Physical standby database opened for read only access.
Tue Jan 06 13:51:23 中国标准时间 2015
db_recovery_file_dest_size of 102400 MB is 0.00% used. This is a
user-specified limit on the amount of space that will be used by this
database for recovery-related files, and does not reflect the amount of
space available in the underlying filesystem or ASM diskgroup.
Tue Jan 06 13:51:23 中国标准时间 2015
Completed: alter database open read only

升级数据库到10.2.0.5遭遇ORA-00918: column ambiguously defined

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:升级数据库到10.2.0.5遭遇ORA-00918: column ambiguously defined

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

一个数据库从10201升级到10205之后,出现ORA-00918错误,查询mos发现在以前版本中是bug,Oracle好像在10205中把它修复了,结果就是以前应用的sql无法正常执行.这次升级的结果就是客户晚上3点联系开发商紧急修改程序。再次提醒:再小的系统数据库升级都需要做,功能测试,SPA测试,确保升级后功能和性能都正常.

SQL> select * from v$version;
BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bi
PL/SQL Release 10.2.0.5.0 - Production
CORE    10.2.0.5.0      Production
TNS for 64-bit Windows: Version 10.2.0.5.0 - Production
NLSRTL Version 10.2.0.5.0 - Production

执行报错ORA-00918
多个表JOIN连接,由于在select中的列未指定表名,而且该列在多个表中有,因此在10205中报ORA-00918错误,Oracle认为在以前的版本中是 Bug 5368296: SQL NOT GENERATING ORA-918 WHEN USING JOIN. 升级到10.2.0.5, 11.1.0.7 and 11.2.0.2版本,需要注意此类问题。修复bug没事,但是修复了之后导致系统需要修改sql才能够运行,确实让人很无语

SQL> set autot trace
SQL> set lines 100
SQL> SELECT   yz_id, item_code, DECODE (yzlx, 0, '长期医嘱', '临时医嘱') yzlx,
  2             item_name, gg, sl || sldw sl, zyjs, yf, a.pc, zbj, zbh,
  3             TO_CHAR (dcl, 'fm9999990.009') || dcldw dcl, a.bz, lb, zyh,ch,xm,
  4             bq, cfh, lrysdm, lrysxm, lrrq, hdrdm, hdrxm, hdrq, sender_code,
  5             sender_name, send_date, tzysdm, tzysxm, tzrq, ksrq, zxfy,
  6             lb_yp_yl, zsq_code
  7        FROM op.yz a LEFT OUTER JOIN op.pc b
  8             ON NVL (TRIM (UPPER (a.pc)), ' ') = NVL (TRIM (UPPER (b.pc)), ' ')
  9             LEFT JOIN op.zy p ON a.zyh = p.zyh
 10       WHERE p.cy='在院' AND p.new_patient='1'
 11         AND upper(nvl(p.bj,1))<> 'Y'
 12         AND (state = '已核对')
 13         AND is_in_bill IS NULL
 14    ORDER BY ksrq, yz_id ;
           bq, cfh, lrysdm, lrysxm, lrrq, hdrdm, hdrxm, hdrq, sender_code,
           *
ERROR at line 4:
ORA-00918: column ambiguously defined
SQL> select COLUMN_NAME,TABLE_NAME from DBA_tab_columns where column_name='BQ'
  2  AND TABLE_NAME IN('YZ','ZY','PC');
COLUMN_NAME                    TABLE_NAME
------------------------------ ------------------------------
BQ                             ZY
BQ                             YZ

10.2.0.1中执行正常

E:\>sqlplus / as sysdba
SQL*Plus: Release 10.2.0.1.0 - Production on 星期六 1月 3 14:09:51 2015
Copyright (c) 1982, 2005, Oracle.  All rights reserved.
连接到:
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - 64bit Production
With the Partitioning, OLAP and Data Mining options
SQL> set autot trace
SQL> set lines 100
SQL> SELECT   yz_id, item_code, DECODE (yzlx, 0, '长期医嘱', '临时医嘱') yzlx,
  2             item_name, gg, sl || sldw sl, zyjs, yf, a.pc, zbj, zbh,
  3             TO_CHAR (dcl, 'fm9999990.009') || dcldw dcl, a.bz, lb, zyh,ch,xm
,
  4             bq, cfh, lrysdm, lrysxm, lrrq, hdrdm, hdrxm, hdrq, sender_code,
  5             sender_name, send_date, tzysdm, tzysxm, tzrq, ksrq, zxfy,
  6             lb_yp_yl, zsq_code
  7        FROM op.yz a LEFT OUTER JOIN op.pc b
  8             ON NVL (TRIM (UPPER (a.pc)), ' ') = NVL (TRIM (UPPER (b.pc)), '
')
  9             LEFT JOIN op.zy p ON a.zyh = p.zyh
 10       WHERE p.cy='在院' AND p.new_patient='1'
 11         AND upper(nvl(p.bj,1))<> 'Y'
 12         AND (state = '已核对')
 13         AND is_in_bill IS NULL
 14    ORDER BY ksrq, yz_id ;
已选择19804行。
执行计划
----------------------------------------------------------
ERROR:
ORA-00604: 递归 SQL 级别 2 出现错误
ORA-16000: 打开数据库以进行只读访问
SP2-0612: 生成 AUTOTRACE EXPLAIN 报告时出错
统计信息
----------------------------------------------------------
          1  recursive calls
          0  db block gets
      41945  consistent gets
          0  physical reads
          0  redo size
    2075973  bytes sent via SQL*Net to client
      14989  bytes received via SQL*Net from client
       1322  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
      19804  rows processed

10.2.0.5库中同名列增加表名前缀执行OK


SQL> set autot trace
SQL> set lines 100
SQL> SELECT   yz_id, item_code, DECODE (yzlx, 0, '长期医嘱', '临时医嘱') yzlx,
  2             item_name, gg, sl || sldw sl, zyjs, yf, a.pc, zbj, zbh,
  3             TO_CHAR (dcl, 'fm9999990.009') || dcldw dcl, a.bz, lb,zyh,ch,xm,
  4             a.bq, cfh, lrysdm, lrysxm, lrrq, hdrdm, hdrxm, hdrq, sender_code,
  5             sender_name, send_date, tzysdm, tzysxm, tzrq, ksrq, zxfy,
  6             lb_yp_yl, zsq_code
  7        FROM op.yz a LEFT OUTER JOIN op.pc b
  8             ON NVL (TRIM (UPPER (a.pc)), ' ') = NVL (TRIM (UPPER (b.pc)), ' ')
  9             LEFT JOIN op.zy p ON a.zyh = p.zyh
 10       WHERE p.cy='在院' AND p.new_patient='1'
 11         AND upper(nvl(p.bj,1))<> 'Y'
 12         AND (state = '已核对')
 13         AND is_in_bill IS NULL
 14    ORDER BY ksrq, yz_id ;
20629 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 3468887510
--------------------------------------------------------------------------------------------
| Id  | Operation                     | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |            |    10 |  2580 |  2968   (2)| 00:00:36 |
|   1 |  SORT ORDER BY                |            |    10 |  2580 |  2968   (2)| 00:00:36 |
|*  2 |   HASH JOIN OUTER             |            |    10 |  2580 |  2967   (2)| 00:00:36 |
|*  3 |    TABLE ACCESS BY INDEX ROWID| YZ         |     3 |   672 |    42   (0)| 00:00:01 |
|   4 |     NESTED LOOPS              |            |    10 |  2390 |  2963   (2)| 00:00:36 |
|*  5 |      TABLE ACCESS FULL        | ZY         |     3 |    45 |  2917   (2)| 00:00:36 |
|*  6 |      INDEX RANGE SCAN         | DZBLYZ_ZYH |   118 |       |     2   (0)| 00:00:01 |
|   7 |    TABLE ACCESS FULL          | PC         |    33 |   627 |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access(NVL(TRIM(UPPER("A"."PC")),' ')=NVL(TRIM(UPPER("B"."PC"(+))),' '))
   3 - filter("A"."STATE"='已核对' AND "A"."IS_IN_BILL" IS NULL)
   5 - filter("P"."CY"='在院' AND UPPER(NVL("P"."BJ",'1'))<>'Y' AND
              "P"."NEW_PATIENT"='1')
   6 - access("A"."ZYH"="P"."ZYH")
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
      42121  consistent gets
          0  physical reads
          0  redo size
    2181383  bytes sent via SQL*Net to client
      15617  bytes received via SQL*Net from client
       1377  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
      20629  rows processed

Bug 5368296: SQL NOT GENERATING ORA-918 WHEN USING JOIN
Bug 12388159 : SQL REPORTING ORA00918 AFTER UPGRADE TO 10.2.0.5.0
再次提醒:再小的系统数据库升级都需要做,功能测试,SPA测试,确保升级后功能和性能都正常.

查询v$session报ORA-04031错误

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:查询v$session报ORA-04031错误

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

客户的数据库在出账期间有工具登录Oracle数据库偶尔性报ORA-04031,经过分析是因为该工具需要查询v$session,经过分析确定是Bug 12808696 – Shared pool memory leak of “hng: All sessi” memory (Doc ID 12808696.8),重现错误如下
节点1进行查询报ORA-4031

SQL> select count(*) from v$session; 
COUNT(*) 
---------- 
1536 
SQL> select count(*) from gv$session; 
COUNT(*) 
---------- 
2089 
SQL> select /*+ full(t) */ count(*) from gv$session t; 
COUNT(*) 
---------- 
2053 
SQL> select * from gv$session; 
select * from gv$session 
* 
ERROR at line 1: 
ORA-12801: error signaled in parallel query server PZ93, instance 
ocs_db_2:zjocs2 (2) 
ORA-04031: unable to allocate 308448 bytes of shared memory ("shared 
pool","unknown object","sga heap(1,0)","hng: All sessions data for API.")

节点2进行查询报ORA-04031

SQL> select * from gv$session; 
select * from gv$session 
* 
ERROR at line 1: 
ORA-12801: error signaled in parallel query server PZ95, instance
ocs_db_2:zjocs2 (2)
ORA-04031: unable to allocate 308448 bytes of shared memory ("shared
pool","unknown object","sga heap(6,0)","hng: All sessions data for API.")
SQL> select * from v$session; 
select * from v$session 
* 
ERROR at line 2:
ORA-04031: unable to allocate 308448 bytes of shared memory ("shared
pool","unknown object","sga heap(7,0)","hng: All sessions data for API.")

通过上述分析:确认是节点2的v$session遭遇到Bug 12808696,导致在该节点中中查询v$session和Gv$session报ORA-04031,而在节点1中查询v$session正常,查询Gv$session报ORA-04031.
bug-12808696
该bug在11.1.0.6中修复,所有的10g版本中未修复,只能通过临时重启来暂时避免,注意该bug通过flash shared_pool无法解决
如果您有权限可以进步一查询SR 3-7670890781: 查询v$session的BLOCKING_SESSION字段时,出现ora-04031错误

数据库启动报ORA-00704 ORA-39714错误解决

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:数据库启动报ORA-00704 ORA-39714错误解决

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

数据库启动失败,报ORA-00704、ORA-39714错误

[oracle@www.xifenfei.com ~]$ sqlplus / as sysdba
SQL*Plus: Release 12.1.0.1.0 Production on Thu Aug 7 08:15:35 2014
Copyright (c) 1982, 2013, Oracle.  All rights reserved.
Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options
SQL> startup
ORACLE instance started.
Total System Global Area  663945216 bytes
Fixed Size                  2291808 bytes
Variable Size             369100704 bytes
Database Buffers          289406976 bytes
Redo Buffers                3145728 bytes
Database mounted.
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00704: bootstrap process failure
ORA-39714: upgrade script utlmmig.sql failed
Process ID: 11592
Session ID: 1 Serial number: 5
SQL> startup upgrade
SP2-0642: SQL*Plus internal error state 2133, context 3114:0:0
Unsafe to proceed
ORA-03114: not connected to ORACLE
SQL> exit
Disconnected from Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options

alert日志报错

Thu Aug 07 07:42:25 2014
SMON: enabling cache recovery
Thu Aug 07 07:42:25 2014
Errors in file /u01/app/oracle/diag/rdbms/orcl/ORCL/trace/ORCL_ora_11592.trc:
ORA-39714: upgrade script utlmmig.sql failed
Thu Aug 07 07:42:25 2014
Errors in file /u01/app/oracle/diag/rdbms/orcl/ORCL/trace/ORCL_ora_11592.trc:
ORA-00704: bootstrap process failure
ORA-39714: upgrade script utlmmig.sql failed
Thu Aug 07 07:42:25 2014
Errors in file /u01/app/oracle/diag/rdbms/orcl/ORCL/trace/ORCL_ora_11592.trc:
ORA-00704: bootstrap process failure
ORA-39714: upgrade script utlmmig.sql failed
Thu Aug 07 07:42:25 2014
Error 704 happened during db open, shutting down database
USER (ospid: 11592): terminating the instance due to error 704

通过分析utlmmig.sql脚本知道,数据库在升级bootstrap$之前会先在props$表中插入BOOTSTRAP_UPGRADE_ERROR相关记录,数据库在启动之时会检测该值,如果发现该值存在,数据库只能以upgrade模式启动,清理掉相关记录,数据库即可正常启动

[oracle@www.xifenfei.com ~]$ sqlplus / as sysdba
SQL*Plus: Release 12.1.0.1.0 Production on Thu Aug 7 07:42:44 2014
Copyright (c) 1982, 2013, Oracle.  All rights reserved.
Connected to an idle instance.
SQL> startup upgrade
ORACLE instance started.
Total System Global Area  663945216 bytes
Fixed Size                  2291808 bytes
Variable Size             369100704 bytes
Database Buffers          289406976 bytes
Redo Buffers                3145728 bytes
Database mounted.
Database opened.
SQL>   delete from props$ where name = 'BOOTSTRAP_UPGRADE_ERROR';
1 row deleted.
SQL>   delete from props$ where name = 'LOGMNR_BOOTSTRAP_UPGRADE_ERROR';
0 rows deleted.
SQL>   commit;
Commit complete.
SQL>
SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup
ORACLE instance started.
Total System Global Area  663945216 bytes
Fixed Size                  2291808 bytes
Variable Size             369100704 bytes
Database Buffers          289406976 bytes
Redo Buffers                3145728 bytes
Database mounted.
Database opened.
SQL>

数据库虽然正常启动成功,但是由于bootstrap$对象升级失败,后续还是有很大风险,建议分析报错原因,解决原因然后继续升级bootstrap$基表

ORA-38856: cannot mark instance UNNAMED_INSTANCE_2 (redo thread 2) as enabled

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:ORA-38856: cannot mark instance UNNAMED_INSTANCE_2 (redo thread 2) as enabled

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

在昨天11.2.0.2 for Linux 数据库恢复过程中,把数据文件从asm复制到单节点机器中恢复,在resetlogs过程中报如下ORA-38856错误

SQL> alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-38856: cannot mark instance UNNAMED_INSTANCE_2 (redo thread 2) as enabled

ORA-38856 is the expected error during open database resetlogs when the set of enabled instances (redo threads) in the controlfile does not match the set of enabled instances (redo threads) in datafile checkpoint. This is expected behavior in a normal RAC restore/recover/open resetlogs situation.
这句话的意思是:数据库在resetlogs的时候发现控制文件中的redo threads和数据文件汇总的redo threads不一致,从而出现该问题.
在本次恢复中禁用了所有和thread 2相关参数,数据库依然报告错误,是因为数据库在异常恢复过程中需要读取节点2的redo信息,现在无法读取从而出现该错误.但是使用了_allow_resetlogs_corruption 之后还是报该错误,实在诡异.通过查询mos发现有类似Unpublished Bug 4355382 ORA-38856: FAILED TO OPEN DATABASE WITH RESETLOGS WHEN USING RAC BACKUP,虽然说该bug在10.2.0.3中修复,但是在异常恢复过程中,本着在风险可控的情况下,大胆尝试,继续使用_no_recovery_through_resetlogs,数据库正常resetlogs成功.
可以参考:RMAN Duplicate from RAC backup fails ORA-38856 (Doc ID 334899.1)

创建控制文件出现ORA-01565 ORA-27041 OSD-04002

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:创建控制文件出现ORA-01565 ORA-27041 OSD-04002

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

oracle 在win平台上创建控制文件可能会出现ORA-01565 ORA-27041 OSD-04002错误

C:\Users\feicheng>sqlplus  / as sysdba
SQL*Plus: Release 11.2.0.4.0 Production on 星期六 9月 13 16:20:38 2014
Copyright (c) 1982, 2013, Oracle.  All rights reserved.
连接到:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
SQL> startup nomount;
ORACLE 例程已经启动。
Total System Global Area  400846848 bytes
Fixed Size                  2281656 bytes
Variable Size             188747592 bytes
Database Buffers          201326592 bytes
Redo Buffers                8491008 bytes
SQL> CREATE CONTROLFILE REUSE DATABASE "XFF" NORESETLOGS  NOARCHIVELOG
  2      MAXLOGFILES 16
  3      MAXLOGMEMBERS 3
  4      MAXDATAFILES 100
  5      MAXINSTANCES 8
  6      MAXLOGHISTORY 292
  7  LOGFILE
  8    GROUP 1 'D:\ORACLE\ORADATA\XFF\REDO01.LOG'  SIZE 50M BLOCKSIZE 512,
  9    GROUP 2 'D:\ORACLE\ORADATA\XFF\REDO02.LOG'  SIZE 50M BLOCKSIZE 512,
 10    GROUP 3 'D:\ORACLE\ORADATA\XFF\REDO03.LOG'  SIZE 50M BLOCKSIZE 512
 11  DATAFILE
 12    'D:\ORACLE\ORADATA\XFF\SYSTEM01.DBF',
 13    'D:\ORACLE\ORADATA\XFF\SYSAUX01.DBF',
 14    'D:\ORACLE\ORADATA\XFF\UNDOTBS01.DBF',
 15    'D:\惜分飞\USERS01.DBF'
 16  CHARACTER SET ZHS16GBK
 17  ;
CREATE CONTROLFILE REUSE DATABASE "XFF" NORESETLOGS  NOARCHIVELOG
*
第 1 行出现错误:
ORA-01503: CREATE CONTROLFILE ??
ORA-01565: ???? 'D:\???\USERS01.DBF' ???
ORA-27041: ??????
OSD-04002: ????????????
O/S-Error: (OS 123) ????????????????????????????????

alert日志对应错误提示为

Sat Sep 13 16:27:48 2014
starting up 1 dispatcher(s) for network address '(ADDRESS=(PARTIAL=YES)(PROTOCOL=TCP))'...
starting up 1 shared server(s) ...
ORACLE_BASE from environment = D:\oracle
Sat Sep 13 16:28:11 2014
CREATE CONTROLFILE REUSE DATABASE "XFF" NORESETLOGS  NOARCHIVELOG
    MAXLOGFILES 16
    MAXLOGMEMBERS 3
    MAXDATAFILES 100
    MAXINSTANCES 8
    MAXLOGHISTORY 292
LOGFILE
  GROUP 1 'D:\ORACLE\ORADATA\XFF\REDO01.LOG'  SIZE 50M BLOCKSIZE 512,
  GROUP 2 'D:\ORACLE\ORADATA\XFF\REDO02.LOG'  SIZE 50M BLOCKSIZE 512,
  GROUP 3 'D:\ORACLE\ORADATA\XFF\REDO03.LOG'  SIZE 50M BLOCKSIZE 512
DATAFILE
  'D:\ORACLE\ORADATA\XFF\SYSTEM01.DBF',
  'D:\ORACLE\ORADATA\XFF\SYSAUX01.DBF',
  'D:\ORACLE\ORADATA\XFF\UNDOTBS01.DBF',
  'D:\???\USERS01.DBF'
CHARACTER SET ZHS16GBK
WARNING: Default Temporary Tablespace not specified in CREATE DATABASE command
Default Temporary Tablespace will be necessary for a locally managed database in future release
Errors in file D:\ORACLE\diag\rdbms\xff\xff\trace\xff_ora_8136.trc:
ORA-01565: ???? 'D:\???\USERS01.DBF' ???
ORA-27041: ??????
OSD-04002: 无法打开文件
O/S-Error: (OS 123) 文件名、目录名或卷标语法不正确。
ORA-1503 signalled during: CREATE CONTROLFILE REUSE DATABASE "XFF" NORESETLOGS  NOARCHIVELOG
    MAXLOGFILES 16
    MAXLOGMEMBERS 3
    MAXDATAFILES 100
    MAXINSTANCES 8
    MAXLOGHISTORY 292
LOGFILE
  GROUP 1 'D:\ORACLE\ORADATA\XFF\REDO01.LOG'  SIZE 50M BLOCKSIZE 512,
  GROUP 2 'D:\ORACLE\ORADATA\XFF\REDO02.LOG'  SIZE 50M BLOCKSIZE 512,
  GROUP 3 'D:\ORACLE\ORADATA\XFF\REDO03.LOG'  SIZE 50M BLOCKSIZE 512
DATAFILE
  'D:\ORACLE\ORADATA\XFF\SYSTEM01.DBF',
  'D:\ORACLE\ORADATA\XFF\SYSAUX01.DBF',
  'D:\ORACLE\ORADATA\XFF\UNDOTBS01.DBF',
  'D:\???\USERS01.DBF'
CHARACTER SET ZHS16GBK

ORA-01565 ORA-27041 OSD-04002的含义大致为:在创建控制文件的时候,有数据文件无法不存在.
另外在alert日志里面也可以看到,sqlplus中的”D:\惜分飞\USERS01.DBF”变为了”D:\???\USERS01.DBF”导致无法定位到数据文件,从而在创建数据文件之时出现ORA-01565 ORA-27041 OSD-04002错误.
解决放方法:
1.创建控制文件语句中不含中文

C:\Users\feicheng>sqlplus / as sysdba
SQL*Plus: Release 11.2.0.4.0 Production on 星期六 9月 13 16:32:09 2014
Copyright (c) 1982, 2013, Oracle.  All rights reserved.
已连接到空闲例程。
SQL> STARTUP NOMOUNT
ORACLE 例程已经启动。
Total System Global Area  400846848 bytes
Fixed Size                  2281656 bytes
Variable Size             188747592 bytes
Database Buffers          201326592 bytes
Redo Buffers                8491008 bytes
SQL> CREATE CONTROLFILE REUSE DATABASE "XFF" NORESETLOGS  NOARCHIVELOG
  2      MAXLOGFILES 16
  3      MAXLOGMEMBERS 3
  4      MAXDATAFILES 100
  5      MAXINSTANCES 8
  6      MAXLOGHISTORY 292
  7  LOGFILE
  8    GROUP 1 'D:\ORACLE\ORADATA\XFF\REDO01.LOG'  SIZE 50M BLOCKSIZE 512,
  9    GROUP 2 'D:\ORACLE\ORADATA\XFF\REDO02.LOG'  SIZE 50M BLOCKSIZE 512,
 10    GROUP 3 'D:\ORACLE\ORADATA\XFF\REDO03.LOG'  SIZE 50M BLOCKSIZE 512
 11  DATAFILE
 12    'D:\ORACLE\ORADATA\XFF\SYSTEM01.DBF',
 13    'D:\ORACLE\ORADATA\XFF\SYSAUX01.DBF',
 14    'D:\ORACLE\ORADATA\XFF\UNDOTBS01.DBF',
 15    'D:\xifenfei\USERS01.DBF'
 16  CHARACTER SET ZHS16GBK
 17  ;
控制文件已创建。

2.设置nls_lang为american_america.ZHS16GBK

C:\Users\feicheng>set NLS_LANG=american_america.ZHS16GBK
C:\Users\feicheng>sqlplus  / as sysdba
SQL*Plus: Release 11.2.0.4.0 Production on Sat Sep 13 16:29:04 2014
Copyright (c) 1982, 2013, Oracle.  All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
SQL> CREATE CONTROLFILE REUSE DATABASE "XFF" NORESETLOGS  NOARCHIVELOG
  2      MAXLOGFILES 16
  3      MAXLOGMEMBERS 3
  4      MAXDATAFILES 100
  5      MAXINSTANCES 8
  6      MAXLOGHISTORY 292
  7  LOGFILE
  8    GROUP 1 'D:\ORACLE\ORADATA\XFF\REDO01.LOG'  SIZE 50M BLOCKSIZE 512,
  9    GROUP 2 'D:\ORACLE\ORADATA\XFF\REDO02.LOG'  SIZE 50M BLOCKSIZE 512,
 10    GROUP 3 'D:\ORACLE\ORADATA\XFF\REDO03.LOG'  SIZE 50M BLOCKSIZE 512
 11  DATAFILE
 12    'D:\ORACLE\ORADATA\XFF\SYSTEM01.DBF',
 13    'D:\ORACLE\ORADATA\XFF\SYSAUX01.DBF',
 14    'D:\ORACLE\ORADATA\XFF\UNDOTBS01.DBF',
 15    'D:\惜分飞\USERS01.DBF'
 16  CHARACTER SET ZHS16GBK
 17  ;
Control file created.

此时alert日志提示

Sat Sep 13 16:29:22 2014
CREATE CONTROLFILE REUSE DATABASE "XFF" NORESETLOGS  NOARCHIVELOG
    MAXLOGFILES 16
    MAXLOGMEMBERS 3
    MAXDATAFILES 100
    MAXINSTANCES 8
    MAXLOGHISTORY 292
LOGFILE
  GROUP 1 'D:\ORACLE\ORADATA\XFF\REDO01.LOG'  SIZE 50M BLOCKSIZE 512,
  GROUP 2 'D:\ORACLE\ORADATA\XFF\REDO02.LOG'  SIZE 50M BLOCKSIZE 512,
  GROUP 3 'D:\ORACLE\ORADATA\XFF\REDO03.LOG'  SIZE 50M BLOCKSIZE 512
DATAFILE
  'D:\ORACLE\ORADATA\XFF\SYSTEM01.DBF',
  'D:\ORACLE\ORADATA\XFF\SYSAUX01.DBF',
  'D:\ORACLE\ORADATA\XFF\UNDOTBS01.DBF',
  'D:\惜分飞\USERS01.DBF'
CHARACTER SET ZHS16GBK
WARNING: Default Temporary Tablespace not specified in CREATE DATABASE command
Default Temporary Tablespace will be necessary for a locally managed database in future release
Sat Sep 13 16:29:25 2014
Successful mount of redo thread 1, with mount id 3507744098
Completed: CREATE CONTROLFILE REUSE DATABASE "XFF" NORESETLOGS  NOARCHIVELOG
    MAXLOGFILES 16
    MAXLOGMEMBERS 3
    MAXDATAFILES 100
    MAXINSTANCES 8
    MAXLOGHISTORY 292
LOGFILE
  GROUP 1 'D:\ORACLE\ORADATA\XFF\REDO01.LOG'  SIZE 50M BLOCKSIZE 512,
  GROUP 2 'D:\ORACLE\ORADATA\XFF\REDO02.LOG'  SIZE 50M BLOCKSIZE 512,
  GROUP 3 'D:\ORACLE\ORADATA\XFF\REDO03.LOG'  SIZE 50M BLOCKSIZE 512
DATAFILE
  'D:\ORACLE\ORADATA\XFF\SYSTEM01.DBF',
  'D:\ORACLE\ORADATA\XFF\SYSAUX01.DBF',
  'D:\ORACLE\ORADATA\XFF\UNDOTBS01.DBF',
  'D:\惜分飞\USERS01.DBF'
CHARACTER SET ZHS16GBK

通过此实验简单说明:在oracle使用该过程中,尽可能少用中文路径或者文件名

ORACLE 12C ORA-07445[ktuHistRecUsegCrtMain()+1173]恢复

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:ORACLE 12C ORA-07445[ktuHistRecUsegCrtMain()+1173]恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

在ORACLE 12C的使用中,遇到ORA-07445: 出现异常错误: 核心转储 [ktuHistRecUsegCrtMain()+1173]的错误,通过10046分析是在查询cdef$表的时候异常,导致数据库无法正常open,通过recover database,然后open成功,怀疑是oracle 12C bug,但是mos中未查询到相关记录,后续不知道是否会出现相关文档说明该问题.
数据库版本

BANNER                                                                               CON_ID
-------------------------------------------------------------------------------- ----------
Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production              0
PL/SQL Release 12.1.0.1.0 - Production                                                    0
CORE    12.1.0.1.0      Production                                                        0
TNS for 64-bit Windows: Version 12.1.0.1.0 - Production                                   0
NLSRTL Version 12.1.0.1.0 - Production                                                    0

oracle 12c启动报错

idle> startup
ORACLE 例程已经启动。
Total System Global Area  400846848 bytes
Fixed Size                  2440024 bytes
Variable Size             281019560 bytes
Database Buffers          109051904 bytes
Redo Buffers                8335360 bytes
数据库装载完毕。
ORA-03113: 通信通道的文件结尾
进程 ID: 5972
会话 ID: 242 序列号: 3

alert日志报ORA-07445: 出现异常错误: 核心转储 [ktuHistRecUsegCrtMain()+1173]错

Sun Mar 30 22:35:22 2014
SMON: enabling cache recovery
Sun Mar 30 22:35:23 2014
Exception [type: ACCESS_VIOLATION, UNABLE_TO_READ] [ADDR:0x6060] [PC:0x140808585, ktuHistRecUsegCrtMain()+1173]
Errors in file E:\APP\XIFENFEI\diag\rdbms\xifenfei\xifenfei\trace\xifenfei_ora_5192.trc  (incident=45790):
ORA-07445: 出现异常错误: 核心转储 [ktuHistRecUsegCrtMain()+1173] [ACCESS_VIOLATION] [ADDR:0x6060] [PC:0x140808585] [UNABLE_TO_READ] []
Incident details in: E:\APP\XIFENFEI\diag\rdbms\xifenfei\xifenfei\incident\incdir_45790\xifenfei_ora_5192_i45790.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Sun Mar 30 22:35:28 2014
Instance Critical Process (pid: 22, ospid: 5192) died unexpectedly
PMON (ospid: 8420): terminating the instance due to error 12752

恢复数据库

idle> recover database;
完成介质恢复。
idle> alter database open;
数据库已更改。

ORA-00600[kcrf_resilver_log_1]异常恢复

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:ORA-00600[kcrf_resilver_log_1]异常恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

朋友在win x64位上的ORACLE 11.2.0.1启动出现ORA-00600[kcrf_resilver_log_1],让我帮忙看看,通过分析主要是因为Unpblished Bug 9056657导致
数据库启动报错
数据库在open的时候报ORA-00600[kcrf_resilver_log_1]

SQL> alter database open;
alter database open
*
第 1 行出现错误:
ORA-00600: 内部错误代码, 参数: [kcrf_resilver_log_1], [0x7FF61C56E30], [2], [],
[], [], [], [], [], [], [], []

alert日志报错

Sat Mar 01 18:40:44 2014
alter database open
Beginning crash recovery of 1 threads
 parallel recovery started with 3 processes
Started redo scan
Errors in file f:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_ora_6432.trc  (incident=61360):
ORA-00600: 内部错误代码, 参数: [kcrf_resilver_log_1], [0x7FF61C56E30], [2], [], [], [], [], [], [], [], [], []
Incident details in: f:\app\administrator\diag\rdbms\orcl\orcl\incident\incdir_61360\orcl_ora_6432_i61360.trc
Aborting crash recovery due to error 600
Errors in file f:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_ora_6432.trc:
ORA-00600: 内部错误代码, 参数: [kcrf_resilver_log_1], [0x7FF61C56E30], [2], [], [], [], [], [], [], [], [], []
Errors in file f:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_ora_6432.trc:
ORA-00600: 内部错误代码, 参数: [kcrf_resilver_log_1], [0x7FF61C56E30], [2], [], [], [], [], [], [], [], [], []
ORA-600 signalled during: alter database open...

分先相关SCN
控制文件scn
1
控制文件中数据文件scn
2
数据文件头scn3
通过这里可以知道,数据文件头的scn,控制文件中关于数据文件的scn都表明数据库为正常关闭,且scn值为16574746,但是控制文件中记录数据库SCN的值为16551515,可以判断数据库因为某种原因导致控制文件中的部分scn记录异常.

处理方法
因为控制文件SCN异常,考虑直接重建控制文件或者using backup controlfile方式恢复

SQL> select group#,status,sequence# from v$log;
    GROUP# STATUS            SEQUENCE#
---------- ---------------- ----------
         1 CURRENT                1510
         3 ACTIVE                 1509
         2 ACTIVE                 1508
    GROUP# MEMBER
---------- --------------------------------------------------
         3 F:\APP\ADMINISTRATOR\ORADATA\ORCL\REDO03.LOG
         2 F:\APP\ADMINISTRATOR\ORADATA\ORCL\REDO02.LOG
         1 F:\APP\ADMINISTRATOR\ORADATA\ORCL\REDO01.LOG
SQL> recover database using backup controlfile until cancel;
ORA-00279: 更改 16574746 (在 03/01/2014 13:10:11 生成) 对于线程 1 是必需的
ORA-00289: 建议: F:\APP\ADMINISTRATOR\FLASH_RECOVERY_AREA\ORCL\ARCHIVELOG\2014_0
3_01\O1_MF_1_1510_%U_.ARC
ORA-00280: 更改 16574746 (用于线程 1) 在序列 #1510 中
指定日志: {<RET>=suggested | filename | AUTO | CANCEL}
F:\APP\ADMINISTRATOR\ORADATA\ORCL\REDO01.LOG
已应用的日志。
完成介质恢复。
SQL> alter database open resetlogs;
数据库已更改。

在最近的同样的错误,但是没有如此的幸运具体参考:记录一次ORA-00600 [kcrf_resilver_log_1] 恢复过程

因asm sga_target设置不当导致11gr2 rac无法正常启动

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:因asm sga_target设置不当导致11gr2 rac无法正常启动

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

2014年第一个故障排查和解决:同事反馈给我说solaris 11.2 两节点rac无法启动,让我帮忙看下。通过分析是因为sga_target参数设置不合理导致asm无法正常启动
GI无法正常启动

grid@zwq-rpt1:~$crsctl status resource -t
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4000: Command Status failed, or completed with errors.
grid@zwq-rpt1:~$crsctl status resource -t -init
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
      1        ONLINE  OFFLINE                               Instance Shutdown
ora.cluster_interconnect.haip
      1        ONLINE  ONLINE       zwq-rpt1
ora.crf
      1        ONLINE  ONLINE       zwq-rpt1
ora.crsd
      1        ONLINE  OFFLINE
ora.cssd
      1        ONLINE  ONLINE       zwq-rpt1
ora.cssdmonitor
      1        ONLINE  ONLINE       zwq-rpt1
ora.ctssd
      1        ONLINE  ONLINE       zwq-rpt1                 ACTIVE:0
ora.diskmon
      1        OFFLINE OFFLINE
ora.evmd
      1        ONLINE  INTERMEDIATE zwq-rpt1
ora.gipcd
      1        ONLINE  ONLINE       zwq-rpt1
ora.gpnpd
      1        ONLINE  ONLINE       zwq-rpt1
ora.mdnsd
      1        ONLINE  ONLINE       zwq-rpt1

asm未正常启动

GI日志报错

2014-01-01 00:40:47.708
[cssd(1418)]CRS-1605:CSSD voting file is online: /dev/rdsk/emcpower0a; details in /export/home/app/grid/log/zwq-rpt1/cssd/ocssd.log.
2014-01-01 00:40:53.234
[cssd(1418)]CRS-1601:CSSD Reconfiguration complete. Active nodes are zwq-rpt1 zwq-rpt2 .
2014-01-01 00:40:56.659
[ctssd(1483)]CRS-2407:The new Cluster Time Synchronization Service reference node is host zwq-rpt2.
2014-01-01 00:40:56.661
[ctssd(1483)]CRS-2401:The Cluster Time Synchronization Service started on host zwq-rpt1.
2014-01-01 00:41:02.016
[ctssd(1483)]CRS-2408:The clock on host zwq-rpt1 has been updated by the Cluster Time Synchronization Service to be synchronous with the mean cluster time.
2014-01-01 00:43:23.874
[/export/home/app/grid/bin/oraagent.bin(1348)]CRS-5019:All OCR locations are on ASM disk groups [], and none of these disk groups are mounted. Details are at "(:CLSN00100:)" in "/export/home/app/grid/log/zwq-rpt1/agent/ohasd/oraagent_grid/oraagent_grid.log".
2014-01-01 00:45:42.837
[/export/home/app/grid/bin/oraagent.bin(1348)]CRS-5019:All OCR locations are on ASM disk groups [], and none of these disk groups are mounted. Details are at "(:CLSN00100:)" in "/export/home/app/grid/log/zwq-rpt1/agent/ohasd/oraagent_grid/oraagent_grid.log".
2014-01-01 00:48:02.087
[/export/home/app/grid/bin/oraagent.bin(1348)]CRS-5019:All OCR locations are on ASM disk groups [], and none of these disk groups are mounted. Details are at "(:CLSN00100:)" in "/export/home/app/grid/log/zwq-rpt1/agent/ohasd/oraagent_grid/oraagent_grid.log".
2014-01-01 00:48:18.836
[ohasd(1083)]CRS-2807:Resource 'ora.asm' failed to start automatically.
2014-01-01 00:48:18.837
[ohasd(1083)]CRS-2807:Resource 'ora.crsd' failed to start automatically.
2014-01-01 01:05:15.396
[/export/home/app/grid/bin/oraagent.bin(1348)]CRS-5019:All OCR locations are on ASM disk groups [CRSDG], and none of these disk groups are mounted. Details are at "(:CLSN00100:)" in "/export/home/app/grid/log/zwq-rpt1/agent/ohasd/oraagent_grid/oraagent_grid.log".
2014-01-01 01:05:45.101
[/export/home/app/grid/bin/oraagent.bin(1348)]CRS-5019:All OCR locations are on ASM disk groups [CRSDG], and none of these disk groups are mounted. Details are at "(:CLSN00100:)" in "/export/home/app/grid/log/zwq-rpt1/agent/ohasd/oraagent_grid/oraagent_grid.log".
2014-01-01 01:06:15.104
[/export/home/app/grid/bin/oraagent.bin(1348)]CRS-5019:All OCR locations are on ASM disk groups [CRSDG], and none of these disk groups are mounted. Details are at "(:CLSN00100:)" in "/export/home/app/grid/log/zwq-rpt1/agent/ohasd/oraagent_grid/oraagent_grid.log".

这里较为明显的看到,因为asm磁盘组异常导致ocr无法被访问导致crs无法正常启动

ORAAGENT日志

2014-01-01 00:43:23.870: [ora.asm][9] {0:0:2} [start] InstConnection::connectInt (2) Exception OCIException
2014-01-01 00:43:23.870: [ora.asm][9] {0:0:2} [start] InstConnection:connect:excp OCIException OCI error 604
2014-01-01 00:43:23.870: [ora.asm][9] {0:0:2} [start] DgpAgent::queryDgStatus excp ORA-00604: error occurred at recursive SQL level 1
ORA-04031: unable to allocate 32 bytes of shared memory ("shared pool","unknown object","KGLH0^34f764db","kglHeapInitialize:temp")

报了较为清晰的ORA-4031错误,检查asm日志

ASM日志报错

Wed Jan 01 00:47:33 2014
ORACLE_BASE not set in environment. It is recommended
that ORACLE_BASE be set in the environment
Reusing ORACLE_BASE from an earlier startup = /export/home/app/oracle
Wed Jan 01 00:47:39 2014
Errors in file /export/home/app/oracle/diag/asm/+asm/+ASM1/trace/+ASM1_ora_1728.trc  (incident=291447):
ORA-04031: unable to allocate 32 bytes of shared memory ("shared pool","unknown object","KGLH0^34f764db","kglHeapInitialize:temp")
Incident details in: /export/home/app/oracle/diag/asm/+asm/+ASM1/incident/incdir_291447/+ASM1_ora_1728_i291447.trc
Wed Jan 01 00:47:48 2014
Dumping diagnostic data in directory=[cdmp_20140101004748], requested by (instance=1, osid=1728), summary=[incident=291447].
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Wed Jan 01 00:47:53 2014
Errors in file /export/home/app/oracle/diag/asm/+asm/+ASM1/trace/+ASM1_ora_1730.trc  (incident=291448):
ORA-04031: unable to allocate 32 bytes of shared memory ("shared pool","unknown object","KGLH0^34f764db","kglHeapInitialize:temp")
Incident details in: /export/home/app/oracle/diag/asm/+asm/+ASM1/incident/incdir_291448/+ASM1_ora_1730_i291448.trc
Wed Jan 01 00:48:01 2014
Dumping diagnostic data in directory=[cdmp_20140101004801], requested by (instance=1, osid=1730), summary=[incident=291448].
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Wed Jan 01 00:48:07 2014
Errors in file /export/home/app/oracle/diag/asm/+asm/+ASM1/trace/+ASM1_ora_1732.trc  (incident=291449):
ORA-04031: unable to allocate 32 bytes of shared memory ("shared pool","unknown object","KGLH0^34f764db","kglHeapInitialize:temp")
Incident details in: /export/home/app/oracle/diag/asm/+asm/+ASM1/incident/incdir_291449/+ASM1_ora_1732_i291449.trc
Wed Jan 01 00:48:16 2014
Dumping diagnostic data in directory=[cdmp_20140101004816], requested by (instance=1, osid=1732), summary=[incident=291449].
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Wed Jan 01 00:48:16 2014
License high water mark = 1
USER (ospid: 1736): terminating the instance
Instance terminated by USER, pid = 1736

这里可以清晰的看到,因为shared pool不足,导致asm报ora-4031错误,从而使得asm无法正常启动

分析原因

Starting up:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options.
ORACLE_HOME = /export/home/app/grid
System name:	SunOS
Node name:	zwq-rpt1
Release:	5.11
Version:	11.1
Machine:	sun4v
Using parameter settings in server-side spfile +CRSDG/zwq-rpt-cluster/asmparameterfile/registry.253.823992831
System parameters with non-default values:
  sga_max_size             = 2G
  large_pool_size          = 16M
  instance_type            = "asm"
  sga_target               = 0
  remote_login_passwordfile= "EXCLUSIVE"
  asm_diskstring           = "/dev/rdsk/*"
  asm_diskgroups           = "FRADG"
  asm_diskgroups           = "DATADG"
  asm_power_limit          = 1
  diagnostic_dest          = "/export/home/app/oracle"

这里可以看到sga_target被设置为了0,而shared pool又未被配置,这里因为shared pool不足从而出现了ORA-4031,从而导致crs在启动asm的过程失败,从而使得ocr不能被访问,进而使得crs不能正常启动.

处理方法
1.编辑pfile

grid@zwq-rpt1:/export/home/app/oracle/diag/asm/+asm/+ASM1/trace$vi /tmp/asm.pfile
  memory_target = 2G
  large_pool_size          = 16M
  instance_type            = "asm"
  sga_target               = 0
  remote_login_passwordfile= "EXCLUSIVE"
  asm_diskstring           = "/dev/rdsk/*"
  asm_diskgroups           = "FRADG"
  asm_diskgroups           = "DATADG"
  asm_power_limit          = 1
  diagnostic_dest          = "/export/home/app/oracle"

2.启动asm

grid@zwq-rpt1:/export/home/app/oracle/diag/asm/+asm/+ASM1/trace$sqlplus / as sysasm
SQL*Plus: Release 11.2.0.3.0 Production on Wed Jan 1 01:04:10 2014
Copyright (c) 1982, 2011, Oracle.  All rights reserved.
Connected to an idle instance.
SQL> startup pfile='/tmp/asm.pfile'
ASM instance started
Total System Global Area 2138521600 bytes
Fixed Size                  2161024 bytes
Variable Size            2102806144 bytes
ASM Cache                  33554432 bytes
ASM diskgroups mounted

3. 创建spfile

SQL> create spfile='+CRSDG' FROM PFILE='/tmp/asm.pfile';
File created.
--asm alert日志
Wed Jan 01 01:08:59 2014
NOTE: updated gpnp profile ASM SPFILE to
NOTE: updated gpnp profile ASM diskstring: /dev/rdsk/*
NOTE: updated gpnp profile ASM diskstring: /dev/rdsk/*
NOTE: updated gpnp profile ASM SPFILE to +CRSDG/zwq-rpt-cluster/asmparameterfile/registry.253.835664939

4. 关闭asm

SQL> shutdown immediate
ORA-15097: cannot SHUTDOWN ASM instance with connected client (process 1971)
SQL> shutdown abort
ASM instance shutdown

5. 重启crs

root@zwq-rpt1:~# crsctl stop crs -f
root@zwq-rpt1:~# crsctl start crs

6. 重启其他节点crs

root@zwq-rpt2:~# crsctl stop crs -f
root@zwq-rpt2:~# crsctl start crs

7. 检查结果

root@zwq-rpt1:~# crsctl status res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.CRSDG.dg
               ONLINE  ONLINE       zwq-rpt1
               ONLINE  ONLINE       zwq-rpt2
ora.DATADG.dg
               ONLINE  ONLINE       zwq-rpt1
               ONLINE  ONLINE       zwq-rpt2
ora.FRADG.dg
               ONLINE  ONLINE       zwq-rpt1
               ONLINE  ONLINE       zwq-rpt2
ora.LISTENER.lsnr
               ONLINE  ONLINE       zwq-rpt1
               ONLINE  ONLINE       zwq-rpt2
ora.asm
               ONLINE  ONLINE       zwq-rpt1                 Started
               ONLINE  ONLINE       zwq-rpt2                 Started
ora.gsd
               OFFLINE OFFLINE      zwq-rpt1
               OFFLINE OFFLINE      zwq-rpt2
ora.net1.network
               ONLINE  ONLINE       zwq-rpt1
               ONLINE  ONLINE       zwq-rpt2
ora.ons
               ONLINE  ONLINE       zwq-rpt1
               ONLINE  ONLINE       zwq-rpt2
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       zwq-rpt1
ora.cvu
      1        ONLINE  ONLINE       zwq-rpt1
ora.oc4j
      1        ONLINE  ONLINE       zwq-rpt1
ora.rptdb.db
      1        ONLINE  ONLINE       zwq-rpt1                 Open
      2        ONLINE  ONLINE       zwq-rpt2                 Open
ora.scan1.vip
      1        ONLINE  ONLINE       zwq-rpt1
ora.zwq-rpt1.vip
      1        ONLINE  ONLINE       zwq-rpt1
ora.zwq-rpt2.vip
      1        ONLINE  ONLINE       zwq-rpt2

至此恢复正常,2014年第一个故障顺利解决

因RAC的undo_management参数不一致导致数据库mount报ORA-01105 ORA-01606

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:因RAC的undo_management参数不一致导致数据库mount报ORA-01105 ORA-01606

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

环境Linux 5.8 10.2.0.5 RAC,两个节点只能一个节点mount,如果尝试mount另外节点就报ORA-01105和ORA-01606错误
数据库版本

SQL> select * from v$version;
BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bi
PL/SQL Release 10.2.0.5.0 - Production
CORE    10.2.0.5.0      Production
TNS for Linux: Version 10.2.0.5.0 - Production
NLSRTL Version 10.2.0.5.0 - Production

crs资源情况

[oracle@node1 dbs]$ $ORA_CRS_HOME/bin/crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora....D1.inst application    OFFLINE   OFFLINE
ora....D2.inst application    ONLINE    ONLINE    node2
ora.PROD.db    application    ONLINE    ONLINE    node2
ora....SM1.asm application    ONLINE    ONLINE    node1
ora....E1.lsnr application    ONLINE    ONLINE    node1
ora.node1.gsd  application    ONLINE    ONLINE    node1
ora.node1.ons  application    ONLINE    ONLINE    node1
ora.node1.vip  application    ONLINE    ONLINE    node1
ora....SM2.asm application    ONLINE    ONLINE    node2
ora....E2.lsnr application    ONLINE    ONLINE    node2
ora.node2.gsd  application    ONLINE    ONLINE    node2
ora.node2.ons  application    ONLINE    ONLINE    node2
ora.node2.vip  application    ONLINE    ONLINE    node2

节点1 mount报错

SQL> startup
ORACLE instance started.
Total System Global Area  171966464 bytes
Fixed Size                  2094832 bytes
Variable Size             113248528 bytes
Database Buffers           50331648 bytes
Redo Buffers                6291456 bytes
ORA-01105: mount is incompatible with mounts by other instances
ORA-01606: gc_files_to_locks not identical to that of another mounted instance
Error:    ORA 1105
Text:     mount is incompatible with mounts by other instances
-------------------------------------------------------------------------------
Cause:  An attempt was made to mount the database, but another instance has already mounted
        a database by the same name, and the mounts are not compatible.
        dditional messages will accompany this message to report why the mounts are incompatible.
Action:  See the accompanying messages for the appropriate action to take.
Error:  ORA 1606
Text:   GC_FILES_TO_LOCKS not identical to that of another mounted instance
-------------------------------------------------------------------------------
Cause:  The initialization parameter GC_FILES_TO_LOCKS is not the same as
        another instance mounted in parallel mode.
        This parameter must be the same as that for all shared instances.
Action: Modify the parameter to be compatible with the other instances, then
        shut down and restart the instance.

根据这个错误提示,查询两个节点的gc_files_to_locks参数,均为空值(默认值),也就是值相同

SQL> show parameter gc_files_to_locks;
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
gc_files_to_locks                    string

检查两个节点的gc相关隐含参数,发现所有值也均一致

NAME                           DESCRIPTION                                                           VALUE
------------------------------ --------------------------------------------------------------------- -------
_gc_affinity_limit             dynamic affinity limit                                                50
_gc_affinity_minimum           dynamic affinity minimum activity per minute                          6000
_gc_affinity_time              if non zero, enable dynamic object affinity                           10
_gc_async_memcpy               if TRUE, use async memcpy                                             FALSE
_gc_check_bscn                 if TRUE, check for stale blocks                                       TRUE
_gc_coalesce_recovery_reads    if TRUE, coalesce recovery reads                                      TRUE
_gc_defer_time                 how long to defer down converts for hot buffers                       3
_gc_dissolve_undo_affinity     if TRUE, dissolve undo affinity after an offline                      FALSE
_gc_dynamic_affinity_locks     if TRUE, get dynamic affinity locks                                   TRUE
_gc_element_percent            global cache element percent                                          103
_gc_global_lru                 turn global lru off, make it automatic, or turn it on                 AUTO
_gc_initiate_undo_affinity     if TRUE, initiate undo affinity after an online                       TRUE
_gc_integrity_checks           set the integrity check level                                         1
_gc_keep_recovery_buffers      if TRUE, make recovery buffers current                                TRUE
_gc_latches                    number of latches per LMS process                                     8
_gc_maximum_bids               maximum number of bids which can be prepared                          0
_gcs_fast_reconfig             if TRUE, enable fast reconfiguration for gcs locks                    TRUE
_gcs_latches                   number of gcs resource hash latches to be allocated per LMS process   64
_gcs_pkey_history              number of pkey remastering history                                    4000
_gcs_process_in_recovery       if TRUE, process gcs requests during instance recovery                TRUE
_gcs_resources                 number of gcs resources to be allocated
_gcs_shadow_locks              number of pcm shadow locks to be allocated
_gc_statistics                 if TRUE, kcl statistics are maintained                                TRUE
_gcs_testing                   GCS testing parameter                                                 0
_gc_tsn_undo_affinity          if TRUE, use TSN undo affinity                                        TRUE
_gc_undo_affinity              if TRUE, enable dynamic undo affinity                                 TRUE
_gc_undo_affinity_locks        if TRUE, get affinity locks for undo                                  TRUE
_gc_use_cr                     if TRUE, allow CR pins on PI and WRITING buffers                      TRUE
_gc_vector_read                if TRUE, vector read current buffers                                  TRUE

仔细对比数据库参数,发现undo异常

--节点1
SQL>  show parameter undo
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
undo_management                      string      MANUAL
undo_retention                       integer     900
undo_tablespace                      string      SYSTEM
--节点2
SQL>  show parameter undo
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
undo_management                      string      AUTO
undo_retention                       integer     900
undo_tablespace                      string      UNDOTBS1

这里已经明确,因为两个节点的undo_*相关参数配置不正确,导致数据库只能一个节点mount。进一步定位问题发现,原来是因为dba粗心在编辑节点1的参数文件的时候把undo_*相关的参数给弄丢了,从而数据库使用了默认值undo_management=manual,undo_tablespace=system