分享一例由于主库逻辑坏块导致dataguard容灾失效

有朋友数据库配置了物理dataguard,但是由于触发了某种Oracle bug/或者其他原因导致block出现大量逻辑坏块,结果是主备库(主库启动后就crash,备库failover后也启动后就crash)都无法正常工作,请求给予技术支持。
数据库配置了物理dataguard,但是主库在进行了某些操作之后,主库直接crash.重启主库发现,数据库启动之后,稍后数据库继续CRASH

Wed Jul 08 16:32:22 2015
Thread 1 advanced to log sequence 401531 (LGWR switch)
  Current log# 6 seq# 401531 mem# 0: /opt/oracle/database/fast_recovery_area/xifenfei/onlinelog/o1_mf_6_b9p7qhrm_.log
Archived Log entry 605194 added for thread 1 sequence 401530 ID 0xfbe74bc9 dest 1:
ARC3: Standby redo logfile selected for thread 1 sequence 401530 for destination LOG_ARCHIVE_DEST_2
******************************************************************
LGWR: Setting 'active' archival for destination LOG_ARCHIVE_DEST_2
******************************************************************
Wed Jul 08 16:33:02 2015
Errors in file /opt/oracle/database/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_p032_413611.trc  (incident=201910):
ORA-00600: internal error code, arguments: [17114], [0x7FB03BB6DBD0], [], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [17182], [0x7FB03BB6DBF8], [], [], [], [], [], [], [], [], [], []
Incident details in: /opt/oracle/database/diag/rdbms/xifenfei/xifenfei/incident/incdir_201910/xifenfei_p032_413611_i201910.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file /opt/oracle/database/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_p032_413611.trc  (incident=201911):
ORA-00600: internal error code, arguments: [kghfrempty:ds], [0x7FB03BB6DBE8], [], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [17114], [0x7FB03BB6DBD0], [], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [17182], [0x7FB03BB6DBF8], [], [], [], [], [], [], [], [], [], []
Incident details in: /opt/oracle/database/diag/rdbms/xifenfei/xifenfei/incident/incdir_201911/xifenfei_p032_413611_i201911.trc
Wed Jul 08 16:34:23 2015
SMON: slave died unexpectedly, downgrading to serial recovery
Errors in file /opt/oracle/database/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_smon_413207.trc  (incident=201547):
ORA-00600: internal error code, arguments: [17182], [0x7F24D680D7A0], [], [], [], [], [], [], [], [], [], []
Incident details in: /opt/oracle/database/diag/rdbms/xifenfei/xifenfei/incident/incdir_201547/xifenfei_smon_413207_i201547.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Block recovery from logseq 401530, block 119 to scn 73226510040
Recovery of Online Redo Log: Thread 1 Group 5 Seq 401530 Reading mem 0
  Mem# 0: /opt/oracle/database/fast_recovery_area/xifenfei/onlinelog/o1_mf_5_b9p7qhl0_.log
Recovery of Online Redo Log: Thread 1 Group 6 Seq 401531 Reading mem 0
  Mem# 0: /opt/oracle/database/fast_recovery_area/xifenfei/onlinelog/o1_mf_6_b9p7qhrm_.log
Block recovery completed at rba 401531.1882.16, scn 17.212066009
ORACLE Instance xifenfei (pid = 16) - Error 600 encountered while recovering transaction (13, 14) on object 135520.
Errors in file /opt/oracle/database/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_smon_413207.trc:
ORA-00600: internal error code, arguments: [17182], [0x7F24D680D7A0], [], [], [], [], [], [], [], [], [], []
Wed Jul 08 16:34:24 2015
Dumping diagnostic data in directory=[cdmp_20150708163424], requested by (instance=1, osid=413207 (SMON)), summary=[incident=201547].
Errors in file /opt/oracle/database/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_smon_413207.trc  (incident=201548):
ORA-00600: internal error code, arguments: [KSMFPG2], [0x7F24D680D000], [], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [17182], [0x7F24D680D7A0], [], [], [], [], [], [], [], [], [], []
Incident details in: /opt/oracle/database/diag/rdbms/xifenfei/xifenfei/incident/incdir_201548/xifenfei_smon_413207_i201548.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Fatal internal error happened while SMON was doing active transaction recovery.
Errors in file /opt/oracle/database/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_smon_413207.trc:
ORA-00600: internal error code, arguments: [KSMFPG2], [0x7F24D680D000], [], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [17182], [0x7F24D680D7A0], [], [], [], [], [], [], [], [], [], []
SMON (ospid: 413207): terminating the instance due to error 474
System state dump requested by (instance=1, osid=413207 (SMON)), summary=[abnormal instance termination].
System State dumped to trace file /opt/oracle/database/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_diag_413167_20150708163426.trc
Dumping diagnostic data in directory=[cdmp_20150708163426], requested by (instance=1, osid=413207 (SMON)), summary=[abnormal instance termination].
Instance terminated by SMON, pid = 413207

由于主库不能正常open,备库直接failover方式激活

Wed Jul 08 17:56:41 2015
alter database recover managed standby database finish
Terminal Recovery: request posted (xffdb)
Wed Jul 08 17:56:45 2015
Begin: Standby Redo Logfile archival
End: Standby Redo Logfile archival
Terminal Recovery timestamp is '07/08/2015 17:56:45'
Terminal Recovery: applying standby redo logs.
Terminal Recovery: thread 1 seq# 401533 redo required
Terminal Recovery:
Recovery of Online Redo Log: Thread 1 Group 7 Seq 401533 Reading mem 0
  Mem# 0: /opt/oracle/database/fast_recovery_area/DBHLDB/onlinelog/o1_mf_7_bb7x31lk_.log
Identified End-Of-Redo (failover) for thread 1 sequence 401533 at SCN 0xffff.ffffffff
Incomplete Recovery applied until change 73226530800 time 07/08/2015 16:56:40
Terminal Recovery: successful completion
Wed Jul 08 17:56:45 2015
ARCH: Archival stopped, error occurred. Will continue retrying
Forcing ARSCN to IRSCN for TR 17:212086768
ORACLE Instance xffdb - Archival Error
Attempt to set limbo arscn 17:212086768 irscn 17:212086768 ORA-16014: log 7 sequence# 401533 not archived, no available destinations
ORA-00312: online log 7 thread 1: '/opt/oracle/database/fast_recovery_area/DBHLDB/onlinelog/o1_mf_7_bb7x31lk_.log'
Resetting standby activation ID 4226239433 (0xfbe74bc9)
Wed Jul 08 17:56:45 2015
MRP0: Media Recovery Complete (xffdb)
MRP0: Background Media Recovery process shutdown (xffdb)
Terminal Recovery: completion detected (xffdb)
Completed: alter database recover managed standby database finish
Wed Jul 08 17:56:58 2015
alter database commit to switchover to primary
ALTER DATABASE SWITCHOVER TO PRIMARY (xffdb)
Maximum wait for role transition is 15 minutes.
Backup controlfile written to trace file /opt/oracle/database/diag/rdbms/xffdb/xffdb/trace/xffdb_ora_485094.trc
Standby terminal recovery start SCN: 73226530482
RESETLOGS after incomplete recovery UNTIL CHANGE 73226530800
Online logfile pre-clearing operation disabled by switchover
Online log /opt/oracle/database/fast_recovery_area/DBHLDB/onlinelog/o1_mf_1_bb7x30gw_.log: Thread 1 Group 1 was previously cleared
Online log /opt/oracle/database/fast_recovery_area/DBHLDB/onlinelog/o1_mf_2_bb7x30js_.log: Thread 1 Group 2 was previously cleared
Online log /opt/oracle/database/fast_recovery_area/DBHLDB/onlinelog/o1_mf_3_bb7x310q_.log: Thread 1 Group 3 was previously cleared
Online log /opt/oracle/database/fast_recovery_area/DBHLDB/onlinelog/o1_mf_4_bb7x312r_.log: Thread 1 Group 4 was previously cleared
Online log /opt/oracle/database/fast_recovery_area/DBHLDB/onlinelog/o1_mf_5_bb7x317f_.log: Thread 1 Group 5 was previously cleared
Online log /opt/oracle/database/fast_recovery_area/DBHLDB/onlinelog/o1_mf_6_bb7x31cz_.log: Thread 1 Group 6 was previously cleared
Standby became primary SCN: 73226530481
Wed Jul 08 17:56:58 2015
Setting recovery target incarnation to 3
Switchover: Complete - Database mounted as primary
Completed: alter database commit to switchover to primary

数据库激活成功后,重启激活之后数据库发现和主库出现类似情况

Wed Jul 08 17:57:25 2015
ALTER DATABASE   MOUNT
Successful mount of redo thread 1, with mount id 4243462021
Database mounted in Exclusive Mode
Lost write protection disabled
Completed: ALTER DATABASE   MOUNT
Wed Jul 08 17:57:30 2015
ALTER DATABASE OPEN
Assigning activation ID 4243462021 (0xfcee1785)
LGWR: STARTING ARCH PROCESSES
Wed Jul 08 17:57:30 2015
ARC0 started with pid=23, OS id=485230
ARC0: Archival started
LGWR: STARTING ARCH PROCESSES COMPLETE
ARC0: STARTING ARCH PROCESSES
Thread 1 advanced to log sequence 2 (thread open)
Wed Jul 08 17:57:31 2015
ARC1 started with pid=24, OS id=485236
Wed Jul 08 17:57:31 2015
ARC2 started with pid=25, OS id=485240
Wed Jul 08 17:57:31 2015
ARC3 started with pid=26, OS id=485244
ARC1: Archival started
ARC2: Archival started
ARC1: Becoming the 'no FAL' ARCH
ARC1: Becoming the 'no SRL' ARCH
ARC2: Becoming the heartbeat ARCH
Thread 1 opened at log sequence 2
  Current log# 2 seq# 2 mem# 0: /opt/oracle/database/fast_recovery_area/DBHLDB/onlinelog/o1_mf_2_bb7x30js_.log
Successful open of redo thread 1
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
SMON: enabling cache recovery
Wed Jul 08 17:57:31 2015
NSA2 started with pid=27, OS id=485248
[485226] Successfully onlined Undo Tablespace 2.
Undo initialization finished serial:0 start:3453451748 end:3453452018 diff:270 (2 seconds)
Dictionary check beginning
Dictionary check complete
Verifying file header compatibility for 11g tablespace encryption..
Verifying 11g file header compatibility for tablespace encryption completed
SMON: enabling tx recovery
Database Characterset is AL32UTF8
No Resource Manager plan active
replication_dependency_tracking turned off (no async multimaster replication found)
Wed Jul 08 17:57:32 2015
Errors in file /opt/oracle/database/diag/rdbms/xffdb/xffdb/trace/xffdb_p032_485380.trc  (incident=64481):
ORA-00600: internal error code, arguments: [17182], [0x7FE96B50DBF8], [], [], [], [], [], [], [], [], [], []
Incident details in: /opt/oracle/database/diag/rdbms/xffdb/xffdb/incident/incdir_64481/xffdb_p032_485380_i64481.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Starting background process QMNC
Wed Jul 08 17:57:32 2015
QMNC started with pid=92, OS id=485512
LOGSTDBY: Validating controlfile with logical metadata
LOGSTDBY: Validation complete
Dumping diagnostic data in directory=[cdmp_20150708175733], requested by (instance=1, osid=485380 (P032)), summary=[incident=64481].
Thread 1 advanced to log sequence 3 (LGWR switch)
  Current log# 3 seq# 3 mem# 0: /opt/oracle/database/fast_recovery_area/DBHLDB/onlinelog/o1_mf_3_bb7x310q_.log
ARC3: STARTING ARCH PROCESSES
Wed Jul 08 17:57:34 2015
ARC4 started with pid=93, OS id=485516
Wed Jul 08 17:57:35 2015
db_recovery_file_dest_size of 204800 MB is 0.41% used. This is a
user-specified limit on the amount of space that will be used by this
database for recovery-related files, and does not reflect the amount of
space available in the underlying filesystem or ASM diskgroup.
ARC4: Archival started
ARC3: STARTING ARCH PROCESSES COMPLETE
krsk_srl_archive_int: Enabling archival of deferred physical standby SRLs
Archived Log entry 273963 added for thread 1 sequence 2 ID 0xfcee1785 dest 1:
Archived Log entry 273964 added for thread 1 sequence 401533 ID 0xfbe74bc9 dest 1:
Completed: ALTER DATABASE OPEN
Wed Jul 08 17:57:36 2015
Starting background process CJQ0
Wed Jul 08 17:57:36 2015
CJQ0 started with pid=95, OS id=485554
Shutting down archive processes
ARCH shutting down
ARC4: Archival stopped
Wed Jul 08 17:57:41 2015
Thread 1 advanced to log sequence 4 (LGWR switch)
  Current log# 4 seq# 4 mem# 0: /opt/oracle/database/fast_recovery_area/DBHLDB/onlinelog/o1_mf_4_bb7x312r_.log
Wed Jul 08 17:57:41 2015
Archived Log entry 273965 added for thread 1 sequence 3 ID 0xfcee1785 dest 1:
Wed Jul 08 17:58:30 2015
Sweep [inc][64481]: completed
Sweep [inc2][64481]: completed
Wed Jul 08 17:58:31 2015
Errors in file /opt/oracle/database/diag/rdbms/xffdb/xffdb/trace/xffdb_p032_485380.trc  (incident=64482):
ORA-00600: internal error code, arguments: [17114], [0x7FE96B50DBD0], [], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [17182], [0x7FE96B50DBF8], [], [], [], [], [], [], [], [], [], []
Incident details in: /opt/oracle/database/diag/rdbms/xffdb/xffdb/incident/incdir_64482/xffdb_p032_485380_i64482.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Wed Jul 08 17:58:32 2015
Dumping diagnostic data in directory=[cdmp_20150708175832], requested by (instance=1, osid=485380 (P032)), summary=[incident=64482].
Wed Jul 08 17:58:36 2015
Thread 1 advanced to log sequence 5 (LGWR switch)
  Current log# 5 seq# 5 mem# 0: /opt/oracle/database/fast_recovery_area/DBHLDB/onlinelog/o1_mf_5_bb7x317f_.log
Wed Jul 08 17:59:02 2015
Errors in file /opt/oracle/database/diag/rdbms/xffdb/xffdb/trace/xffdb_p032_485380.trc  (incident=64483):
ORA-00600: internal error code, arguments: [17114], [0x7FE96B50DBD0], [], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [17114], [0x7FE96B50DBD0], [], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [17182], [0x7FE96B50DBF8], [], [], [], [], [], [], [], [], [], []
Incident details in: /opt/oracle/database/diag/rdbms/xffdb/xffdb/incident/incdir_64483/xffdb_p032_485380_i64483.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Wed Jul 08 17:59:04 2015
Dumping diagnostic data in directory=[cdmp_20150708175904], requested by (instance=1, osid=485380 (P032)), summary=[incident=64483].
Wed Jul 08 17:59:29 2015
Sweep [inc][64483]: completed
Sweep [inc][64482]: completed
Sweep [inc2][64483]: completed
Sweep [inc2][64482]: completed
Wed Jul 08 17:59:30 2015
Block recovery from logseq 2, block 104 to scn 73226531646
Recovery of Online Redo Log: Thread 1 Group 2 Seq 2 Reading mem 0
  Mem# 0: /opt/oracle/database/fast_recovery_area/DBHLDB/onlinelog/o1_mf_2_bb7x30js_.log
Recovery of Online Redo Log: Thread 1 Group 3 Seq 3 Reading mem 0
  Mem# 0: /opt/oracle/database/fast_recovery_area/DBHLDB/onlinelog/o1_mf_3_bb7x310q_.log
Recovery of Online Redo Log: Thread 1 Group 4 Seq 4 Reading mem 0
  Mem# 0: /opt/oracle/database/fast_recovery_area/DBHLDB/onlinelog/o1_mf_4_bb7x312r_.log
Recovery of Online Redo Log: Thread 1 Group 5 Seq 5 Reading mem 0
  Mem# 0: /opt/oracle/database/fast_recovery_area/DBHLDB/onlinelog/o1_mf_5_bb7x317f_.log
Block recovery stopped at EOT rba 5.765.16
Block recovery completed at rba 5.765.16, scn 17.212087614
Errors in file /opt/oracle/database/diag/rdbms/xffdb/xffdb/trace/xffdb_p032_485380.trc  (incident=64484):
ORA-00600: internal error code, arguments: [KSMFPG2], [0x7FE96B50D000], [], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [17114], [0x7FE96B50DBD0], [], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [17114], [0x7FE96B50DBD0], [], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [17182], [0x7FE96B50DBF8], [], [], [], [], [], [], [], [], [], []
Incident details in: /opt/oracle/database/diag/rdbms/xffdb/xffdb/incident/incdir_64484/xffdb_p032_485380_i64484.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Dumping diagnostic data in directory=[cdmp_20150708175934], requested by (instance=1, osid=485380 (P032)), summary=[incident=64487].
Wed Jul 08 17:59:36 2015
SMON: slave died unexpectedly, downgrading to serial recovery
Errors in file /opt/oracle/database/diag/rdbms/xffdb/xffdb/trace/xffdb_smon_485190.trc  (incident=64129):
ORA-00600: internal error code, arguments: [17182], [0x7F5AED10D7A0], [], [], [], [], [], [], [], [], [], []
Incident details in: /opt/oracle/database/diag/rdbms/xffdb/xffdb/incident/incdir_64129/xffdb_smon_485190_i64129.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Block recovery from logseq 2, block 104 to scn 73226531646
Recovery of Online Redo Log: Thread 1 Group 2 Seq 2 Reading mem 0
  Mem# 0: /opt/oracle/database/fast_recovery_area/DBHLDB/onlinelog/o1_mf_2_bb7x30js_.log
Recovery of Online Redo Log: Thread 1 Group 3 Seq 3 Reading mem 0
  Mem# 0: /opt/oracle/database/fast_recovery_area/DBHLDB/onlinelog/o1_mf_3_bb7x310q_.log
Recovery of Online Redo Log: Thread 1 Group 4 Seq 4 Reading mem 0
  Mem# 0: /opt/oracle/database/fast_recovery_area/DBHLDB/onlinelog/o1_mf_4_bb7x312r_.log
Recovery of Online Redo Log: Thread 1 Group 5 Seq 5 Reading mem 0
  Mem# 0: /opt/oracle/database/fast_recovery_area/DBHLDB/onlinelog/o1_mf_5_bb7x317f_.log
Block recovery completed at rba 5.765.16, scn 17.212087615
ORACLE Instance xffdb (pid = 16) - Error 600 encountered while recovering transaction (13, 14) on object 135520.
Errors in file /opt/oracle/database/diag/rdbms/xffdb/xffdb/trace/xffdb_smon_485190.trc:
ORA-00600: internal error code, arguments: [17182], [0x7F5AED10D7A0], [], [], [], [], [], [], [], [], [], []
Dumping diagnostic data in directory=[cdmp_20150708175937], requested by (instance=1, osid=485190 (SMON)), summary=[incident=64129].
Errors in file /opt/oracle/database/diag/rdbms/xffdb/xffdb/trace/xffdb_smon_485190.trc  (incident=64130):
ORA-00600: internal error code, arguments: [KSMFPG2], [0x7F5AED10D000], [], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [17182], [0x7F5AED10D7A0], [], [], [], [], [], [], [], [], [], []
Incident details in: /opt/oracle/database/diag/rdbms/xffdb/xffdb/incident/incdir_64130/xffdb_smon_485190_i64130.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Fatal internal error happened while SMON was doing active transaction recovery.
Errors in file /opt/oracle/database/diag/rdbms/xffdb/xffdb/trace/xffdb_smon_485190.trc:
ORA-00600: internal error code, arguments: [KSMFPG2], [0x7F5AED10D000], [], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [17182], [0x7F5AED10D7A0], [], [], [], [], [], [], [], [], [], []
SMON (ospid: 485190): terminating the instance due to error 474
System state dump requested by (instance=1, osid=485190 (SMON)), summary=[abnormal instance termination].
System State dumped to trace file /opt/oracle/database/diag/rdbms/xffdb/xffdb/trace/xffdb_diag_485150_20150708175939.trc
Dumping diagnostic data in directory=[cdmp_20150708175939], requested by (instance=1, osid=485190 (SMON)), summary=[abnormal instance termination].
Instance terminated by SMON, pid = 485190

通过进一步分析确定是67号文件有异常,使用dbv检查该文件发现

DBVERIFY: Release 11.2.0.4.0 - Production on Thu Jul 9 11:08:27 2015
Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
DBVERIFY - Verification starting : FILE = /opt/oracle/database/oradata/xffdb/dk_adv_ts_main_25.dbf
Block Checking: DBA = 283035270, Block Type = KTB-managed data block
data header at 0x7f87813ea064
kdbchk: the amount of space used is not equal to block size
        used=7383 fsc=0 avsp=873 dtl=8088
Page 2016902 failed with check code 6110
Block Checking: DBA = 283035271, Block Type = KTB-managed data block
data header at 0x7f87813ec064
kdbchk: the amount of space used is not equal to block size
        used=8676 fsc=0 avsp=832 dtl=8088
Page 2016903 failed with check code 6110
Block Checking: DBA = 283035272, Block Type = KTB-managed data block
data header at 0x7f87813ee064
kdbchk: avsp(1306) > tosp(894)
Page 2016904 failed with check code 6128
Block Checking: DBA = 283035273, Block Type = KTB-managed data block
data header at 0x7f87813f0064
kdbchk: the amount of space used is not equal to block size
        used=7506 fsc=0 avsp=815 dtl=8088
Page 2016905 failed with check code 6110
Block Checking: DBA = 283035274, Block Type = KTB-managed data block
data header at 0x7f87813f2064
kdbchk: the amount of space used is not equal to block size
        used=7892 fsc=0 avsp=884 dtl=8088
Page 2016906 failed with check code 6110
…………
Block Checking: DBA = 283035368, Block Type = KTB-managed data block
data header at 0x7f87814ae064
kdbchk: the amount of space used is not equal to block size
        used=7934 fsc=0 avsp=837 dtl=8088
Page 2017000 failed with check code 6110
Block Checking: DBA = 283035369, Block Type = KTB-managed data block
data header at 0x7f87814b0064
kdbchk: the amount of space used is not equal to block size
        used=7683 fsc=0 avsp=883 dtl=8088
Page 2017001 failed with check code 6110
Block Checking: DBA = 283035370, Block Type = KTB-managed data block
data header at 0x7f87814b2064
kdbchk: the amount of space used is not equal to block size
        used=8556 fsc=0 avsp=841 dtl=8088
Page 2017002 failed with check code 6110
Block Checking: DBA = 283035371, Block Type = KTB-managed data block
data header at 0x7f87814b4064
kdbchk: the amount of space used is not equal to block size
        used=7460 fsc=0 avsp=822 dtl=8088
Page 2017003 failed with check code 6110
DBVERIFY - Verification complete
Total Pages Examined         : 4063232
Total Pages Processed (Data) : 2724435
Total Pages Failing   (Data) : 102
Total Pages Processed (Index): 1064839
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 273957
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 1
Total Pages Marked Corrupt   : 0
Total Pages Influx           : 0
Total Pages Encrypted        : 0
Highest block SCN            : 212065913 (17.212065913)

把datafile 67给offline之后,数据库open正常,也不再crash。这里比较明显,导致主库和被动都出现异常,主库直接crash,然后重启后,一会儿就crash;备库激活后,重启后一会儿也就crash;两个库现象相同。最后通过dbv定位到是由于某个文件出现大量逻辑坏块,导致数据库open之后,进行回滚之时crash.对于此类故障,可以通过屏蔽事务回滚,并且通过alert日志和trace文件定位到异常对象,可以删除异常对象可以暂时解决该问题。而导致数据库出现类似问题。
通过这个案例,可以的出来,由于oracle某种bug或者其他原因,导致block 逻辑损坏,而且这个损坏会传输到备库,导致备库也异常,oracle的备份,容灾不能全部依赖物理dataguard容灾。因此在条件允许情况下,建议增加物理备份和逻辑容灾(类似OGG)

在数据库open过程中常遇到ORA-01555汇总

在数据库open的过程中,select ctime, mtime, stime from obj$ where obj# = :1语句报ORA-01555错误,数据库无法正常open
一般情况下会报某个回滚段,但是这里ORA-01555: snapshot too old: rollback segment number 0 with name “SYSTEM” too small这里直接报了system(系统回滚段),属于少见情况

Fri Jun 26 11:47:31 2015
SMON: enabling cache recovery
Fri Jun 26 11:47:31 2015
ORA-01555 caused by SQL statement below (SQL ID: 4krwuz0ctqxdt, SCN: 0x0b41.37629378):
Fri Jun 26 11:47:31 2015
select ctime, mtime, stime from obj$ where obj# = :1
Fri Jun 26 11:47:31 2015
Errors in file /orabin/admin/doocrm/udump/doocrm_ora_5046722.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 1
ORA-01555: snapshot too old: rollback segment number 0 with name "SYSTEM" too small
Error 704 happened during db open, shutting down database
USER: terminating instance due to error 704
Instance terminated by USER, pid = 5046722
ORA-1092 signalled during: alter database open resetlogs...

在数据库open的过程中,select increment$,minvalue,maxvalue,cycle#,order$,cache,highwater,audit$,flags from seq$ where obj#=:1语句上报ORA-01555,导致数据库open失败
ORA-01555


在数据库open过程中,select /*+ rule */ name,file#,block#,status$,user#,undosqn,xactsqn,scnbas,scnwrp,DECODE(inst#,0,NULL,inst#),ts#,spare1 from undo$ where us#=:1语句导致数据库open失败.

ARC0: Becoming the 'no SRL' ARCH
Sun Jun 28 16:08:22 2015
ARC1: Becoming the heartbeat ARCH
Sun Jun 28 16:08:22 2015
SMON: enabling cache recovery
Sun Jun 28 16:08:22 2015
ORA-01555 caused by SQL statement below (SQL ID: 7bd391hat42zk, Query Duration=0 sec, SCN: 0x0d27.0a1ce29d):
Sun Jun 28 16:08:22 2015
select /*+ rule */ name,file#,block#,status$,user#,undosqn,xactsqn,scnbas,scnwrp,
   DECODE(inst#,0,NULL,inst#),ts#,spare1 from undo$ where us#=:1
Sun Jun 28 16:08:22 2015
Errors in file /oracle/app/oracle/admin/ibsscrm/udump/xxxx_ora_30212428.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-01555: snapshot too old: rollback segment number 0 with name "SYSTEM" too small
Error 604 happened during db open, shutting down database
USER: terminating instance due to error 604
Instance terminated by USER, pid = 30212428
ORA-1092 signalled during: alter database open...

出现这类问题,一般是由于obj$,seq$,undo$等基表上对象scn大于数据库当前scn或者是由于这些表上有事务未提交,出现上述两种情况,数据库需要找对应的undo的回滚段中记录,而此时对应的回滚段异常(或者是由于redo未进行正常前滚,导致上述对象或者回滚段记录不正常),从而出现类似情况,一般出现此类情况,可以通过10046定位到block,然后故障原因采用bbed修改scn或者bbed提交事务来解决此类问题.
最近两年的恢复中又遇到一些ora-00704 ora-00604 ora-01555的错误,导致数据库无法正常open,对其进行补充,具体参见:数据库open过程遭遇ORA-1555对应sql语句补充

误修改/u01权限/所有者的故障恢复

有朋友找到我,说他对生产库做了误操作,导致数据库异常,请我帮忙处理,对/u01目录修改了用户和权限,导致数据库无法登录,但是业务还在继续
误操作命令

mkdir -p /u01/app/grid
mkdir -p /u01/app/11.2.0/grid
chown -R grid:oinstall /u01
mkdir -p /u01/app/oracle
chown oracle:oinstall /u01/app/oracle
chmod -R 775 /u01

尝试sqlplus登录数据库报ORA-12547

[oracle@www.xifenfei.com admin]$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.3.0 Production on Wed Jul 1 17:40:42 2015
Copyright (c) 1982, 2011, Oracle.  All rights reserved.
ERROR:
ORA-12547: TNS:lost contact
Enter user-name:
ERROR:
ORA-12547: TNS:lost contact
Enter user-name:
ERROR:
ORA-12547: TNS:lost contact
SP2-0157: unable to CONNECT to ORACLE after 3 attempts, exiting SQL*Plus

数据库alert日志报错

Wed Jul 01 18:03:22 2015
Errors in file /u01/app/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_ora_28977.trc  (incident=129553):
ORA-00600: internal error code, arguments: [spstp: ORACLE_HOME uid does not match euid], [1100], [1101], [], [], [], [], [], [], [], [], []
Incident details in: /u01/app/oracle/diag/rdbms/xifenfei/xifenfei/incident/incdir_129553/xifenfei_ora_28977_i129553.trc
Errors in file /u01/app/oracle/diag/rdbms/xifenfei/xifenfei/incident/incdir_129553/xifenfei_ora_28977_i129553.trc:
ORA-00600: internal error code, arguments: [spstp: ORACLE_HOME uid does not match euid], [1100], [1101], [], [], [], [], [], [], [], [], []
Wed Jul 01 18:03:22 2015
Errors in file /u01/app/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_ora_28979.trc  (incident=129561):
ORA-00600: internal error code, arguments: [spstp: ORACLE_HOME uid does not match euid], [1100], [1101], [], [], [], [], [], [], [], [], []
Incident details in: /u01/app/oracle/diag/rdbms/xifenfei/xifenfei/incident/incdir_129561/xifenfei_ora_28979_i129561.trc
Errors in file /u01/app/oracle/diag/rdbms/xifenfei/xifenfei/incident/incdir_129561/xifenfei_ora_28979_i129561.trc:
ORA-00600: internal error code, arguments: [spstp: ORACLE_HOME uid does not match euid], [1100], [1101], [], [], [], [], [], [], [], [], []
Wed Jul 01 18:03:22 2015
Errors in file /u01/app/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_ora_28985.trc  (incident=129569):
ORA-00600: internal error code, arguments: [spstp: ORACLE_HOME uid does not match euid], [1100], [1101], [], [], [], [], [], [], [], [], []
Incident details in: /u01/app/oracle/diag/rdbms/xifenfei/xifenfei/incident/incdir_129569/xifenfei_ora_28985_i129569.trc
Errors in file /u01/app/oracle/diag/rdbms/xifenfei/xifenfei/incident/incdir_129569/xifenfei_ora_28985_i129569.trc:
ORA-00600: internal error code, arguments: [spstp: ORACLE_HOME uid does not match euid], [1100], [1101], [], [], [], [], [], [], [], [], []
Wed Jul 01 18:03:22 2015
Errors in file /u01/app/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_ora_28983.trc  (incident=129577):
ORA-00600: internal error code, arguments: [spstp: ORACLE_HOME uid does not match euid], [1100], [1101], [], [], [], [], [], [], [], [], []
Incident details in: /u01/app/oracle/diag/rdbms/xifenfei/xifenfei/incident/incdir_129577/xifenfei_ora_28983_i129577.trc
Errors in file /u01/app/oracle/diag/rdbms/xifenfei/xifenfei/incident/incdir_129577/xifenfei_ora_28983_i129577.trc:
ORA-00600: internal error code, arguments: [spstp: ORACLE_HOME uid does not match euid], [1100], [1101], [], [], [], [], [], [], [], [], []

trace文件信息

Dump file /u01/app/oracle/diag/rdbms/hybris01/hybris01/incident/incdir_129577/hybris01_ora_28983_i129577.trc
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
ORACLE_HOME = /u01/app/oracle/product/11.2.0/db_1
System name:	Linux
Node name:	dpppdridbo00
Release:	2.6.32-431.17.1.el6.x86_64
Version:	#1 SMP Wed May 7 23:32:49 UTC 2014
Machine:	x86_64
Instance name: hybris01
Redo thread mounted by this instance: 0 <none>
Oracle process number: 0
Unix process pid: 28983, image: oracle@dpppdridbo00
*** 2015-07-01 18:03:22.296
Dump continued from file: /u01/app/oracle/diag/rdbms/hybris01/hybris01/trace/hybris01_ora_28983.trc
ORA-00600: internal error code, arguments: [spstp: ORACLE_HOME uid does not match euid], [1100], [1101], [], [], [], [], [], [], [], [], []
========= Dump for incident 129577 (ORA 600 [spstp: ORACLE_HOME uid does not match euid]) ========
*** 2015-07-01 18:03:22.297
dbkedDefDump(): Starting incident default dumps (flags=0x0, level=3, mask=0x0)
----- Error Stack Dump -----
ORA-00600: internal error code, arguments: [spstp: ORACLE_HOME uid does not match euid], [1100], [1101], [], [], [], [], [], [], [], [], []
----- SQL Statement (None) -----
Current SQL information unavailable - no SGA.

相关用户名相关信息

[oracle@dpppdridbo00 incdir_129577]$ id grid
uid=1100(grid) gid=1000(oinstall) groups=1000(oinstall),1100(asmadmin),1300(asmdba),1301(asmoper)
[oracle@dpppdridbo00 incdir_129577]$ id oracle
uid=1101(oracle) gid=1000(oinstall) groups=1000(oinstall),1200(dba),1300(asmdba)

这里比较明显,数据库的oracle_home,的所有者id为1101,但是被修改为了1100

查看相关目录文件权限

[oracle@www.xifenfei.com ~]$ env|grep ORA
ORACLE_SID=xifenfei
ORACLE_BASE=/u01/app/oracle
ORACLE_TERM=xterm
ORACLE_HOME=/u01/app/oracle/product/11.2.0/db_1
[oracle@www.xifenfei.com ~]$ cd /u01/app/oracle/product/11.2.0/db_1
[oracle@www.xifenfei.com db_1]$ ls -ltr
total 308
-rwxrwxr-x.  1 grid oinstall    63 Mar  1 16:39 oraInst.loc
drwxrwxr-x.  8 grid oinstall  4096 Mar  1 16:40 assistants
drwxrwxr-x.  6 grid oinstall  4096 Mar  1 16:40 crs
drwxrwxr-x.  3 grid oinstall  4096 Mar  1 16:40 csmig
drwxrwxr-x.  7 grid oinstall  4096 Mar  1 16:40 cv
drwxrwxr-x.  3 grid oinstall  4096 Mar  1 16:40 demo
drwxrwxr-x.  3 grid oinstall  4096 Mar  1 16:40 diagnostics
drwxrwxr-x.  3 grid oinstall  4096 Mar  1 16:40 has
drwxrwxr-x.  3 grid oinstall  4096 Mar  1 16:40 emcli
drwxrwxr-x.  4 grid oinstall  4096 Mar  1 16:40 dv
drwxrwxr-x.  8 grid oinstall  4096 Mar  1 16:40 ide
drwxrwxr-x.  8 grid oinstall  4096 Mar  1 16:40 javavm
drwxrwxr-x.  4 grid oinstall  4096 Mar  1 16:40 jdev
drwxrwxr-x.  6 grid oinstall  4096 Mar  1 16:40 md
drwxrwxr-x.  6 grid oinstall  4096 Mar  1 16:40 nls
drwxrwxr-x.  6 grid oinstall  4096 Mar  1 16:40 odbc
drwxrwxr-x.  5 grid oinstall  4096 Mar  1 16:40 olap
drwxrwxr-x.  4 grid oinstall  4096 Mar  1 16:40 oracore
drwxrwxr-x.  8 grid oinstall  4096 Mar  1 16:40 ord
drwxrwxr-x.  4 grid oinstall  4096 Mar  1 16:40 owm
drwxrwxr-x.  7 grid oinstall  4096 Mar  1 16:40 precomp
drwxrwxr-x.  4 grid oinstall  4096 Mar  1 16:40 scheduler
drwxrwxr-x.  4 grid oinstall  4096 Mar  1 16:40 relnotes
drwxrwxr-x.  3 grid oinstall  4096 Mar  1 16:40 slax
drwxrwxr-x.  3 grid oinstall  4096 Mar  1 16:40 sqlj
drwxrwxr-x.  3 grid oinstall  4096 Mar  1 16:40 wwg
drwxrwxr-x.  3 grid oinstall  4096 Mar  1 16:40 usm
drwxrwxr-x.  4 grid oinstall  4096 Mar  1 16:40 uix
drwxrwxr-x.  6 grid oinstall  4096 Mar  1 16:40 plsql
drwxrwxr-x.  2 grid oinstall  4096 Mar  1 16:40 utl
drwxrwxr-x.  7 grid oinstall  4096 Mar  1 16:40 xdk
drwxrwxr-x.  2 grid oinstall  4096 Mar  1 16:40 instantclient
drwxrwxr-x. 20 grid oinstall  4096 Mar  1 16:40 oc4j
drwxrwxr-x.  2 grid oinstall  4096 Mar  1 16:40 timingframework
drwxrwxr-x.  4 grid oinstall  4096 Mar  1 16:41 clone
drwxrwxr-x.  4 grid oinstall  4096 Mar  1 16:41 j2ee
drwxrwxr-x.  8 grid oinstall  4096 Mar  1 16:41 apex
drwxrwxr-x. 13 grid oinstall  4096 Mar  1 16:41 sqldeveloper
drwxrwxr-x.  2 grid oinstall  4096 Mar  1 16:41 jlib
drwxrwxr-x.  2 grid oinstall  4096 Mar  1 16:41 dc_ocm
drwxrwxr-x.  6 grid oinstall  4096 Mar  1 16:41 jdk
drwxrwxr-x.  3 grid oinstall  4096 Mar  1 16:41 jdbc
drwxrwxr-x.  3 grid oinstall  4096 Mar  1 16:41 ucp
drwxrwxr-x.  7 grid oinstall  4096 Mar  1 16:41 OPatch
drwxrwxr-x.  7 grid oinstall  4096 Mar  1 16:41 ccr
drwxrwxr-x. 26 grid oinstall  4096 Mar  1 16:41 owb
drwxrwxr-x.  6 grid oinstall  4096 Mar  1 16:41 mgw
drwxrwxr-x.  7 grid oinstall  4096 Mar  1 16:41 opmn
drwxrwxr-x.  3 grid oinstall  4096 Mar  1 16:41 cdata
drwxrwxr-x.  6 grid oinstall  4096 Mar  1 16:41 css
drwxrwxr-x.  2 grid oinstall  4096 Mar  1 16:41 mesg
drwxrwxr-x.  2 grid oinstall  4096 Mar  1 16:41 config
drwxrwxr-x.  3 grid oinstall  4096 Mar  1 16:41 EMStage
drwxrwxr-x. 12 grid oinstall  4096 Mar  1 16:41 ldap
drwxrwxr-x. 15 grid oinstall  4096 Mar  1 16:41 sysman
drwxrwxr-x.  9 grid oinstall  4096 Mar  1 16:41 srvm
drwxrwxr-x.  7 grid oinstall  4096 Mar  1 16:41 racg
drwxrwxr-x. 10 grid oinstall  4096 Mar  1 16:41 ctx
drwxrwxr-x.  7 grid oinstall  4096 Mar  1 16:42 sqlplus
drwxrwxr-x.  8 grid oinstall  4096 Mar  1 16:42 oui
drwxrwxr-x.  4 grid oinstall 12288 Mar  1 16:42 lib
drwxrwxr-x.  5 grid oinstall  4096 Mar  1 16:42 perl
drwxrwxr-x. 11 grid oinstall  4096 Mar  1 16:42 network
drwxrwxr-x.  4 grid oinstall  4096 Mar  1 16:42 deinstall
drwxrwxr-x.  5 grid oinstall  4096 Mar  1 16:42 hs
-rwxrwxr-x.  1 grid oinstall   494 Mar  1 16:42 root.sh
drwxrwxr-x. 13 grid oinstall  4096 Mar  1 16:42 rdbms
drwxrwxr-x. 13 grid oinstall  4096 Mar  1 16:42 inventory
drwxrwxr-x.  4 grid oinstall  4096 Mar  1 16:42 cfgtoollogs
drwxrwxr-x.  7 grid oinstall  4096 Mar  1 16:44 install
drwxrwxr-x.  2 grid oinstall 12288 Mar  1 16:44 bin
drwxrwxr-x.  4 grid oinstall  4096 Mar  4 18:49 log
drwxrwxr-x.  2 grid oinstall  4096 Jun 30 22:31 dbs

处理方法

root用户
chown -R oracle:oinstall /u01
chown oracle:oinstall /u01/app/oracle
oracle用户
chmod 6751 $ORACLE_HOME/bin/oracle

这里的修改权限,为了保证业务运行正常,是尽量往大的方向修改的,如果条件允许,在后期有条件的情况下,建议重新安装oracle软件

ORA-00600[kjhn_post_ha_alert0-862]原因分析

数据库版本和平台信息
数据库版本为10.2.0.1版本,而且是32位的win 2003 sp2之上

ORACLE V10.2.0.1.0 - Production vsnsta=0
vsnsql=14 vsnxtr=3
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production
With the Partitioning, OLAP and Data Mining options
Windows Server 2003 Version V5.2 Service Pack 2
CPU                 : 2 - type 586, 1 Physical Cores
Process Affinity    : 0x00000000
Memory (Avail/Total): Ph:2608M/3990M, Ph+PgF:4511M/5871M, VA:1242M/2047M
Instance name: orcl

数据库报大量ORA-600[kjhn_post_ha_alert0-862]错误
数据库的mmon进程报大量ORA-00600: internal error code, arguments: [kjhn_post_ha_alert0-862], [], [], [], [], [], [], []错误

Wed Jun 03 21:50:40 2015
Restarting dead background process MMON
MMON started with pid=11, OS id=3804
Wed Jun 03 21:50:43 2015
Errors in file e:\oracle\product\10.2.0\admin\orcl\bdump\orcl_mmon_3804.trc:
ORA-00600: internal error code, arguments: [kjhn_post_ha_alert0-862], [], [], [], [], [], [], []
Wed Jun 03 21:50:49 2015
Errors in file e:\oracle\product\10.2.0\admin\orcl\bdump\orcl_mmon_3804.trc:
ORA-00600: internal error code, arguments: [kjhn_post_ha_alert0-862], [], [], [], [], [], [], []
Wed Jun 03 21:55:44 2015
Errors in file e:\oracle\product\10.2.0\admin\orcl\bdump\orcl_mmon_3804.trc:
ORA-00600: internal error code, arguments: [kjhn_post_ha_alert0-862], [], [], [], [], [], [], []
Wed Jun 03 21:55:49 2015
Errors in file e:\oracle\product\10.2.0\admin\orcl\bdump\orcl_mmon_3804.trc:
ORA-00600: internal error code, arguments: [kjhn_post_ha_alert0-862], [], [], [], [], [], [], []
Wed Jun 03 22:00:40 2015
Thread 1 advanced to log sequence 476
  Current log# 1 seq# 476 mem# 0: E:\ORACLE\PRODUCT\10.2.0\ORADATA\ORCL\REDO01.LOG
Wed Jun 03 22:00:44 2015
Errors in file e:\oracle\product\10.2.0\admin\orcl\bdump\orcl_mmon_3804.trc:
ORA-00600: internal error code, arguments: [kjhn_post_ha_alert0-862], [], [], [], [], [], [], []

查询对应trace文件发现

ORA-00600: internal error code, arguments: [kjhn_post_ha_alert0-862], [], [], [] , [], [], [], []
Current SQL statement for this session:
BEGIN :success := dbms_ha_alerts_prvt.check_ha_resources; END;

人工执行该过程

SQL> var success varchar2
SQL> begin
  2  :success := sys.dbms_ha_alerts_prvt.check_ha_resources;
  3  end;
  4  /
PL/SQL procedure successfully completed.
SQL> print success
SUCCESS
--------------------------------
N

通过查询相关资料得到如下说明

@ This check is triggered with FAN enabled at this instance and it seems to be
@ associated with a startup action. From the procedure itself which is called
@ this is a run-once MMON (startup) action which supports instance down
@ notification reliability. It does the folowing a) registers the current
@ instance incarnation in recent_resource_incarnations$ if it's not already
@ there b) deletes recent_resource_incarnations$ records that don't apply to
@ this database. They may, e.g., have been copied from seed db or from a former
@ DataGuard primary c) scans recent_resource_incarnations$ for instance
@ incarnations that are no longer alive, and submits instance down alerts for
@ them . If all is good then return 'Y' else 'N' (or error) if there is a
@ failure. That failure is to get back to MMON, so that it may retry this
@ action later. In the local instance I get a 'Y' but in the customer's system
@ it fails with a 'N' which seems related to the ORA-600 assert.
@ This function is kjhn_post_ha_alert0() which is internal and does the real work of
@ posting HA alerts. It is used by both kjhn_post_ha_alert and
@ kjn_post_ha_alert_plsql. Its parameters are basically the same as those of
@ kjhn_post_ha_alert,other than the fact that it uses individual parameters
@ rather than the more easily extensible structure. Also the parameters passed
@ to it are the instance_name and the host_name which is the kernelized
@ implementation for posting HA alerts. Without actually having the arguments
@ the guess is that either the host_name or the instance_name raised in the
@ assert is null which triggered it.

mmon进程尝试调用相关程序,然后无法得出正确值,返回N,然后会一直尝试,如果不能得到返回Y,就会一直报ORA-600,错误.通过上述的三种情况来说,都和recent_resource_incarnations$表有关系.
该故障原因是由于:mmon在调用kjhn_post_ha_alert0函数在执行的时候,如果发现参数host_name或者instance_name为null,就会报该错误出来.

处理方法
This problem has been documented as Bug 5173066 REPEATED ORA-600 [KJHN_POST_HA_ALERT0-862] FROM MMON PROCESS.
The bug is fixed in 11.1.0.6. A workaround is available for the problem.
该bug在11.1.0.6中得以修复

To implement the workaround, please execute the following steps as the SYS user:
1. Collect the following information and spool it to a file for your records.
a. output of select * from v$instance
b. show parameter instance_name
c. set pages 1000
d. select * from recent_resource_incarnations$
2. Create a backup table of recent_resource_incarnations$.
SQL> create table recent_resource_inc$bk as select * from recent_resource_incarnations$;
3. Truncate recent_resource_incarnations$. Be sure to do this while the instance is up and running.
    Do not issue this statement if a shutdown is pending.
SQL> truncate table recent_resource_incarnations$;
4. Perform a clean shutdown, followed by a startup.

具体参考:
ORA-600 [kjhn_post_ha_alert0-862] Continuously Repeated in the Alert Log (Doc ID 401640.1)
Bug 5173066 : REPEATED ORA-600 [KJHN_POST_HA_ALERT0-862] FROM MMON PROCESS

PostgreSQL简单操作之—创建库,登录,ddl,dml,help,登出,删除库

PostgreSQL创建数据库
使用shell级别的createdb命令创建xifenfei库

-bash-3.2$ createdb xifenfei

系统认证登录PostgreSQL数据库
使用psql登录PostgreSQL中的xifenfei数据库

-bash-3.2$ psql xifenfei
psql (9.4.4)
Type "help" for help.

PostgreSQL简单查询测试
通过查询版本,当前日期,简单加法等sql语句,测试PostgreSQL中的sql操作

xifenfei=#
xifenfei=# SELECT version();
                                                    version
---------------------------------------------------------------------------------------------------------------
 PostgreSQL 9.4.4 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-55), 64-bit
(1 row)
xifenfei=# SELECT current_date;
    date
------------
 2015-06-21
(1 row)
xifenfei=# SELECT 2 + 2;
 ?column?
----------
        4
(1 row)

PostgreSQL数据库简单ddl和dml测试
通过创建表,插入/更新/删除记录,删除表等操作

xifenfei=# create table t_xifenfei(id int,name varchar(100));
CREATE TABLE
xifenfei=# insert into t_xifenfei values(1,'www.xifenfei.com');
INSERT 0 1
xifenfei=# select * from t_xifenfei;
 id |       name
----+------------------
  1 | www.xifenfei.com
(1 row)
xifenfei=# insert into t_xifenfei values(2,'www.orasos.com');
INSERT 0 1
xifenfei=# select * from t_xifenfei;
 id |       name
----+------------------
  1 | www.xifenfei.com
  2 | www.orasos.com
(2 rows)
xifenfei=# update t_xifenfei set name='WWW.XIFENFEI.COM' WHERE ID=2;
UPDATE 1
xifenfei=#  select * from t_xifenfei;
 id |       name
----+------------------
  1 | www.xifenfei.com
  2 | WWW.XIFENFEI.COM
(2 rows)
xifenfei=# delete from t_xifenfei where id=2;
DELETE 1
xifenfei=# select * from t_xifenfei;
 id |       name
----+------------------
  1 | www.xifenfei.com
(1 row)
xifenfei=# drop table t_xifenfei;
DROP TABLE
xifenfei=# select * from t_xifenfei;
ERROR:  relation "t_xifenfei" does not exist
LINE 1: select * from t_xifenfei;
                      ^

PostgreSQL数据库帮助使用方法
PostgreSQL数据库使用\h命令来查看帮助

xifenfei=# \h
Available help:
  ABORT                            CLUSTER                          DECLARE                          EXPLAIN
  ALTER AGGREGATE                  COMMENT                          DELETE                           FETCH
  ALTER COLLATION                  COMMIT                           DISCARD                          GRANT
  ALTER CONVERSION                 COMMIT PREPARED                  DO                               INSERT
  ALTER DATABASE                   COPY                             DROP AGGREGATE                   LISTEN
  ALTER DEFAULT PRIVILEGES         CREATE AGGREGATE                 DROP CAST                        LOAD
  ALTER DOMAIN                     CREATE CAST                      DROP COLLATION                   LOCK
  ALTER EVENT TRIGGER              CREATE COLLATION                 DROP CONVERSION                  MOVE
  ALTER EXTENSION                  CREATE CONVERSION                DROP DATABASE                    NOTIFY
  ALTER FOREIGN DATA WRAPPER       CREATE DATABASE                  DROP DOMAIN                      PREPARE
  ALTER FOREIGN TABLE              CREATE DOMAIN                    DROP EVENT TRIGGER               PREPARE TRANSACTION
  ALTER FUNCTION                   CREATE EVENT TRIGGER             DROP EXTENSION                   REASSIGN OWNED
  ALTER GROUP                      CREATE EXTENSION                 DROP FOREIGN DATA WRAPPER        REFRESH MATERIALIZED VIEW
  ALTER INDEX                      CREATE FOREIGN DATA WRAPPER      DROP FOREIGN TABLE               REINDEX
  ALTER LANGUAGE                   CREATE FOREIGN TABLE             DROP FUNCTION                    RELEASE SAVEPOINT
  ALTER LARGE OBJECT               CREATE FUNCTION                  DROP GROUP                       RESET
  ALTER MATERIALIZED VIEW          CREATE GROUP                     DROP INDEX                       REVOKE
  ALTER OPERATOR                   CREATE INDEX                     DROP LANGUAGE                    ROLLBACK
  ALTER OPERATOR CLASS             CREATE LANGUAGE                  DROP MATERIALIZED VIEW           ROLLBACK PREPARED
  ALTER OPERATOR FAMILY            CREATE MATERIALIZED VIEW         DROP OPERATOR                    ROLLBACK TO SAVEPOINT
  ALTER ROLE                       CREATE OPERATOR                  DROP OPERATOR CLASS              SAVEPOINT
  ALTER RULE                       CREATE OPERATOR CLASS            DROP OPERATOR FAMILY             SECURITY LABEL
  ALTER SCHEMA                     CREATE OPERATOR FAMILY           DROP OWNED                       SELECT
  ALTER SEQUENCE                   CREATE ROLE                      DROP ROLE                        SELECT INTO
  ALTER SERVER                     CREATE RULE                      DROP RULE                        SET
  ALTER SYSTEM                     CREATE SCHEMA                    DROP SCHEMA                      SET CONSTRAINTS
  ALTER TABLE                      CREATE SEQUENCE                  DROP SEQUENCE                    SET ROLE
  ALTER TABLESPACE                 CREATE SERVER                    DROP SERVER                      SET SESSION AUTHORIZATION
  ALTER TEXT SEARCH CONFIGURATION  CREATE TABLE                     DROP TABLE                       SET TRANSACTION
  ALTER TEXT SEARCH DICTIONARY     CREATE TABLE AS                  DROP TABLESPACE                  SHOW
  ALTER TEXT SEARCH PARSER         CREATE TABLESPACE                DROP TEXT SEARCH CONFIGURATION   START TRANSACTION
  ALTER TEXT SEARCH TEMPLATE       CREATE TEXT SEARCH CONFIGURATION DROP TEXT SEARCH DICTIONARY      TABLE
  ALTER TRIGGER                    CREATE TEXT SEARCH DICTIONARY    DROP TEXT SEARCH PARSER          TRUNCATE
  ALTER TYPE                       CREATE TEXT SEARCH PARSER        DROP TEXT SEARCH TEMPLATE        UNLISTEN
  ALTER USER                       CREATE TEXT SEARCH TEMPLATE      DROP TRIGGER                     UPDATE
  ALTER USER MAPPING               CREATE TRIGGER                   DROP TYPE                        VACUUM
  ALTER VIEW                       CREATE TYPE                      DROP USER                        VALUES
  ANALYZE                          CREATE USER                      DROP USER MAPPING                WITH
  BEGIN                            CREATE USER MAPPING              DROP VIEW
  CHECKPOINT                       CREATE VIEW                      END
  CLOSE                            DEALLOCATE                       EXECUTE
xifenfei-# \h CREATE TABLE AS
Command:     CREATE TABLE AS
Description: define a new table from the results of a query
Syntax:
CREATE [ [ GLOBAL | LOCAL ] { TEMPORARY | TEMP } | UNLOGGED ] TABLE table_name
    [ (column_name [, ...] ) ]
    [ WITH ( storage_parameter [= value] [, ... ] ) | WITH OIDS | WITHOUT OIDS ]
    [ ON COMMIT { PRESERVE ROWS | DELETE ROWS | DROP } ]
    [ TABLESPACE tablespace_name ]
    AS query
    [ WITH [ NO ] DATA ]

PostgreSQL退出登录

xifenfei-# \q
-bash-3.2$

PostgreSQL删除数据库

-bash-3.2$ dropdb xifenfei
-bash-3.2$ psql xifenfei
psql: FATAL:  database "xifenfei" does not exist

在linux上安装PostgreSQL 9.4并启动和关闭数据库

PostgreSQL是以加州大学伯克利分校计算机系开发的 POSTGRES, Version 4.2为基础的对象关系型数据库管理系统(ORDBMS)。POSTGRES开创的许多概念在很久以后才出现在商业数据库中。PostgreSQL是最初伯克利代码的一个开放源码的继承者。它支持大部分SQL标准并且提供了许多其它现代特性:
复杂查询
外键
触发器
可更新的视图
事务完整性
多版本并发控制

另外,PostgreSQL可以用许多方法进行扩展,比如通过增加新的:
数据类型
函数
操作符
聚合函数
索引方法
过程语言
在个人看来,在开源数据库中PostgreSQL 是和ORACLE最相近的一个,和ORACLE兼容性较好,如果去IOE,该数据库是一个不错的选择

操作系统版本

[root@web103 ~]# more /etc/issue
CentOS release 5.9 (Final)
Kernel \r on an \m
[root@web103 ~]# uname -a
Linux web103 2.6.18-348.el5 #1 SMP Tue Jan 8 17:53:53 EST 2013 x86_64 x86_64 x86_64 GNU/Linux

下载对应PostgreSQL对应rpm包
因为操作系统版本为CentOS 5.9的64位Linux,因此下载对应版本prm包,主要下载了server,client,contrib,libs四个包

[root@web103 ~]# mkdir pg
[root@web103 ~]# cd pg
[root@web103 pg]# wget http://yum.postgresql.org/9.4/redhat/rhel-5-x86_64/postgresql94-9.4.4-1PGDG.rhel5.x86_64.rpm
--2015-06-16 20:44:52--  http://yum.postgresql.org/9.4/redhat/rhel-5-x86_64/postgresql94-9.4.4-1PGDG.rhel5.x86_64.rpm
Resolving yum.postgresql.org... 174.143.35.196, 2001:4800:1501:1::196
Connecting to yum.postgresql.org|174.143.35.196|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1807607 (1.7M) [application/x-redhat-package-manager]
Saving to: `postgresql94-9.4.4-1PGDG.rhel5.x86_64.rpm'
100%[===================================================================================================================>] 1,807,607   73.6K/s   in 30s
2015-06-16 20:45:24 (58.1 KB/s) - `postgresql94-9.4.4-1PGDG.rhel5.x86_64.rpm' saved [1807607/1807607]
[root@web103 pg]# wget http://yum.postgresql.org/9.4/redhat/rhel-5-x86_64/postgresql94-server-9.4.4-1PGDG.rhel5.x86_64.rpm
--2015-06-16 20:45:35--  http://yum.postgresql.org/9.4/redhat/rhel-5-x86_64/postgresql94-server-9.4.4-1PGDG.rhel5.x86_64.rpm
Resolving yum.postgresql.org... 174.143.35.196, 2001:4800:1501:1::196
Connecting to yum.postgresql.org|174.143.35.196|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 6175991 (5.9M) [application/x-redhat-package-manager]
Saving to: `postgresql94-server-9.4.4-1PGDG.rhel5.x86_64.rpm'
100%[===================================================================================================================>] 6,175,991   58.5K/s   in 2m 4s
2015-06-16 20:47:42 (48.6 KB/s) - `postgresql94-server-9.4.4-1PGDG.rhel5.x86_64.rpm' saved [6175991/6175991]
[root@web103 pg]# wget http://yum.postgresql.org/9.4/redhat/rhel-5-x86_64/postgresql94-contrib-9.4.4-1PGDG.rhel5.x86_64.rpm
--2015-06-16 20:47:51--  http://yum.postgresql.org/9.4/redhat/rhel-5-x86_64/postgresql94-contrib-9.4.4-1PGDG.rhel5.x86_64.rpm
Resolving yum.postgresql.org... 174.143.35.196, 2001:4800:1501:1::196
Connecting to yum.postgresql.org|174.143.35.196|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 664051 (648K) [application/x-redhat-package-manager]
Saving to: `postgresql94-contrib-9.4.4-1PGDG.rhel5.x86_64.rpm'
100%[===================================================================================================================>] 664,051     28.1K/s   in 53s
2015-06-16 20:48:46 (12.3 KB/s) - `postgresql94-contrib-9.4.4-1PGDG.rhel5.x86_64.rpm' saved [664051/664051]
[root@web103 pg]# wget http://yum.postgresql.org/9.4/redhat/rhel-5-x86_64/postgresql94-libs-9.4.4-1PGDG.rhel5.x86_64.rpm
--2015-06-16 20:51:10--  http://yum.postgresql.org/9.4/redhat/rhel-5-x86_64/postgresql94-libs-9.4.4-1PGDG.rhel5.x86_64.rpm
Resolving yum.postgresql.org... 174.143.35.196, 2001:4800:1501:1::196
Connecting to yum.postgresql.org|174.143.35.196|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 233206 (228K) [application/x-redhat-package-manager]
Saving to: `postgresql94-libs-9.4.4-1PGDG.rhel5.x86_64.rpm'
100%[===================================================================================================================>] 233,206     70.3K/s   in 3.2s
2015-06-16 20:51:16 (70.3 KB/s) - `postgresql94-libs-9.4.4-1PGDG.rhel5.x86_64.rpm' saved [233206/233206]
[root@web103 pg]# ls
postgresql94-9.4.4-1PGDG.rhel5.x86_64.rpm          postgresql94-libs-9.4.4-1PGDG.rhel5.x86_64.rpm
postgresql94-contrib-9.4.4-1PGDG.rhel5.x86_64.rpm  postgresql94-server-9.4.4-1PGDG.rhel5.x86_64.rpm

安装PostgreSQL rpm包

[root@web103 pg]# rpm -ivh *.rpm
warning: postgresql94-9.4.4-1PGDG.rhel5.x86_64.rpm: Header V3 DSA signature: NOKEY, key ID 442df0f8
Preparing...                ########################################### [100%]
   1:postgresql94-libs      ########################################### [ 25%]
   2:postgresql94           ########################################### [ 50%]
   3:postgresql94-contrib   ########################################### [ 75%]
   4:postgresql94-server    ########################################### [100%]

创建PostgreSQL 默认库

[root@web103 pg]# service postgresql-9.4 initdb
Initializing database: [  OK  ]
[root@web103 data]# pwd
/var/lib/pgsql/9.4/data
[root@web103 data]# ls -ltr
total 120
-rw------- 1 postgres postgres 21265 Jun 16 20:52 postgresql.conf
-rw------- 1 postgres postgres    88 Jun 16 20:52 postgresql.auto.conf
drwx------ 3 postgres postgres  4096 Jun 16 20:52 pg_xlog
-rw------- 1 postgres postgres     4 Jun 16 20:52 PG_VERSION
drwx------ 2 postgres postgres  4096 Jun 16 20:52 pg_twophase
drwx------ 2 postgres postgres  4096 Jun 16 20:52 pg_tblspc
drwx------ 2 postgres postgres  4096 Jun 16 20:52 pg_subtrans
drwx------ 2 postgres postgres  4096 Jun 16 20:52 pg_snapshots
drwx------ 2 postgres postgres  4096 Jun 16 20:52 pg_serial
drwx------ 2 postgres postgres  4096 Jun 16 20:52 pg_replslot
drwx------ 4 postgres postgres  4096 Jun 16 20:52 pg_multixact
drwx------ 4 postgres postgres  4096 Jun 16 20:52 pg_logical
-rw------- 1 postgres postgres  1636 Jun 16 20:52 pg_ident.conf
-rw------- 1 postgres postgres  4224 Jun 16 20:52 pg_hba.conf
drwx------ 2 postgres postgres  4096 Jun 16 20:52 pg_dynshmem
drwx------ 2 postgres postgres  4096 Jun 16 20:52 pg_clog
drwx------ 5 postgres postgres  4096 Jun 16 20:52 base
drwx------ 2 postgres postgres  4096 Jun 16 21:16 pg_log
drwx------ 2 postgres postgres  4096 Jun 16 21:16 global
-rw------- 1 postgres postgres    80 Jun 16 21:39 postmaster.pid
-rw------- 1 postgres postgres    59 Jun 16 21:39 postmaster.opts
drwx------ 2 postgres postgres  4096 Jun 16 21:39 pg_stat
drwx------ 2 postgres postgres  4096 Jun 16 21:39 pg_notify
drwx------ 2 postgres postgres  4096 Jun 16 22:00 pg_stat_tmp

另外还可以通过如下两种方式创建

initdb -D /var/lib/pgsql/9.4/data
pg_ctl -D /var/lib/pgsql/9.4/data

设置PostgreSQL 开机自动启动

[root@web103 pg]# chkconfig postgresql-9.4 on
[root@web103 pg]# chkconfig --list|grep post
postgresql-9.4  0:off   1:off   2:on    3:on    4:on    5:on    6:off

查看默认创建PostgreSQL 用户

[root@web103 data]# more /etc/passwd|grep post
postgres:x:26:26:PostgreSQL Server:/var/lib/pgsql:/bin/bash

增加PATH环境变量

-bash-3.2$echo "PATH=$PATH:/usr/pgsql-9.4/bin;export PATH" >>~/.bash_profile

启动PostgreSQL 数据库

--方法1
-bash-3.2$ postgres -D /var/lib/pgsql/9.4/data
----或者
-bash-3.2$ postgres -D /var/lib/pgsql/9.4/data >~/pg.log 2>&1 &
方法2
pg_ctl start -l ~/pg.log -D /var/lib/pgsql/9.4/data

这里如果在环境变量中配置了PGDATA,那-D也可以不指定,-l为指定日志目录,建议使用封装的方法2启动pg

查看PostgreSQL进程信息

[root@web103 data]# ps -ef|grep post|grep -v grep
postgres  4432     1  0 21:39 ?        00:00:00 /usr/pgsql-9.4/bin/postgres -D /var/lib/pgsql/9.4/data
postgres  4433  4432  0 21:39 ?        00:00:00 postgres: logger process
postgres  4435  4432  0 21:39 ?        00:00:00 postgres: checkpointer process
postgres  4436  4432  0 21:39 ?        00:00:00 postgres: writer process
postgres  4437  4432  0 21:39 ?        00:00:00 postgres: wal writer process
postgres  4438  4432  0 21:39 ?        00:00:00 postgres: autovacuum launcher process
postgres  4439  4432  0 21:39 ?        00:00:00 postgres: stats collector process

这里可以看到pg也和oracle有几分类似,有日志进程,checkpoint进程,有写进程等(具体以后分析)

停止PostgreSQL数据库

-bash-3.2$ pg_ctl stop -D /var/lib/pgsql/9.4/data
waiting for server to shut down.... done
server stopped
-bash-3.2$  ps -ef|grep post|grep -v grep
root      6036   499  0 22:07 pts/0    00:00:00 su - postgres
postgres  6037  6036  0 22:07 pts/0    00:00:00 -bash
postgres  6113  6037  0 22:08 pts/0    00:00:00 ps -ef

PostgreSQL默认监听端口
在后续章节中进一步讲解相关配置和访问

[root@web103 pgsql]# netstat -natp|grep postgres
tcp        0      0 127.0.0.1:5432              0.0.0.0:*                   LISTEN      4432/postgres

记录解决一次Listener状态为Not All Endpoints Registered的故障

客户反馈系统异常无法正常访问,检查发现监听异常

C:\Users\Administrator>crsctl status res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATA.dg
               ONLINE  ONLINE       rac2
ora.LISTENER.lsnr
               ONLINE  INTERMEDIATE rac2                     Not All Endpoints R
                                                             egistered
ora.asm
               ONLINE  ONLINE       rac2                     Started
ora.gsd
               OFFLINE OFFLINE      rac2
ora.net1.network
               ONLINE  ONLINE       rac2
ora.ons
               ONLINE  ONLINE       rac2
ora.registry.acfs
               ONLINE  ONLINE       rac2
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  INTERMEDIATE rac2                     Not All Endpoints R
                                                             egistered
ora.cvu
      1        ONLINE  ONLINE       rac2
ora.oc4j
      1        ONLINE  ONLINE       rac2
ora.rac.db
      1        ONLINE  ONLINE       rac2                     Open
      2        ONLINE  OFFLINE
ora.rac1.vip
      1        ONLINE  OFFLINE
ora.rac2.vip
      1        ONLINE  OFFLINE
ora.scan1.vip
      1        ONLINE  OFFLINE
C:\Users\Administrator>lsnrctl status
LSNRCTL for 64-bit Windows: Version 11.2.0.3.0 - Production on 12-6月 -2015 15:50:43
Copyright (c) 1991, 2011, Oracle.  All rights reserved.
正在连接到 (ADDRESS=(PROTOCOL=tcp)(HOST=)(PORT=1521))
LISTENER 的 STATUS
------------------------
别名                      LISTENER
版本                      TNSLSNR for 64-bit Windows: Version 11.2.0.3.0 - Production
启动日期                  12-6月 -2015 15:31:30
正常运行时间              0 天 0 小时 19 分 20 秒
跟踪级别                  off
安全性                    ON: Local OS Authentication
SNMP                      OFF
监听程序参数文件          D:\app\11.2.0\grid\network\admin\listener.ora
监听程序日志文件          D:\app\11.2.0\grid\log\diag\tnslsnr\rac2\listener\alert\log.xml
监听端点概要...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(PIPENAME=\\.\pipe\LISTENERipc)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=10.63.64.70)(PORT=1521)))
监听程序不支持服务
命令执行成功

通过这里可以看到LISTENER和LISTENER_SCAN1为Not All Endpoints Registered状态,而且这个RAC只有一个节点rac2,rac1节点未加入到集群中.进一步检查IP和hosts文件

C:\Users\Administrator>ipconfig -all
Windows IP 配置
   主机名  . . . . . . . . . . . . . : rac2
   主 DNS 后缀 . . . . . . . . . . . :
   节点类型  . . . . . . . . . . . . : 混合
   IP 路由已启用 . . . . . . . . . . : 否
   WINS 代理已启用 . . . . . . . . . : 否
以太网适配器 pub:
   连接特定的 DNS 后缀 . . . . . . . :
   描述. . . . . . . . . . . . . . . : Intel(R) 82576 Gigabit Dual Port Network Connection #2
   物理地址. . . . . . . . . . . . . : 00-25-90-5A-0F-47
   DHCP 已启用 . . . . . . . . . . . : 否
   自动配置已启用. . . . . . . . . . : 是
   本地链接 IPv6 地址. . . . . . . . : fe80::c5ef:663f:7333:45f2%12(首选)
   IPv4 地址 . . . . . . . . . . . . : 10.63.64.70(首选)
   子网掩码  . . . . . . . . . . . . : 255.255.255.192
   默认网关. . . . . . . . . . . . . : 10.63.64.126
   DHCPv6 IAID . . . . . . . . . . . : 301999504
   DHCPv6 客户端 DUID  . . . . . . . : 00-01-00-01-1A-5C-19-A1-00-25-90-5A-0F-46
   DNS 服务器  . . . . . . . . . . . : 218.30.19.40
   TCPIP 上的 NetBIOS  . . . . . . . : 已启用
以太网适配器 priv:
   连接特定的 DNS 后缀 . . . . . . . :
   描述. . . . . . . . . . . . . . . : Intel(R) 82576 Gigabit Dual Port Network Connection
   物理地址. . . . . . . . . . . . . : 00-25-90-5A-0F-46
   DHCP 已启用 . . . . . . . . . . . : 否
   自动配置已启用. . . . . . . . . . : 是
   本地链接 IPv6 地址. . . . . . . . : fe80::c88d:78ff:d2e8:bde1%11(首选)
   IPv4 地址 . . . . . . . . . . . . : 10.10.1.2(首选)
   子网掩码  . . . . . . . . . . . . : 255.255.255.0
   默认网关. . . . . . . . . . . . . :
   DHCPv6 IAID . . . . . . . . . . . : 234890640
   DHCPv6 客户端 DUID  . . . . . . . : 00-01-00-01-1A-5C-19-A1-00-25-90-5A-0F-46
   DNS 服务器  . . . . . . . . . . . : fec0:0:0:ffff::1%1
                                       fec0:0:0:ffff::2%1
                                       fec0:0:0:ffff::3%1
   TCPIP 上的 NetBIOS  . . . . . . . : 已启用
--hosts文件
10.63.64.69		rac1
10.63.64.70		rac2
10.63.64.71		rac1-vip
10.63.64.72		rac2-vip
10.63.64.73		scan-cluster
10.10.1.1		rac1-priv
10.10.1.2		rac2-priv

这里可以看到主机之上的pub网卡只有一个ip 10.63.64.70,不太符合我们对rac的理解(一般来说其上应该有vip,部分情况下甚至可能有scan ip),尝试ping vip和scan ip

C:\Users\Administrator>ping 10.63.64.72
正在 Ping 10.63.64.72 具有 32 字节的数据:
来自 10.63.64.72 的回复: 字节=32 时间<1ms TTL=128
来自 10.63.64.72 的回复: 字节=32 时间<1ms TTL=128
来自 10.63.64.72 的回复: 字节=32 时间<1ms TTL=128
10.63.64.72 的 Ping 统计信息:
    数据包: 已发送 = 3,已接收 = 3,丢失 = 0 (0% 丢失),
往返行程的估计时间(以毫秒为单位):
    最短 = 0ms,最长 = 0ms,平均 = 0ms
Control-C
^C
C:\Users\Administrator>ping 10.63.64.73
正在 Ping 10.63.64.73 具有 32 字节的数据:
来自 10.63.64.73 的回复: 字节=32 时间<1ms TTL=128
来自 10.63.64.73 的回复: 字节=32 时间<1ms TTL=128
来自 10.63.64.73 的回复: 字节=32 时间<1ms TTL=128
10.63.64.73 的 Ping 统计信息:
    数据包: 已发送 = 3,已接收 = 3,丢失 = 0 (0% 丢失),
往返行程的估计时间(以毫秒为单位):
    最短 = 0ms,最长 = 0ms,平均 = 0ms

这里发现一个异常问题:crs显示只有rac2在集群之中,而该主机ip中又不存在vip和scan ip属于异常情况,但是这两个ip又可以ping通,基于这样情况,我第一反应就是vip和scanip可能飘到rac1中了,而rac1又未正常加入到crs中(因为这个库以前处理过,由于rac1的hba卡有问题,数据库无法正常启动,crs起来也无法提供工作),检查rac1机器情况

C:\Users\Administrator>crsctl status res -t
CRS-4535: 无法与集群就绪服务通信
CRS-4000: 命令 Status 失败, 或已完成但出现错误。
C:\Users\Administrator>crsctl status res -t -init
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
      1        ONLINE  ONLINE       rac1                     Started
ora.crf
      1        ONLINE  ONLINE       rac1
ora.crsd
      1        ONLINE  OFFLINE
ora.cssd
      1        ONLINE  ONLINE       rac1
ora.cssdmonitor
      1        ONLINE  ONLINE       rac1
ora.ctssd
      1        ONLINE  ONLINE       rac1                     OBSERVER
ora.drivers.acfs
      1        ONLINE  ONLINE       rac1
ora.evmd
      1        ONLINE  ONLINE       rac1
ora.gipcd
      1        ONLINE  ONLINE       rac1
ora.gpnpd
      1        ONLINE  ONLINE       rac1
ora.mdnsd
      1        ONLINE  ONLINE       rac1
以太网适配器 pub:
   连接特定的 DNS 后缀 . . . . . . . :
   描述. . . . . . . . . . . . . . . : Intel(R) 82576 Gigabit Dual Port Network Connection
   物理地址. . . . . . . . . . . . . : 00-25-90-5A-0E-E7
   DHCP 已启用 . . . . . . . . . . . : 否
   自动配置已启用. . . . . . . . . . : 是
   本地链接 IPv6 地址. . . . . . . . : fe80::409d:8c2e:446b:af42%11(首选)
   IPv4 地址 . . . . . . . . . . . . : 10.63.64.69(首选)
   子网掩码  . . . . . . . . . . . . : 255.255.255.192
   IPv4 地址 . . . . . . . . . . . . : 10.63.64.71(首选)
   子网掩码  . . . . . . . . . . . . : 255.255.255.192
   IPv4 地址 . . . . . . . . . . . . : 10.63.64.72(首选)
   子网掩码  . . . . . . . . . . . . : 255.255.255.192
   IPv4 地址 . . . . . . . . . . . . : 10.63.64.73(首选)
   子网掩码  . . . . . . . . . . . . : 255.255.255.192
   默认网关. . . . . . . . . . . . . : 10.63.64.126
   DHCPv6 IAID . . . . . . . . . . . : 234890640
   DHCPv6 客户端 DUID  . . . . . . . : 00-01-00-01-1A-5C-19-0A-00-25-90-5A-0E-E7
   DNS 服务器  . . . . . . . . . . . : 8.8.8.8
   TCPIP 上的 NetBIOS  . . . . . . . : 已启用
以太网适配器 priv:
   连接特定的 DNS 后缀 . . . . . . . :
   描述. . . . . . . . . . . . . . . : Intel(R) 82576 Gigabit Dual Port Network Connection #2
   物理地址. . . . . . . . . . . . . : 00-25-90-5A-0E-E6
   DHCP 已启用 . . . . . . . . . . . : 否
   自动配置已启用. . . . . . . . . . : 是
   本地链接 IPv6 地址. . . . . . . . : fe80::154:dad7:f9e3:bea3%13(首选)
   IPv4 地址 . . . . . . . . . . . . : 10.10.1.1(首选)
   子网掩码  . . . . . . . . . . . . : 255.255.255.0
   默认网关. . . . . . . . . . . . . :
   DHCPv6 IAID . . . . . . . . . . . : 301999504
   DHCPv6 客户端 DUID  . . . . . . . : 00-01-00-01-1A-5C-19-0A-00-25-90-5A-0E-E7
   DNS 服务器  . . . . . . . . . . . : fec0:0:0:ffff::1%1
                                       fec0:0:0:ffff::2%1
                                       fec0:0:0:ffff::3%1
   TCPIP 上的 NetBIOS  . . . . . . . : 已启用

果然这里rac2的vip和scan ip都漂到rac1中,但是crs状态属于不正常情况,由于rac1无法正常使用,关闭该主机,并重启rac2(由于rac2处于异常情况无法正常工作),后续rac2恢复正常

C:\Users\Administrator>crsctl status res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATA.dg
               ONLINE  ONLINE       rac2
ora.LISTENER.lsnr
               ONLINE  ONLINE       rac2
ora.asm
               ONLINE  ONLINE       rac2                     Started
ora.gsd
               OFFLINE OFFLINE      rac2
ora.net1.network
               ONLINE  ONLINE       rac2
ora.ons
               ONLINE  ONLINE       rac2
ora.registry.acfs
               ONLINE  ONLINE       rac2
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       rac2
ora.cvu
      1        ONLINE  ONLINE       rac2
ora.oc4j
      1        ONLINE  ONLINE       rac2
ora.rac.db
      1        OFFLINE OFFLINE                               Instance Shutdown
      2        ONLINE  ONLINE       rac2                     Open
ora.rac1.vip
      1        ONLINE  INTERMEDIATE rac2                     FAILED OVER
ora.rac2.vip
      1        ONLINE  ONLINE       rac2
ora.scan1.vip
      1        ONLINE  ONLINE       rac2
C:\Users\Administrator>lsnrctl status
LSNRCTL for 64-bit Windows: Version 11.2.0.3.0 - Production on 12-6月 -2015 17:02:46
Copyright (c) 1991, 2011, Oracle.  All rights reserved.
正在连接到 (ADDRESS=(PROTOCOL=tcp)(HOST=)(PORT=1521))
LISTENER 的 STATUS
------------------------
别名                      LISTENER
版本                      TNSLSNR for 64-bit Windows: Version 11.2.0.3.0 - Production
启动日期                  12-6月 -2015 16:44:43
正常运行时间              0 天 0 小时 18 分 3 秒
跟踪级别                  off
安全性                    ON: Local OS Authentication
SNMP                      OFF
监听程序参数文件          D:\app\11.2.0\grid\network\admin\listener.ora
监听程序日志文件          D:\app\11.2.0\grid\log\diag\tnslsnr\rac2\listener\alert\log.xml
监听端点概要...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(PIPENAME=\\.\pipe\LISTENERipc)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=10.63.64.70)(PORT=1521)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=10.63.64.72)(PORT=1521)))
服务摘要..
服务 "+ASM" 包含 1 个实例。
  实例 "+asm2", 状态 READY, 包含此服务的 1 个处理程序...
服务 "rac" 包含 1 个实例。
  实例 "rac2", 状态 READY, 包含此服务的 1 个处理程序...
服务 "racXDB" 包含 1 个实例。
  实例 "rac2", 状态 READY, 包含此服务的 1 个处理程序...
命令执行成功

出现该问题的原因至此可以总结出来:由于rac1和rac2的集群处于异常状态,rac1持有了vip和scan ip,但是又未正常加入crs,导致rac2无法获得vip和scan ip,从而使得LISTENER和LISTENER_SCAN1为Not All Endpoints Registered状态.另外对于不能正常工作的集群节点,建议关闭crs,甚至可以考虑关闭主机,减少异常节点对正常节点的影响.关于该类问题的分析,可以从Scan Listener In INTERMEDIATE Mode Not All Endpoints Registered (Doc ID 1667873.1)中找到依据,证明是由于IP被占用导致.

ORA-00230: operation disallowed: snapshot control file enqueue unavailable

rman 备份控制文件报ORA-00230: operation disallowed: snapshot control file enqueue unavailable错误

db1:/home/oracle>$rman target /
Recovery Manager: Release 10.2.0.3.0 - Production on Wed Jun 10 16:00:08 2015
Copyright (c) 1982, 2005, Oracle.  All rights reserved.
connected to target database: ORCL (DBID=1150889877)
RMAN> backup current controlfile format '/tmp/xifenfei.ctl';
Starting backup at 10-JUN-15
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=628 instance=orcl1 devtype=DISK
channel ORA_DISK_1: starting full datafile backupset
channel ORA_DISK_1: specifying datafile(s) in backupset
waiting for snapshot control file enqueue
waiting for snapshot control file enqueue
waiting for snapshot control file enqueue
waiting for snapshot control file enqueue
waiting for snapshot control file enqueue
cannot make a snapshot control file
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03009: failure of backup command on ORA_DISK_1 channel at 06/10/2015 16:03:10
ORA-00230: operation disallowed: snapshot control file enqueue unavailable

查看持有CF enqueue会话

Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP and Data Mining options
SQL> SELECT s.SID, USERNAME AS "User", PROGRAM, MODULE,
  2  ACTION, LOGON_TIME "Logon"
  3  FROM V$SESSION s, V$ENQUEUE_LOCK l
  4  WHERE l.SID = s.SID
  5  AND l.TYPE = 'CF'
  6  AND l.ID1 = 0
  7  AND l.ID2 = 2;
       SID User
---------- ------------------------------
PROGRAM
------------------------------------------------
MODULE
------------------------------------------------
ACTION                           Logon
-------------------------------- ------------
       648 SYS
rman@db1 (TNS V1-V3)
backup full datafile
0000152 STARTED111               03-JUN-15

kill相关session

SQL> select spid from v$process where addr in(select paddr from v$session where sid=648);
SPID
------------
40108238
SQL> !ps -ef|grep 40108238
  oracle 39125244 65011720   0 15:59:27  pts/0  0:00 grep 40108238
  oracle 40108238        1   0   Jun 03      -  1:18 oracleorcl1 (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
db1:/home/oracle>$kill -9 40108238

再次测试备份控制文件–OK

db1:/home/oracle>$rman target /
Recovery Manager: Release 10.2.0.3.0 - Production on Wed Jun 10 16:05:06 2015
Copyright (c) 1982, 2005, Oracle.  All rights reserved.
connected to target database: ORCL (DBID=1150889877)
RMAN> backup current controlfile format '/tmp/xifenfei.ctl';
Starting backup at 10-JUN-15
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=632 instance=orcl1 devtype=DISK
channel ORA_DISK_1: starting full datafile backupset
channel ORA_DISK_1: specifying datafile(s) in backupset
including current control file in backupset
channel ORA_DISK_1: starting piece 1 at 10-JUN-15
channel ORA_DISK_1: finished piece 1 at 10-JUN-15
piece handle=/tmp/xifenfei.ctl tag=TAG20150610T160516 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:02
Finished backup at 10-JUN-15

ORA-21561: OID generation failed故障解决

数据库无法登陆报ORA-21561: OID generation failed错误

[oracle@essc ~]$ sqlplus XIFENFEI/"www.xifenfei.com"@172.16.50.200/orcl
SQL*Plus: Release 11.2.0.4.0 Production on Mon Jun 1 16:52:29 2015
Copyright (c) 1982, 2013, Oracle.  All rights reserved.
ERROR:
ORA-21561: OID generation failed
Enter user-name:
ERROR:
ORA-12545: Connect failed because target host or object does not exist
Enter user-name:
ERROR:
ORA-12545: Connect failed because target host or object does not exist
SP2-0157: unable to CONNECT to ORACLE after 3 attempts, exiting SQL*Plus

主机名无法ping通

[root@essc ~]# ping essc
ping: unknown host essc
[root@essc ~]# hostname
essc
[root@essc ~]# more /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
172.16.10.11 app1
[oracle@essc ~]$ ifconfig
eth3      Link encap:Ethernet  HWaddr 00:50:56:BB:00:6B
          inet addr:172.16.10.30  Bcast:172.16.10.255  Mask:255.255.255.0
          inet6 addr: fe80::250:56ff:febb:6b/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:9597 errors:0 dropped:0 overruns:0 frame:0
          TX packets:4018 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:1549999 (1.4 MiB)  TX bytes:470158 (459.1 KiB)
lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:164 errors:0 dropped:0 overruns:0 frame:0
          TX packets:164 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:11424 (11.1 KiB)  TX bytes:11424 (11.1 KiB)

修改hosts文件
让hosts中含主机名,也就是为了主机名能够ping通

[oracle@essc ~]$ more /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
172.16.10.11 app1
172.16.10.30 essc

数据库登录测试

[oracle@essc ~]$ sqlplus XIFENFEI/"www.xifenfei.com"@172.16.50.200/orcl
SQL*Plus: Release 11.2.0.4.0 Production on Mon Jun 1 16:56:39 2015
Copyright (c) 1982, 2013, Oracle.  All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
SQL>
SQL> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

具体见官方说明
Ora-21561: OID Generation Failed (Doc ID 1335327.1)

APPLIES TO:
Oracle Net Services - Version 9.2.0.8 to 11.2.0.3 [Release 9.2 to 11.2]
Information in this document applies to any platform.
***Checked for relevance on 14-Jan-2013***
SYMPTOMS
When attempting to connect to the database using SQL*Plus or DBCA,
the following error occurs intermittently:
ERROR
-----------------------
ORA-21561: OID generation failed
CAUSE
This could be caused by not having the host name for the
target database fully qualified in the hosts file.
To verify if you are hitting this issue, the following symptoms should be met:
- ORA-21561: OID generation failed.
- Hosts file has un-fully qualified entry for the target database host:
127.0.0.1 loopback localhost # loopback (lo0) name/address
10.210.9.111 dbhost
In this sample, dbhost is the target db host.
This is reported in an unpublished Bug 12597261:
"ORA-21561 IF HOSTNAME ENVIRONMENT VARIABLE IS NOT FULLY QUALIFIED",
 which should be resolved as of 12G.
SOLUTION
Modify the hosts file to have the fully qualified host names,
by adding the fully qualified domain name to the entry.
127.0.0.1 loopback localhost # loopback (lo0) name/address
10.210.9.111 dbhost.sample.com
Verify that other environment and service handles are properly defined as well.
If this is a Windows environment, please check :
Windows: Connections Fail with ORA-12640 or ORA-21561 (Doc ID 744125.1)

再次建议:修改主机名请修改完全,具体参考:linux上安装oracle10g注意事项中修改主机名部分

sql profile改变hint的执行计划以及coe脚本实现sql id中无PLAN HASH VALUE的sql profile生成

在sql profile中使用的过程中,有以下几个问题,这里通过测试确认了几个问题:
1.coe_xfr_sql_profile可以执行sql_id中无PLAN HASH VALUE的执行计划
2.在sql使用过程中,sql profile是否会覆盖hint,通过测试证明sqlprofile可以覆盖hint的执行计划
3.coe_load_sql_profile可以使用于通过修改hint(改变sql id,然后通过指定两次不同的sql id实现sql profile固定hint的sql的执行计划)

测试sqlprofile会影响hint

SQL> create table t_xifenfei as select object_id,object_name from user_objects;
Table created.
SQL> create index idx_t_xifenfei_id on t_xifenfei(OBJECT_ID);
Index created.
SQL> create index idx_t_xifenfei2_id on t_xifenfei(OBJECT_ID,1);
Index created.
---使用hint等方式确定三种方式执行计划
SQL> SET LINES 150
SQL> SET AUTOT  ON
SQL> SET PAGES 150
SQL> select * from t_xifenfei t where OBJECT_ID=10;    <---默认使用IDX_T_XIFENFEI_ID index
no rows selected
Execution Plan
----------------------------------------------------------
Plan hash value: 308895000
-------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name              | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                   |     1 |    79 |     1   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T_XIFENFEI        |     1 |    79 |     1   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | IDX_T_XIFENFEI_ID |     1 |       |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("OBJECT_ID"=10)
Note
-----
   - dynamic sampling used for this statement (level=2)
Statistics
----------------------------------------------------------
          6  recursive calls
          0  db block gets
         13  consistent gets
          1  physical reads
          0  redo size
        410  bytes sent via SQL*Net to client
        513  bytes received via SQL*Net from client
          1  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          0  rows processed
SQL> SET AUTOT OFF
SQL> SET LINES 150
SQL> SET AUTOT  ON
SQL> SET PAGES 150
SQL> select /*+ INDEX(T idx_t_xifenfei2_id)*/* from t_xifenfei t where OBJECT_ID=10;
  <---指定使用idx_t_xifenfei2_id index
no rows selected
Execution Plan
----------------------------------------------------------
Plan hash value: 2143066642
--------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                    |     1 |    79 |     1   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T_XIFENFEI         |     1 |    79 |     1   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | IDX_T_XIFENFEI2_ID |     1 |       |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("OBJECT_ID"=10)
Note
-----
   - dynamic sampling used for this statement (level=2)
Statistics
----------------------------------------------------------
          4  recursive calls
          0  db block gets
         11  consistent gets
          1  physical reads
          0  redo size
        410  bytes sent via SQL*Net to client
        513  bytes received via SQL*Net from client
          1  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          0  rows processed
SQL> SET AUTOT OFF
SQL> SET LINES 150
SQL> SET AUTOT  ON
SQL> SET PAGES 150
SQL> select /*+ INDEX(T idx_t_xifenfei_id)*/* from t_xifenfei t where OBJECT_ID=10;
  <---指定使用idx_t_xifenfei1_id index
no rows selected
Execution Plan
----------------------------------------------------------
Plan hash value: 308895000
-------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name              | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                   |     1 |    79 |     1   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T_XIFENFEI        |     1 |    79 |     1   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | IDX_T_XIFENFEI_ID |     1 |       |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("OBJECT_ID"=10)
Note
-----
   - dynamic sampling used for this statement (level=2)
Statistics
----------------------------------------------------------
          4  recursive calls
          0  db block gets
         11  consistent gets
          0  physical reads
          0  redo size
        410  bytes sent via SQL*Net to client
        513  bytes received via SQL*Net from client
          1  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          0  rows processed
SQL> SET AUTOT OFF
--查询三种情况下sql_id
SQL> col SQL_TEXT for a50
SQL> SELECT SQL_TEXT,SQL_ID FROM V$SQL WHERE sql_text like 'select%OBJECT_ID=10';
SQL_TEXT                                           SQL_ID
-------------------------------------------------- -------------
select /*+ INDEX(T idx_t_xifenfei2_id)*/* from t_x 5291sfrd2p35y
ifenfei t where OBJECT_ID=10
select /*+ INDEX(T idx_t_xifenfei_id)*/* from t_xi 143q33ff4f06w
fenfei t where OBJECT_ID=10
select * from t_xifenfei t where OBJECT_ID=10      b5zuac0zqm9nw
--使用sqlprofile固定其他两个未使用index IDX_T_XIFENFEI2_ID的sql使用该索引
SQL> DECLARE
  2   SQL_FTEXT CLOB;
  3   BEGIN
  4   SELECT SQL_FULLTEXT INTO SQL_FTEXT FROM V$SQLAREA WHERE SQL_ID = 'b5zuac0zqm9nw';
  5   DBMS_SQLTUNE.IMPORT_SQL_PROFILE(
  6     SQL_TEXT => SQL_FTEXT,
  7     PROFILE => SQLPROF_ATTR('INDEX(@"SEL$1" "T"@"SEL$1" "IDX_T_XIFENFEI2_ID")'),
  8     NAME => 'PROFILE_b5zuac0zqm9nw',
  9     REPLACE => TRUE,
  10    FORCE_MATCH => TRUE
  11  );
 12   END;
 13   /
PL/SQL procedure successfully completed.
SQL>
SQL> DECLARE
  2    SQL_FTEXT CLOB;
  3   BEGIN
  4   SELECT SQL_FULLTEXT INTO SQL_FTEXT FROM V$SQLAREA WHERE SQL_ID = '143q33ff4f06w';
  5   DBMS_SQLTUNE.IMPORT_SQL_PROFILE(
  6     SQL_TEXT => SQL_FTEXT,
  7     PROFILE => SQLPROF_ATTR('INDEX(@"SEL$1" "T"@"SEL$1" "IDX_T_XIFENFEI2_ID")'),
  8      NAME => 'PROFILE_143q33ff4f06w',
  9     REPLACE => TRUE,
 10     FORCE_MATCH => TRUE
 11   );
 12   END;
13   /
PL/SQL procedure successfully completed.
--验证查询效果
SQL> SET LINES 150
SQL> SET AUTOT  ON
SQL> SET PAGES 150
SQL> select * from t_xifenfei t where OBJECT_ID=10;
no rows selected
Execution Plan
----------------------------------------------------------
Plan hash value: 2143066642
--------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                    |     1 |    79 |     1   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T_XIFENFEI         |     1 |    79 |     1   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | IDX_T_XIFENFEI2_ID |     1 |       |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("OBJECT_ID"=10)
Note
-----
   - dynamic sampling used for this statement (level=2)
   - SQL profile "PROFILE_b5zuac0zqm9nw" used for this statement  <--使用sql profile
Statistics
----------------------------------------------------------
         37  recursive calls
          0  db block gets
         23  consistent gets
          1  physical reads
          0  redo size
        410  bytes sent via SQL*Net to client
        513  bytes received via SQL*Net from client
          1  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
          0  rows processed
SQL> select /*+ INDEX(T idx_t_xifenfei_id)*/* from t_xifenfei t where OBJECT_ID=10;
no rows selected
Execution Plan
----------------------------------------------------------
Plan hash value: 2143066642
--------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                    |     1 |    79 |     1   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T_XIFENFEI         |     1 |    79 |     1   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | IDX_T_XIFENFEI2_ID |     1 |       |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("OBJECT_ID"=10)
Note
-----
   - dynamic sampling used for this statement (level=2)
   - SQL profile "PROFILE_143q33ff4f06w" used for this statement
<--使用sql profile,hint未被正常使用,证明sql profile影响hint,使得sql使用sql profile而不hint
Statistics
----------------------------------------------------------
         10  recursive calls
          0  db block gets
         16  consistent gets
          1  physical reads
          0  redo size
        410  bytes sent via SQL*Net to client
        513  bytes received via SQL*Net from client
          1  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          0  rows processed

这里可以知道,在有sqlprofile的情况下,可以影响以前的hint提示,使得hint无效,继续使用sql profile,感谢北京–weejar的试验证明

使用coe_load_sql_profile方式指定修改sql后的执行计划

SQL> @/tmp/coe_load_sql_profile.sql
Parameter 1:
ORIGINAL_SQL_ID (required)
Enter value for 1: b5zuac0zqm9nw
Parameter 2:
MODIFIED_SQL_ID (required)
Enter value for 2: 5291sfrd2p35y
     PLAN_HASH_VALUE          AVG_ET_SECS
-------------------- --------------------
          2143066642                 .004
Parameter 3:
PLAN_HASH_VALUE (required)
Enter value for 3: 2143066642
Values passed to coe_load_sql_profile:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ORIGINAL_SQL_ID: "b5zuac0zqm9nw"
MODIFIED_SQL_ID: "5291sfrd2p35y"
PLAN_HASH_VALUE: "2143066642"
SQL>BEGIN
  2    IF :sql_text IS NULL THEN
  3      RAISE_APPLICATION_ERROR(-20100, 'SQL_TEXT for original SQL_ID &&original_sql_id.
       was not found in memory (gv$sqltext_with_newlines) or AWR (dba_hist_sqltext).');
  4    END IF;
  5  END;
  6  /
SQL>SET TERM OFF;
SQL>BEGIN
  2    IF :other_xml IS NULL THEN
  3      RAISE_APPLICATION_ERROR(-20101, 'PLAN for modified SQL_ID &&modified_sql_id. and PHV &&plan_hash_value.
         was not found in memory (gv$sql_plan) or AWR (dba_hist_sql_plan).');
  4    END IF;
  5  END;
  6  /
SQL>
SQL>SET ECHO OFF;
0001 BEGIN_OUTLINE_DATA
0002 IGNORE_OPTIM_EMBEDDED_HINTS
0003 OPTIMIZER_FEATURES_ENABLE('11.2.0.4')
0004 DB_VERSION('11.2.0.4')
0005 ALL_ROWS
0006 OUTLINE_LEAF(@"SEL$1")
0007 INDEX_RS_ASC(@"SEL$1" "T"@"SEL$1" "IDX_T_XIFENFEI2_ID")
0008 END_OUTLINE_DATA
dropping staging table "STGTAB_SQLPROF_B5ZUAC0ZQM9NW"
creating staging table "STGTAB_SQLPROF_B5ZUAC0ZQM9NW"
packaging new sql profile into staging table "STGTAB_SQLPROF_B5ZUAC0ZQM9NW"
PROFILE_NAME
------------------------------
B5ZUAC0ZQM9NW_2143066642
SQL>REM
SQL>REM SQL Profile
SQL>REM ~~~~~~~~~~~
SQL>REM
SQL>SELECT signature, name, category, type, status
  2    FROM dba_sql_profiles WHERE name = :name;
           SIGNATURE NAME                           CATEGORY                       TYPE    STATUS
-------------------- ------------------------------ ------------------------------ ------- --------
 6715790053022671751 B5ZUAC0ZQM9NW_2143066642       DEFAULT                        MANUAL  ENABLED
SQL>SET ECHO OFF;
****************************************************************************
* Enter CHF password to export staging table STGTAB_SQLPROF_b5zuac0zqm9nw
****************************************************************************
Export: Release 11.2.0.4.0 - Production on Mon Jun 1 00:10:11 2015
Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
Password:
Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Export done in US7ASCII character set and AL16UTF16 NCHAR character set
server uses ZHS16GBK character set (possible charset conversion)
Note: grants on tables/views/sequences/roles will not be exported
Note: indexes on tables will not be exported
Note: constraints on tables will not be exported
About to export specified tables via Conventional Path ...
. . exporting table   STGTAB_SQLPROF_B5ZUAC0ZQM9NW          1 rows exported
Export terminated successfully without warnings.
If you need to implement this Custom SQL Profile on a similar system,
import and unpack using these commands:
imp CHF file=STGTAB_SQLPROF_b5zuac0zqm9nw.dmp tables=STGTAB_SQLPROF_b5zuac0zqm9nw ignore=Y
BEGIN
DBMS_SQLTUNE.UNPACK_STGTAB_SQLPROF (
profile_name => 'B5ZUAC0ZQM9NW_2143066642',
replace => TRUE,
staging_table_name => 'STGTAB_SQLPROF_b5zuac0zqm9nw',
staging_schema_owner => 'CHF' );
END;
/
updating: coe_load_sql_profile_b5zuac0zqm9nw.log (deflated 76%)
updating: STGTAB_SQLPROF_b5zuac0zqm9nw.dmp (deflated 89%)
  adding: coe_load_sql_profile.log (deflated 62%)
deleting: coe_load_sql_profile.log
coe_load_sql_profile completed.
SQL>SET LINES 150
SQL>SET AUTOT  ON
SQL>SET PAGES 150
select * from t_xifenfei t where OBJECT_ID=10;SQL>
no rows selected
Execution Plan
----------------------------------------------------------
Plan hash value: 2143066642
--------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                    |     6 |   474 |     3   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T_XIFENFEI         |     6 |   474 |     3   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | IDX_T_XIFENFEI2_ID |     2 |       |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("OBJECT_ID"=10)
Note
-----
   - SQL profile "B5ZUAC0ZQM9NW_2143066642" used for this statement
<------sql直接使用coe_load_sql_profile固定执行计划成功
Statistics
----------------------------------------------------------
          7  recursive calls
          0  db block gets
          7  consistent gets
          1  physical reads
          0  redo size
        410  bytes sent via SQL*Net to client
        513  bytes received via SQL*Net from client
          1  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          0  rows processed

使用coe_load_sql_profile也可以sql_id中没有PLAN HASH VALUE的执行计划,另外还可以实现直接把sqlprofile直接迁移到其他库中

coe_xfr_sql_profile固定没有sql_id没有直接PLAN HASH VALUE的执行计划

SQL> @/tmp/coe_xfr_sql_profile
Parameter 1:
SQL_ID (required)
Enter value for 1: b5zuac0zqm9nw
PLAN_HASH_VALUE AVG_ET_SECS
--------------- -----------
      308895000        .005
Parameter 2:
PLAN_HASH_VALUE (required)
Enter value for 2: 2143066642     <---该PLAN_HASH_VALUE不存在该sql_id对应的PLAN_HASH_VALUE中
Values passed to coe_xfr_sql_profile:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
SQL_ID         : "b5zuac0zqm9nw"
PLAN_HASH_VALUE: "2143066642"
SQL>BEGIN
  2    IF :sql_text IS NULL THEN
  3      RAISE_APPLICATION_ERROR(-20100, 'SQL_TEXT for SQL_ID &&sql_id. was not
      found in memory (gv$sqltext_with_newlines) or AWR (dba_hist_sqltext).');
  4    END IF;
  5  END;
  6  /
SQL>SET TERM OFF;
SQL>BEGIN
  2    IF :other_xml IS NULL THEN
  3      RAISE_APPLICATION_ERROR(-20101, 'PLAN for SQL_ID &&sql_id. and PHV &&plan_hash_value.
      was not found in memory (gv$sql_plan) or AWR (dba_hist_sql_plan).');
  4    END IF;
  5  END;
  6  /
SQL>SET TERM OFF;
Execute coe_xfr_sql_profile_b5zuac0zqm9nw_2143066642.sql
on TARGET system in order to create a custom SQL Profile
with plan 2143066642 linked to adjusted sql_text.
COE_XFR_SQL_PROFILE completed.
SQL>@coe_xfr_sql_profile_b5zuac0zqm9nw_2143066642.sql
SQL>REM
SQL>REM $Header: 215187.1 coe_xfr_sql_profile_b5zuac0zqm9nw_2143066642.sql 11.4.4.4 2015/06/01 carlos.sierra $
SQL>REM
SQL>REM Copyright (c) 2000-2012, Oracle Corporation. All rights reserved.
SQL>REM
SQL>REM AUTHOR
SQL>REM   carlos.sierra@oracle.com
SQL>REM
SQL>REM SCRIPT
SQL>REM   coe_xfr_sql_profile_b5zuac0zqm9nw_2143066642.sql
SQL>REM
SQL>REM DESCRIPTION
SQL>REM   This script is generated by coe_xfr_sql_profile.sql
SQL>REM   It contains the SQL*Plus commands to create a custom
SQL>REM   SQL Profile for SQL_ID b5zuac0zqm9nw based on plan hash
SQL>REM   value 2143066642.
SQL>REM   The custom SQL Profile to be created by this script
SQL>REM   will affect plans for SQL commands with signature
SQL>REM   matching the one for SQL Text below.
SQL>REM   Review SQL Text and adjust accordingly.
SQL>REM
SQL>REM PARAMETERS
SQL>REM   None.
SQL>REM
SQL>REM EXAMPLE
SQL>REM   SQL> START coe_xfr_sql_profile_b5zuac0zqm9nw_2143066642.sql;
SQL>REM
SQL>REM NOTES
SQL>REM   1. Should be run as SYSTEM or SYSDBA.
SQL>REM   2. User must have CREATE ANY SQL PROFILE privilege.
SQL>REM   3. SOURCE and TARGET systems can be the same or similar.
SQL>REM   4. To drop this custom SQL Profile after it has been created:
SQL>REM  EXEC DBMS_SQLTUNE.DROP_SQL_PROFILE('coe_b5zuac0zqm9nw_2143066642');
SQL>REM   5. Be aware that using DBMS_SQLTUNE requires a license
SQL>REM  for the Oracle Tuning Pack.
SQL>REM   6. If you modified a SQL putting Hints in order to produce a desired
SQL>REM  Plan, you can remove the artifical Hints from SQL Text pieces below.
SQL>REM  By doing so you can create a custom SQL Profile for the original
SQL>REM  SQL but with the Plan captured from the modified SQL (with Hints).
SQL>REM
SQL>WHENEVER SQLERROR EXIT SQL.SQLCODE;
SQL>REM
SQL>VAR signature NUMBER;
SQL>VAR signaturef NUMBER;
SQL>REM
SQL>DECLARE
  2  sql_txt CLOB;
  3  h       SYS.SQLPROF_ATTR;
  4  PROCEDURE wa (p_line IN VARCHAR2) IS
  5  BEGIN
  6  DBMS_LOB.WRITEAPPEND(sql_txt, LENGTH(p_line), p_line);
  7  END wa;
  8  BEGIN
  9  DBMS_LOB.CREATETEMPORARY(sql_txt, TRUE);
 10  DBMS_LOB.OPEN(sql_txt, DBMS_LOB.LOB_READWRITE);
 11  -- SQL Text pieces below do not have to be of same length.
 12  -- So if you edit SQL Text (i.e. removing temporary Hints),
 13  -- there is no need to edit or re-align unmodified pieces.
 14  wa(q'[select * from t_xifenfei t where OBJECT_ID=10]');
 15  DBMS_LOB.CLOSE(sql_txt);
 16  h := SYS.SQLPROF_ATTR(
 17  q'[BEGIN_OUTLINE_DATA]',
 18  q'[IGNORE_OPTIM_EMBEDDED_HINTS]',
 19  q'[OPTIMIZER_FEATURES_ENABLE('11.2.0.4')]',
 20  q'[DB_VERSION('11.2.0.4')]',
 21  q'[ALL_ROWS]',
 22  q'[OUTLINE_LEAF(@"SEL$1")]',
 23  q'[INDEX_RS_ASC(@"SEL$1" "T"@"SEL$1" "IDX_T_XIFENFEI2_ID")]',
 24  q'[END_OUTLINE_DATA]');
 25  :signature := DBMS_SQLTUNE.SQLTEXT_TO_SIGNATURE(sql_txt);
 26  :signaturef := DBMS_SQLTUNE.SQLTEXT_TO_SIGNATURE(sql_txt, TRUE);
 27  DBMS_SQLTUNE.IMPORT_SQL_PROFILE (
 28  sql_text    => sql_txt,
 29  profile     => h,
 30  name        => 'coe_b5zuac0zqm9nw_2143066642',
 31  description => 'coe b5zuac0zqm9nw 2143066642 '||:signature||' '||:signaturef||'',
 32  category    => 'DEFAULT',
 33  validate    => TRUE,
 34  replace     => TRUE,
 35  force_match => FALSE /* TRUE:FORCE (match even when different literals in SQL).
     FALSE:EXACT (similar to CURSOR_SHARING) */ );
 36  DBMS_LOB.FREETEMPORARY(sql_txt);
 37  END;
 38  /
PL/SQL procedure successfully completed.
SQL>WHENEVER SQLERROR CONTINUE
SQL>SET ECHO OFF;
            SIGNATURE
---------------------
  6715790053022671751
           SIGNATUREF
---------------------
   445801536248906164
... manual custom SQL Profile has been created
COE_XFR_SQL_PROFILE_b5zuac0zqm9nw_2143066642 completed
SQL>set autot on
SQL>set lines 150
SQL>set pages 150
SQL>select * from t_xifenfei t where OBJECT_ID=10;
no rows selected
Execution Plan
----------------------------------------------------------
Plan hash value: 2143066642
--------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                    |     6 |   474 |     3   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T_XIFENFEI         |     6 |   474 |     3   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | IDX_T_XIFENFEI2_ID |     2 |       |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("OBJECT_ID"=10)
Note
-----
   - SQL profile "coe_b5zuac0zqm9nw_2143066642" used for this statement
Statistics
----------------------------------------------------------
          7  recursive calls
          0  db block gets
          7  consistent gets
          1  physical reads
          0  redo size
        410  bytes sent via SQL*Net to client
        513  bytes received via SQL*Net from client
          1  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          0  rows processed
SQL>

通过验证,证明在sql_id中没有对应的PLAN HASH VALUE之时,也可以通过coe_xfr_sql_profile指定PLAN HASH VALUE来固定某个sql_id的执行计划.
以前写过相关关于sql profile的文章:sql profile 使用,执行计划改变导致数据库负载过高