Building Data Guard with RMAN FROM SERVICE

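Author: 惜分飞 © All rights reserved [No reproduction in any form without the author's consent; the author reserves the right to pursue legal liability otherwise.]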

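Starting with Oracle 12c, RMAN can build a Data Guard standby over the network with the FROM SERVICE clause. This test uses 19c, the long-term support release of the 12c family, with the latest patch applied.
Configure the Data Guard parameters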

alter system set db_unique_name='XIFENFEI' scope=spfile;
alter system set service_names='XIFENFEI';
alter system set log_archive_config='dg_config=(XIFENFEI,XIFENFEIDG)';
alter system set log_archive_dest_1='LOCATION=USE_DB_RECOVERY_FILE_DEST valid_for=(all_logfiles,all_roles) db_unique_name=XIFENFEI';
alter system set log_archive_dest_2='service=XIFENFEIDG lgwr async valid_for=(online_logfiles,primary_role) db_unique_name=XIFENFEIDG';
alter system set standby_file_management=auto;
alter system set db_file_name_convert='/u01/app/oracle/oradata/XIFENFEI/','/u01/app/oracle/oradata/XIFENFEI/' scope=spfile;
alter system set log_file_name_convert='/u01/app/oracle/oradata/XIFENFEI/','/u01/app/oracle/oradata/XIFENFEI/' scope=spfile;
alter system set fal_server=XIFENFEIDG;
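
Before moving on, it can help to confirm what the primary actually has in effect. The following is a minimal sketch, not part of the original procedure: it only reads standard dynamic views, and it assumes you are connected to the primary as SYSDBA. The standby redo log check is an addition of mine; the parameter list above does not create any.

```sql
-- Run on the primary as SYSDBA. Read-only checks; nothing here changes the database.

-- Current values of the dynamically set Data Guard parameters.
-- (Parameters set with scope=spfile show their new value only after a restart.)
select name, value
  from v$parameter
 where name in ('log_archive_config',
                'log_archive_dest_1',
                'log_archive_dest_2',
                'standby_file_management',
                'fal_server');

-- Archivelog mode and force logging, which a physical standby normally relies on.
select log_mode, force_logging from v$database;

-- Standby redo logs, if any have been created (used by real-time apply).
select group#, thread#, bytes/1024/1024 as size_mb, status
  from v$standby_log;
```

If the last query returns no rows, real-time apply on the standby will have no standby redo log to read from.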

Configure tnsnames.ora

XIFENFEI =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.0.238)(PORT = 1521))
    )
    (CONNECT_DATA =
      (SERVICE_NAME = XIFENFEI)
    )
  )

XIFENFEIDG =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.0.124)(PORT = 1521))
    )
    (CONNECT_DATA =
      (SERVICE_NAME = XIFENFEI)
    )
  )

Copy the primary's password file to the standby

[oracle@primary ~]$ scp $ORACLE_HOME/dbs/orapwXIFENFEI 192.168.0.124:$ORACLE_HOME/dbs/
The authenticity of host '192.168.0.124 (192.168.0.124)' can't be established.
ECDSA key fingerprint is SHA256:NI2952z4Bqc3M/B+AK7EJRiJNauROIyluvu1l4NSTX0.
ECDSA key fingerprint is MD5:1d:64:dd:ef:1c:ad:ed:cf:70:22:2d:4d:7c:90:5e:5e.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '192.168.0.124' (ECDSA) to the list of known hosts.
oracle@192.168.0.124's password: 
orapwXIFENFEI                                                                   100% 2048     6.6MB/s   00:00    
[oracle@primary ~]$ 

Start the standby in NOMOUNT

[oracle@standby ~]$ sqlplus / as sysdba

SQL*Plus: Release 19.0.0.0.0 - Production on Sat Mar 13 20:32:34 2021
Version 19.10.0.0.0

Copyright (c) 1982, 2020, Oracle.  All rights reserved.

Connected to an idle instance.

SQL> create spfile from pfile='/tmp/pfile';

File created.

SQL> startup nomount pfile='/tmp/pfile'
ORACLE instance started.

Total System Global Area 4294963264 bytes
Fixed Size                  8904768 bytes
Variable Size             805306368 bytes
Database Buffers         3472883712 bytes
Redo Buffers                7868416 bytes
SQL> exit
Disconnected from Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.10.0.0.0

Use RMAN FROM SERVICE to create the standby control file and restore the datafiles

[oracle@standby ~]$ rman target /

Recovery Manager: Release 19.0.0.0.0 - Production on Sat Mar 13 20:34:37 2021
Version 19.10.0.0.0

Copyright (c) 1982, 2019, Oracle and/or its affiliates.  All rights reserved.

connected to target database: XIFENFEI (not mounted)

RMAN> restore standby controlfile from service XIFENFEI;

Starting restore at 13-MAR-21
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=9 device type=DISK

channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: using network backup set from service XIFENFEI
channel ORA_DISK_1: restoring control file
channel ORA_DISK_1: restore complete, elapsed time: 00:00:01
output file name=/u01/app/oracle/oradata/XIFENFEI/control01.ctl
output file name=/u01/app/oracle/fast_recovery_area/XIFENFEI/control02.ctl
Finished restore at 13-MAR-21

RMAN> alter database mount;

released channel: ORA_DISK_1
Statement processed

RMAN> restore database from service XIFENFEI;

Starting restore at 13-MAR-21
Starting implicit crosscheck backup at 13-MAR-21
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=12 device type=DISK
Finished implicit crosscheck backup at 13-MAR-21

Starting implicit crosscheck copy at 13-MAR-21
using channel ORA_DISK_1
Finished implicit crosscheck copy at 13-MAR-21

searching for all files in the recovery area
cataloging files...
no files cataloged

using channel ORA_DISK_1

channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: using network backup set from service XIFENFEI
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring datafile 00001 to /u01/app/oracle/oradata/XIFENFEI/system01.dbf
channel ORA_DISK_1: restore complete, elapsed time: 00:00:07
channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: using network backup set from service XIFENFEI
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring datafile 00003 to /u01/app/oracle/oradata/XIFENFEI/sysaux01.dbf
channel ORA_DISK_1: restore complete, elapsed time: 00:00:07
channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: using network backup set from service XIFENFEI
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring datafile 00004 to /u01/app/oracle/oradata/XIFENFEI/undotbs01.dbf
channel ORA_DISK_1: restore complete, elapsed time: 00:00:07
channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: using network backup set from service XIFENFEI
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring datafile 00007 to /u01/app/oracle/oradata/XIFENFEI/users01.dbf
channel ORA_DISK_1: restore complete, elapsed time: 00:00:01
Finished restore at 13-MAR-21

Start the MRP process on the standby
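
The command used at this step did not survive in the post. The alert log shown next ("Managed Standby Recovery starting Real Time Apply") is consistent with the standard statement below, so treat this as a likely reconstruction rather than the author's exact input. It assumes the standby is mounted and the session is connected as SYSDBA.

```sql
-- Run on the mounted standby as SYSDBA.
-- Starts managed recovery in the background (MRP0) and returns control to the session.
-- On 12c and later, real-time apply is used by default when standby redo logs exist.
alter database recover managed standby database disconnect from session;
```
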
2021-03-13T20:54:08.075418+08:00
Attempt to start background Managed Standby Recovery process (XIFENFEI)
Starting background process MRP0
2021-03-13T20:54:08.086269+08:00
MRP0 started with pid=56, OS id=8182 
2021-03-13T20:54:08.087276+08:00
Background Managed Standby Recovery process started (XIFENFEI)
2021-03-13T20:54:13.104757+08:00
 Started logmerger process
2021-03-13T20:54:13.112058+08:00

IM on ADG: Start of Empty Journal 

IM on ADG: End of Empty Journal 
PR00 (PID:8188): Managed Standby Recovery starting Real Time Apply
2021-03-13T20:54:13.205668+08:00
Parallel Media Recovery started with 4 slaves
2021-03-13T20:54:13.216576+08:00
Stopping change tracking
PR00 (PID:8188): Media Recovery Waiting for T-1.S-25 (in transit)
2021-03-13T20:54:13.269138+08:00
Recovery of Online Redo Log: Thread 1 Group 12 Seq 25 Reading mem 0
  Mem# 0: /u01/app/oracle/oradata/XIFENFEI/s_redo12.log

At this point the Data Guard setup is essentially complete.
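
A few read-only queries can confirm that redo is both arriving and being applied. This is a sketch using standard views, not something the original post ran; `dest_id = 2` matches the `log_archive_dest_2` setting above.

```sql
-- On the standby: recovery and redo-receive processes and the sequence they are working on.
select process, status, thread#, sequence#, block#
  from v$managed_standby
 where process like 'MRP%' or process like 'RFS%';

-- On the standby: transport and apply lag as computed by the instance.
select name, value, time_computed
  from v$dataguard_stats
 where name in ('transport lag', 'apply lag');

-- On the primary: state of the remote archive destination.
select dest_id, status, error
  from v$archive_dest_status
 where dest_id = 2;
```

A `VALID` status with an empty `error` on the primary, together with an MRP0 row on the standby, suggests the configuration is shipping and applying redo.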

ORA-04020 crashes an ADG standby

This morning a customer reported that their ADG standby had stopped synchronizing. The alert log showed:

Tue Dec 24 18:17:41 2019
Media Recovery Waiting for thread 1 sequence 56655 (in transit)
Recovery of Online Redo Log: Thread 1 Group 11 Seq 56655 Reading mem 0
  Mem# 0: Y:\ORACLE\ORADATA\ORACLE11\STD_REDO11.LOG
Archived Log entry 56248 added for thread 1 sequence 56654 ID 0x5b6bcf9b dest 1:
Tue Dec 24 18:18:11 2019
Errors in file C:\APP\ADMINISTRATOR\diag\rdbms\oracle11dg\oracle11\trace\oracle11_lgwr_3252.trc:
ORA-04020: deadlock detected while trying to lock object SYS.orcl
LGWR (ospid: 3252): terminating the instance due to error 4020
Tue Dec 24 18:18:11 2019
System state dump requested by (instance=1, osid=3252 (LGWR)), summary=[abnormal instance termination].
System State dumped to trace file C:\APP\ADMINISTRATOR\diag\rdbms\oracle11dg\trace\oracle11_diag_3236_20191224181811.trc
Dumping diagnostic data in directory=[cdmp_20191224181811],
      requested by (instance=1, osid=3252 (LGWR)), summary=[abnormal instance termination].
Instance terminated by LGWR, pid = 3252

LGWR hit ORA-04020 and failed, which in turn crashed the whole instance.

Analyzing the trace file

Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Windows NT Version V6.1 Service Pack 1
CPU                 : 8 - type 8664, 8 Physical Cores
Process Affinity    : 0x0x0000000000000000
Memory (Avail/Total): Ph:8395M/32733M, Ph+PgF:41002M/65464M
Instance name: oracle11
Redo thread mounted by this instance: 1
Oracle process number: 10
Windows thread id: 3252, image: ORACLE.EXE (LGWR)
*** 2019-12-24 18:18:11.072
*** SESSION ID:(384.1) 2019-12-24 18:18:11.072
*** CLIENT ID:() 2019-12-24 18:18:11.072
*** SERVICE NAME:(SYS$BACKGROUND) 2019-12-24 18:18:11.072
*** MODULE NAME:() 2019-12-24 18:18:11.072
*** ACTION NAME:() 2019-12-24 18:18:11.072
A deadlock among DDL and parse locks is detected.
This deadlock is usually due to user errors in
the design of an application or from issuing a set
of concurrent statements which can cause a deadlock.
This should not be reported to Oracle Support.
The following information may aid in finding
the errors which cause the deadlock:
ORA-04020: deadlock detected while trying to lock object SYS.orcl
--------------------------------------------------------
  object   waiting  waiting       blocking blocking
  handle   session     lock mode   session     lock mode
--------  -------- -------- ----  -------- -------- ----
0000000676C20F08  000000066D22BE10 00000006738AB970    X  000000066D22BE10 00000006738A04B0    S
0000000677DF2E80  00000006792E2880 0000000673B13AE8    X  000000066D22BE10 00000006738A19B8    S
--------------------------------------------------------
---------- DUMP OF WAITING AND BLOCKING LOCKS ----------
--------------------------------------------------------
------------- WAITING LOCK -------------
----------------------------------------
SO: 0x00000006738AB970, type: 78, owner: 0x000000065D440498, flag: INIT/-/-/0x00 if: 0x3 c: 0x3
 proc=0x000000066CDE7AD0, name=LIBRARY OBJECT LOCK, file=kgl.h LINE:8751, pg=0
LibraryObjectLock:Address=00000006738AB970 Handle=0000000676C20F08
  RequestMode=X CanBeBrokenCount=2 Incarnation=3 ExecutionCount=0
………………
SO: 0x00000006738A19B8, type: 78, owner: 0x000000065A38D6C0, flag: INIT/-/-/0x00 if: 0x3 c: 0x3
 proc=0x000000066CDE7AD0, name=LIBRARY OBJECT LOCK, file=kgl.h LINE:8751, pg=0
LibraryObjectLock:Address=00000006738A19B8 Handle=0000000677DF2E80
   Mode=S CanBeBrokenCount=1 Incarnation=1 ExecutionCount=0
  User=000000066D22BE10 Session=000000066D22BE10 ReferenceCount=1 Flags=CNB/[0001] SavepointNum=1b749
  LibraryHandle:  Address=0000000677DF2E80 Hash=675351da LockMode=S PinMode=0 LoadLockMode=0 Status=0
  ObjectName:  Name=SYS.orcl
    FullHashValue=285b654fe3f440652c403c98675351da Namespace=DBINSTANCE(74) Type=CURSOR(00) Identifier=1 OwnerIdn=0
  Statistics:  InvalidationCount=0 ExecutionCount=0 LoadCount=0 ActiveLocks=1 TotalLockCount=74719 TotalPinCount=0
  Counters:  BrokenCount=1 RevocablePointer=1 KeepDependency=0 Version=0
        BucketInUse=2 HandleInUse=2 HandleReferenceCount=0
  Concurrency:  DependencyMutex=0000000677DF2F30(0, 0, 0, 0) Mutex=0000000677DF2FC0(0, 149471, 1, 0)
  Flags=RON/PIN/KEP/BSO/[00810003]
  WaitersLists:
    Lock=0000000677DF2F10[0000000673B13B58,000000067382E2F0]
    Pin=0000000677DF2EF0[0000000677DF2EF0,0000000677DF2EF0]
    LoadLock=0000000677DF2F68[0000000677DF2F68,0000000677DF2F68]
  Timestamp:
  HandleReference:  Address=0000000677DF3030 Handle=0000000000000000 Flags=[00] ---------------------------------
This lock request was aborted.
error 4020 detected in background process
ORA-04020: deadlock detected while trying to lock object SYS.orcl
kjzduptcctx: Notifying DIAG for crash event
----- Abridged Call Stack Trace -----
ksedsts()+585<-kjzdssdmp()+329<-kjzduptcctx()+288<-kjzdicrshnfy()+99<-ksuitm()+1525<-ksbrdp()+4578<-opirip()
+853<-opidrv()+906<-sou2o()+98<-opimai_real()+280<-opimai()+191
    <-BackgroundThreadStart()+646<-0000000076CF59CD<-0000000076E2A561
----- End of Abridged Call Stack Trace -----
*** 2019-12-24 18:18:11.165
LGWR (ospid: 3252): terminating the instance due to error 4020
*** 2019-12-24 18:18:17.483
ksuitm: waiting up to [5] seconds before killing DIAG(3236)

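The trace shows LGWR failing on a LIBRARY OBJECT LOCK request, which was aborted with ORA-04020. In my experience this kind of problem is usually caused by a bug, and a MOS search turned up a matching one: Bug 18515268 ORA-4020 in ADG Standby Database causing instance crash by LGWR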

If needed, apply Patch 18515268: ACTIVE DATAGUARD STANDBY CRASHES DUE TO AN ORA-4020 ENCOUNTERED BY LGWR

Password changes take effect late on an Oracle Active Data Guard standby

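In an Oracle Data Guard environment, a password changed on the primary does not take effect on the standby immediately; it takes effect only after the shared pool is flushed on the standby.
Create a user on the primary and try logging in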

[oracle@primary ~]$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.4.0 Production on Tue Sep 3 17:24:12 2019
Copyright (c) 1982, 2013, Oracle.  All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options
SQL> select open_mode,database_role from v$database;
OPEN_MODE            DATABASE_ROLE
-------------------- ----------------
READ WRITE           PRIMARY
SQL> create user xff identified by oracle;
User created.
SQL> grant dba to xff;
Grant succeeded.
SQL> conn xff/oracle
Connected.

Logging in on the standby

[oracle@standby ~]$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.4.0 Production on Tue Sep 3 17:17:47 2019
Copyright (c) 1982, 2013, Oracle.  All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
SQL> select database_role,open_mode from v$database;
DATABASE_ROLE    OPEN_MODE
---------------- --------------------
PHYSICAL STANDBY READ ONLY WITH APPLY
[oracle@standby ~]$ ss
SQL*Plus: Release 11.2.0.4.0 Production on Tue Sep 3 17:25:58 2019
Copyright (c) 1982, 2013, Oracle.  All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
SQL> conn xff/oracle
Connected.

Change the password on the primary

[oracle@primary ~]$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.4.0 Production on Tue Sep 3 17:17:47 2019
Copyright (c) 1982, 2013, Oracle.  All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
SQL>  alter user xff identified by xff;
User altered.
SQL> conn xff/xff
Connected.

Try logging in on the standby

SQL> conn xff/xff;
ERROR:
ORA-01017: invalid username/password; logon denied
Warning: You are no longer connected to ORACLE.
SQL> conn xff/oracle
Connected.

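On the standby, the old password still works and the new password is rejected.
After flushing the shared pool on the standby, the new password works.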

SQL> conn / as sysdba
Connected.
SQL> alter system flush shared_pool;
System altered.
SQL> conn xff/oracle
ERROR:
ORA-01017: invalid username/password; logon denied
Warning: You are no longer connected to ORACLE.
SQL> conn xff/xff
Connected.
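
To tell a stale cache apart from redo that has simply not been applied yet, one option is to compare the password change time in the data dictionary on both sides. This is a sketch under assumptions: it reads `sys.user$` directly, so it needs a SYSDBA connection, and it was not part of the original test.

```sql
-- Run as SYSDBA on both primary and standby and compare the results.
-- sys.user$.ptime records when the password was last changed.
select name, to_char(ptime, 'yyyy-mm-dd hh24:mi:ss') as password_changed
  from sys.user$
 where name = 'XFF';

-- On the standby only: if ptime already matches the primary but the new
-- password is still rejected, flush the shared pool as in the test above.
alter system flush shared_pool;
```

If `ptime` on the standby is still older than on the primary, the change has not been applied there yet and flushing the shared pool would not be expected to help.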

ORA-01110 reported on a 12.2 standby

Errors on the 12.2 standby

2018-06-13T19:29:00.302767+08:00
Errors in file /u01/app/oracle/diag/rdbms/xifenfeidg/xifenfei/trace/xifenfei_m000_2457.trc:
ORA-01110: data file 1: '/u01/app/oracle/oradata/xifenfei/system01.dbf'
2018-06-13T19:29:00.829861+08:00
Errors in file /u01/app/oracle/diag/rdbms/xifenfeidg/xifenfei/trace/xifenfei_m000_2457.trc:
ORA-01110: data file 2: '/u01/app/oracle/oradata/xifenfei/rich101.dbf'
2018-06-13T19:29:00.930632+08:00
Errors in file /u01/app/oracle/diag/rdbms/xifenfeidg/xifenfei/trace/xifenfei_m000_2457.trc:
ORA-01110: data file 3: '/u01/app/oracle/oradata/xifenfei/sysaux01.dbf'
2018-06-13T19:29:01.010230+08:00
Errors in file /u01/app/oracle/diag/rdbms/xifenfeidg/xifenfei/trace/xifenfei_m000_2457.trc:
ORA-01110: data file 4: '/u01/app/oracle/oradata/xifenfei/undotbs01.dbf'
2018-06-13T11:29:01.055975+00:00
Archived Log entry 5072 added for T-1.S-5020 ID 0x6a8e9d72 LAD:1
RFS[18]: Selected log 10 for T-1.S-5024 dbid 1787743346 branch 957530932
2018-06-13T19:29:01.091059+08:00
Errors in file /u01/app/oracle/diag/rdbms/xifenfeidg/xifenfei/trace/xifenfei_m000_2457.trc:
ORA-01110: data file 5: '/u01/app/oracle/oradata/xifenfei/richman01.dbf'
2018-06-13T19:29:01.172613+08:00
Errors in file /u01/app/oracle/diag/rdbms/xifenfeidg/xifenfei/trace/xifenfei_m000_2457.trc:
ORA-01110: data file 7: '/u01/app/oracle/oradata/xifenfei/users01.dbf'
2018-06-13T19:29:01.251906+08:00
Errors in file /u01/app/oracle/diag/rdbms/xifenfeidg/xifenfei/trace/xifenfei_m000_2457.trc:
ORA-01110: data file 8: '/u01/app/oracle/oradata/xifenfei/r_index01.dbf'

Trace file

*** 2018-06-13T19:29:00.282836+08:00
*** SESSION ID:(2281.15120) 2018-06-13T19:29:00.282868+08:00
*** CLIENT ID:() 2018-06-13T19:29:00.282873+08:00
*** SERVICE NAME:(SYS$BACKGROUND) 2018-06-13T19:29:00.282878+08:00
*** MODULE NAME:(MMON_SLAVE) 2018-06-13T19:29:00.282883+08:00
*** ACTION NAME:(DDE async action) 2018-06-13T19:29:00.282888+08:00
*** CLIENT DRIVER:() 2018-06-13T19:29:00.282892+08:00
========= Dump for error ORA 312 (no incident) ========
----- DDE Action: 'DB_STRUCTURE_INTEGRITY_CHECK' (Async) -----
dbkh_reactive_run_check: BEGIN
dbkh_reactive_run_check:; incident_id=0
dbkh_run_check_internal: BEGIN; check_namep=DB Structure Integrity Check, run_namep=<null>
dbkh_run_check_internal: BEGIN; timeout=0
dbkh_run_check_internal: AFTER RUN CREATE; run_id=1841
*** 2018-06-13T19:29:00.302510+08:00
DDE previous invocation failed before phase II
DDE was called in a 'No Invocation Mode'
----- Start Diag Diagnostic Dump -----
Diagnostic dump is performed due to an error in the diagfw code during error handling.
Dump error and call stack for the diagnostic dump:
*** 2018-06-13T19:29:00.302576+08:00
dbkedDefDump(): Starting a non-incident diagnostic dump (flags=0x0, level=1, mask=0x0)
----- Error Stack Dump -----
ORA-01110: data file 1: '/u01/app/oracle/oradata/xifenfei/system01.dbf'
----- SQL Statement (None) -----
Current SQL information unavailable - no cursor.
----- Call Stack Trace -----
calling              call     entry                argument values in hex
location             type     point                (? means dubious value)
-------------------- -------- -------------------- ----------------------------
ksedst()+119         call     kgdsdst()            7FFF1A0D6C68 000000002
                                                   7FFF1A0B86D0 ? 7FFF1A0B87E8 ?
                                                   000000000 000000082 ?
dbkedDefDump()+1200  call     ksedst()             000000000 000000002 ?
                                                   7FFF1A0B86D0 ? 7FFF1A0B87E8 ?
                                                   000000000 ? 000000082 ?
ksedmp()+259         call     dbkedDefDump()       000000001 000000000
                                                   7FFF1A0B86D0 ? 7FFF1A0B87E8 ?
                                                   000000000 ? 000000082 ?
dbgexExecuteIntDiag  call     ksedmp()             000000001 000000000 ?
Dmp()+1457                                         7FFF1A0B86D0 ? 7FFF1A0B87E8 ?
                                                   000000000 ? 000000082 ?
dbgeBeginInvoke()+3  call     dbgexExecuteIntDiag  7F5A00000003 7F5A99B856C0
59                            Dmp()                7FFF1A0B86D0 ? 7FFF1A0B87E8 ?
                                                   000000000 ? 000000082 ?
dbgePostErrorKGE()+  call     dbgeBeginInvoke()    7F5A99B856C0 7FFF1A0D7D20
1676                                               7FFF1A0B86D0 ? 7FFF1A0B87E8 ?
                                                   000000000 ? 000000082 ?
dbkePostKGE_kgsf()+  call     dbgePostErrorKGE()   7F5A99BC59A0 7F5A99AA0048
90                                                 000000456 7FFF1A0B87E8 ?
                                                   000000000 ? 000000082 ?
kgeade()+432         call     dbkePostKGE_kgsf()   7F5A99BC59A0 7F5A99AA0048
                                                   000000456 7FFF1A0B87E8 ?
                                                   000000000 ? 000000082 ?
kgerelv()+144        call     kgeade()             7F5A99BC59A0 ? 7F5A99BC5BE8 ?
                                                   7F5A99AA0048 ? 000000456 ?
                                                   000000000 000000000
kgerev()+36          call     kgerelv()            7F5A99BC59A0 ? 7F5A99AA0048 ?
                                                   7F5A99AA0048 ? 000000456 ?
                                                   012E79CF4 ? 000000002 ?
kserec2()+185        call     kgerev()             7F5A99BC59A0 ? 7F5A99AA0048 ?
                                                   7F5A99AA0048 ? 000000456 ?
                                                   7FFF1A0D8000 000000002 ?
kcf_record_fn()+634  call     kserec2()            7F5A99BC59A0 ? 000000000
                                                   000000001 000000001 00000002C
                                                   141E0C518
kcvvra_dfh()+5278    call     kcf_record_fn()      000000001 151622BB8 000000000
                                                   7FFF1A0DA5D8 00000002C ?
                                                   141E0C518 ?
kcidr_file_header_c  call     kcvvra_dfh()         7FFF1A0DA460 ? 7FFF1A0D9FE8 ?
heck_common()+4669                                 000000000 ? 7FFF1A0D9398
                                                   7F5A94379000 ? 000000001 ?
kcidr_file_header_a  call     kcidr_file_header_c  7F5A99A9F7A0 7F5A94379000
ll_check_common()+2           heck_common()        000000001 000000000
259                                                7F5A94379000 ? 000000000
kcidr_cross_check()  call     kcidr_file_header_a  7F5A99A9F7A0 7FFF1A0DABE4
+566                          ll_check_common()    000000001 ? 000000000 ?
                                                   7F5A94379000 ? 000000000 ?
dbkird_cross_check(  call     kcidr_cross_check()  7F5A99A9F7A0 7FFF1A0DABE4 ?
)+557                                              7F5A99BC5BE8 000000000 ?
                                                   7F5A94379000 ? 000000000 ?
dbkh_run_check_inte  call     dbkird_cross_check(  7F5A99A9F7A0 7FFF1A0DABE4 ?
rnal()+2228                   )                    7F5A99BC5BE8 ? 000000000 ?
                                                   7F5A94379000 ? 000000000 ?
dbkh_reactive_run_c  call     dbkh_run_check_inte  7FFF1A0DB970 000000000
heck()+3011                   rnal()               000000002 000000000 000000000
                                                   000000000
dbgdaAsyncReceive()  call     dbkh_reactive_run_c  7F5A99B856C0 7FFF1A0DBC90
+279                          heck()               000000002 ? 000000000 ?
                                                   000000000 ? 000000000 ?
dbgea_exec_()+1739   call     dbgdaAsyncReceive()  7F5A99B856C0 0020C0029
                                                   7FFF1A0E7CA0 7FFF1A0E7D20
                                                   000000002 000000000 ?
dbgea_exec()+621     call     dbgea_exec_()        7F5A99B856C0 7F5A94984D18
                                                   0000000E8 000000000
                                                   000000002 ? 000000000 ?
dbkea_exec()+1718    call     dbgea_exec()         7F5A99B856C0 7F5A94984D18
                                                   0000000E8 000000000
                                                   000000002 ? 000000000 ?
dbkea_slave_exec()+  call     dbkea_exec()         7F5A99B856C0 ? 7F5A94984D18 ?
518                                                0000000E8 ? 000000000 ?
                                                   000000002 ? 000000000 ?
kebm_slave_cb()+64   call     dbkea_slave_exec()   1453D7248 7F5A94984D18 ?
                                                   0000000E8 ? 000000000 ?
                                                   000000002 ? 000000000 ?
kebm_slave_main()+7  call     kebm_slave_cb()      1453D7248 ? 7F5A94984D18 ?
72                                                 0000000E8 ? 000000000 ?
                                                   000000002 ? 000000000 ?
ksvrdp_int()+2010    call     kebm_slave_main()    1453D7248 ? 1453D7248
                                                   0000000E8 ? 000000000 ?
                                                   000000002 ? 000000000 ?
opirip()+602         call     ksvrdp_int()         000000000 ? 000000000 ?
                                                   0000000E8 ? 000000000 ?
                                                   000000002 ? 000000000 ?
opidrv()+602         call     opirip()             000000032 000000004
                                                   7FFF1A0EAD98 000000000 ?
                                                   000000002 ? 000000000 ?
sou2o()+145          call     opidrv()             000000032 000000004
                                                   7FFF1A0EAD98 000000000 ?
                                                   000000002 ? 000000000 ?
opimai_real()+202    call     sou2o()              7FFF1A0EAD70 000000032
                                                   000000004 7FFF1A0EAD98
                                                   000000002 ? 000000000 ?
ssthrdmain()+417     call     opimai_real()        000000000 7FFF1A0EB080
                                                   000000004 ? 7FFF1A0EAD98 ?
                                                   000000002 ? 000000000 ?
main()+262           call     ssthrdmain()         000000000 000000003
                                                   7FFF1A0EB080 000000001
                                                   000000000 000000000 ?
__libc_start_main()  call     main()               000000000 7FFF1A0EB2B8
+245                                               7FFF1A0EB080 ? 000000001 ?
                                                   000000000 ? 000000000 ?
_start()+41          call     __libc_start_main()  000D05240 000000001
                                                   7FFF1A0EB2B8 7F5A95015C05 ?
                                                   000000000 ? 000000000 ?
--------------------- Binary Stack Dump ---------------------

BUG:24844841 – PHSB:CDB M000 REPORTED ORA-1110 ON ADG WHEN A DATAFILE IS ADDED ON PRIMARY
@ The M000 messages is a false alarm as well. It is a false alarm by DRA check
@ that doesn’t consider standby media recovery properly. Adding a file happens
@ to trigger the timing for the false alarm.
@ One way to fix this is to skip file header check if standby recovery is
@ running inside kcidr_file_header_all_check_common.
The M000 process checks the datafile headers and, because of this bug, reports ORA-01110.

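Recommendations
1. Apply patch 24844841
2. The issue is fixed in 19.1
3. Restart the standby and start MRP
4. Ignore it for now (no impact on synchronization has been observed so far)
Reference: ORA-01110 For All Files In Standby Database (Doc ID 2322290.1)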
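
For options 3 and 4 above, a quick check on the standby can show whether the datafiles and managed recovery are actually healthy. The queries below are a sketch using standard views, not taken from the MOS note; run them as SYSDBA on the standby.

```sql
-- Datafile headers as the instance currently reads them; a non-null ERROR
-- would point to a real problem rather than the false alarm.
select file#, status, error, recover, fuzzy
  from v$datafile_header;

-- Is managed recovery running, and which sequence is it applying?
select process, status, thread#, sequence#
  from v$managed_standby
 where process like 'MRP%';

-- Only if MRP is not running (for example after restarting the standby), start it.
alter database recover managed standby database disconnect from session;
```

An empty `error` column for every file, with MRP0 advancing through sequences, is consistent with the false-alarm description in the bug text.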

Using Data Guard with Flashback to switch roles freely between primary and standby (failover and switchover)

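A customer runs database disaster recovery between an on-premises site and a public cloud over a VPN. One requirement: when the on-premises environment fails, the cloud DR database takes over the workload; once the on-premises environment is restored, it becomes the primary again and the cloud database goes back to being the standby. Here I use Oracle Data Guard together with Flashback Database to simulate that requirement. (Rebuilding the DG after the failover would also work; which approach fits depends on the customer's situation: data volume, bandwidth, recovery time, technical skill, and so on.)
A primary/standby environment with DG synchronizing normally
Here host ora11g is the primary and host ora10g is the standby.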

--Database version
SQL> select * from v$version;
BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
PL/SQL Release 11.2.0.4.0 - Production
CORE    11.2.0.4.0      Production
TNS for Linux: Version 11.2.0.4.0 - Production
NLSRTL Version 11.2.0.4.0 - Production
---Primary (ora11g)
SQL> select open_mode,database_role,(select HOST_NAME from v$instance) HOST_NAME,flashback_on from v$database;
OPEN_MODE                      DATABASE_ROLE                  HOST_NAME                      FLASHBACK_
------------------------------ ------------------------------ ------------------------------ ----------
READ WRITE                     PRIMARY                        ora11g                         YES
SQL> create  table xff.t_xifenfei as select * from dba_objects;
Table created.
SQL> select count(*) from xff.t_xifenfei;
  COUNT(*)
----------
     86348
---Standby (ora10g)
SQL> select open_mode,database_role,(select HOST_NAME from v$instance) HOST_NAME,flashback_on from v$database;
OPEN_MODE                      DATABASE_ROLE                  HOST_NAME                      FLASHBACK_
------------------------------ ------------------------------ ------------------------------ ----------
READ ONLY WITH APPLY           PHYSICAL STANDBY               ora10g                         YES
SQL> select count(*) from xff.t_xifenfei;
  COUNT(*)
----------
     86348

Simulate a failure of the primary (ora11g) and activate the standby (ora10g) directly
Simulate a direct failover on the DG standby (ora10g)

SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
Database altered.
SQL> ALTER DATABASE ACTIVATE PHYSICAL STANDBY DATABASE;
Database altered.
Sat Mar 24 16:05:40 2017
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL
MRP0: Background Media Recovery cancelled with status 16037
Errors in file /u01/app/oracle/diag/rdbms/ora11g/ora11g/trace/ora11g_pr00_13428.trc:
ORA-16037: user requested cancel of managed recovery operation
Managed Standby Recovery not using Real Time Apply
Recovery interrupted!
Recovered data files to a consistent state at change 1470499
Sat Mar 24 16:05:40 2017
MRP0: Background Media Recovery process shutdown (ora11g)
Managed Standby Recovery Canceled (ora11g)
Completed: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL
Sat Mar 24 16:05:50 2017
ALTER DATABASE ACTIVATE PHYSICAL STANDBY DATABASE
ALTER DATABASE ACTIVATE [PHYSICAL] STANDBY DATABASE (ora11g)
All dispatchers and shared servers shutdown
CLOSE: killing server sessions.
CLOSE: all sessions shutdown successfully.
Sat Mar 24 16:05:50 2017
SMON: disabling cache recovery
Killing 3 processes with pids 13372,13393,13388 (all RFS) in order to
    disallow current and future RFS connections. Requested by OS process 13350
Begin: Standby Redo Logfile archival
End: Standby Redo Logfile archival
RESETLOGS after incomplete recovery UNTIL CHANGE 1470499
Archived Log entry 9 added for thread 1 sequence 14 ID 0xfd5a5fc1 dest 1:
Resetting resetlogs activation ID 4250558401 (0xfd5a5fc1)
Online log /u01/app/oracle/oradata/ora11g/redo01.log: Thread 1 Group 1 was previously cleared
Online log /u01/app/oracle/oradata/ora11g/redo02.log: Thread 1 Group 2 was previously cleared
Online log /u01/app/oracle/oradata/ora11g/redo03.log: Thread 1 Group 3 was previously cleared
Standby became primary SCN: 1470497
Sat Mar 24 16:05:52 2017
Setting recovery target incarnation to 4
ACTIVATE STANDBY: Complete - Database mounted as primary
Completed: ALTER DATABASE ACTIVATE PHYSICAL STANDBY DATABASE
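
The alert log above prints "Standby became primary SCN: 1470497". The same value is exposed in `v$database`, which is the usual way to pick a flashback target for the old primary: the target should be at or before this SCN. The queries below are a sketch and were not run in the original post.

```sql
-- Run on the new primary (ora10g); works in MOUNT or OPEN state.
select to_char(standby_became_primary_scn) as became_primary_scn
  from v$database;

-- Run on the old primary (ora11g), mounted, once it is reachable again:
-- the oldest SCN that the available flashback logs can reach.
select oldest_flashback_scn, oldest_flashback_time
  from v$flashback_database_log;
```

Flashback to the chosen SCN is only possible if it is not older than `oldest_flashback_scn` on the old primary.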

Simulate business activity on the new primary (ora10g)

SQL> select open_mode,database_role,(select HOST_NAME from v$instance) HOST_NAME,flashback_on from v$database;
OPEN_MODE                      DATABASE_ROLE                  HOST_NAME                      FLASHBACK_
------------------------------ ------------------------------ ------------------------------ ----------
MOUNTED                        PRIMARY                        ora10g                         YES
SQL> alter database open;
Database altered.
SQL>  select open_mode,database_role,(select HOST_NAME from v$instance) HOST_NAME,flashback_on from v$database;
OPEN_MODE                      DATABASE_ROLE                  HOST_NAME                      FLASHBACK_
------------------------------ ------------------------------ ------------------------------ ----------
READ WRITE                     PRIMARY                        ora10g                         YES
SQL> alter system switch logfile;
System altered.
SQL> /
System altered.
SQL> /
/
System altered.
SQL>
System altered.
SQL>
SQL> archive log list;
Database log mode              Archive Mode
Automatic archival             Enabled
Archive destination            USE_DB_RECOVERY_FILE_DEST
Oldest online log sequence     5
Next log sequence to archive   7
Current log sequence           7
SQL> delete from xff.t_xifenfei;
86348 rows deleted.
SQL> commit;
Commit complete.
SQL> select count(*) from xff.t_xifenfei;
  COUNT(*)
----------
         0

Configure the original primary (ora11g) as the new standby

---Original primary (ora11g)
SQL> shutdown immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup mount
ORACLE instance started.
Total System Global Area 3056513024 bytes
Fixed Size                  2257152 bytes
Variable Size             704646912 bytes
Database Buffers         2332033024 bytes
Redo Buffers               17575936 bytes
Database mounted.
SQL> Flashback database to scn 1470490;
Flashback complete.
SQL> select open_mode,database_role,(select HOST_NAME from v$instance) HOST_NAME,flashback_on from v$database;
OPEN_MODE                      DATABASE_ROLE                  HOST_NAME                      FLASHBACK_
------------------------------ ------------------------------ ------------------------------ ----------
MOUNTED                        PRIMARY                        ora11g                         YES
--Current primary (ora10g)
SQL> alter database create standby controlfile as '/tmp/ctl.3';
Database altered.
[oracle@ora10g ~]$ scp /tmp/ctl.3 192.168.222.11:/u01/app/oracle/oradata/ora11g/control01.ctl
oracle@192.168.222.11's password:
ctl.3
--Current standby (ora11g)
SQL> shutdown abort
ORACLE instance shut down.
SQL> startup mount
ORACLE instance started.
Total System Global Area 3056513024 bytes
Fixed Size                  2257152 bytes
Variable Size             704646912 bytes
Database Buffers         2332033024 bytes
Redo Buffers               17575936 bytes
Database mounted.
SQL> col open_mode for a30
SQL> col database_role for a30
SQL> col HOST_NAME for a30
SQL> col flashback_on for a10
SQL> set lines 150
SQL> select open_mode,database_role,(select HOST_NAME from v$instance) HOST_NAME,flashback_on from v$database;
OPEN_MODE                      DATABASE_ROLE                  HOST_NAME                      FLASHBACK_
------------------------------ ------------------------------ ------------------------------ ----------
MOUNTED                        PHYSICAL STANDBY               ora11g                         NO
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT  LOGFILE DISCONNECT FROM SESSION;
Database altered.
Sat Mar 24 16:10:09 2017
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT  LOGFILE DISCONNECT FROM SESSION
Attempt to start background Managed Standby Recovery process (ora11g)
Sat Mar 24 16:10:09 2017
MRP0 started with pid=27, OS id=27086
MRP0: Background Managed Standby Recovery process started (ora11g)
 started logmerger process
Sat Mar 24 16:10:14 2017
Managed Standby Recovery starting Real Time Apply
Parallel Media Recovery started with 4 slaves
Media Recovery start incarnation depth : 1, target inc# : 4, irscn : 1470499
Waiting for all non-current ORLs to be archived...
All non-current ORLs have been archived.
Clearing online redo logfile 1 /u01/app/oracle/oradata/ora11g/redo01.log
Clearing online log 1 of thread 1 sequence number 7
Clearing online redo logfile 1 complete
Clearing online redo logfile 2 /u01/app/oracle/oradata/ora11g/redo02.log
Clearing online log 2 of thread 1 sequence number 8
Completed: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT  LOGFILE DISCONNECT FROM SESSION
Clearing online redo logfile 2 complete
Clearing online redo logfile 3 /u01/app/oracle/oradata/ora11g/redo03.log
Clearing online log 3 of thread 1 sequence number 9
Clearing online redo logfile 3 complete
Media Recovery Waiting for thread 1 sequence 13 branch(resetlogs_id) 854379205
Sat Mar 24 16:10:16 2017
RFS[3]: Assigned to RFS process 27098
RFS[3]: Opened log for thread 1 sequence 13 dbid 48871109 branch 854379205
Archived Log entry 2 added for thread 1 sequence 13 rlc 854379205 ID 0xfd5a5fc1 dest 2:
Sat Mar 24 16:10:16 2017
RFS[4]: Assigned to RFS process 27100
RFS[4]: Opened log for thread 1 sequence 14 dbid 48871109 branch 854379205
Archived Log entry 3 added for thread 1 sequence 14 rlc 854379205 ID 0xfd5a5fc1 dest 2:
RFS[3]: Opened log for thread 1 sequence 1 dbid 48871109 branch 947798452
RFS[4]: Opened log for thread 1 sequence 3 dbid 48871109 branch 947798452
Sat Mar 24 16:10:16 2017
RFS[5]: Assigned to RFS process 27070
RFS[5]: Opened log for thread 1 sequence 2 dbid 48871109 branch 947798452
Archived Log entry 4 added for thread 1 sequence 1 rlc 947798452 ID 0x2ec1ed0 dest 2:
Archived Log entry 5 added for thread 1 sequence 3 rlc 947798452 ID 0x2ec1ed0 dest 2:
RFS[3]: Opened log for thread 1 sequence 4 dbid 48871109 branch 947798452
RFS[4]: Opened log for thread 1 sequence 5 dbid 48871109 branch 947798452
Archived Log entry 6 added for thread 1 sequence 5 rlc 947798452 ID 0x2ec1ed0 dest 2:
Archived Log entry 7 added for thread 1 sequence 4 rlc 947798452 ID 0x2ec1ed0 dest 2:
Archived Log entry 8 added for thread 1 sequence 2 rlc 947798452 ID 0x2ec1ed0 dest 2:
RFS[4]: Opened log for thread 1 sequence 6 dbid 48871109 branch 947798452
Archived Log entry 9 added for thread 1 sequence 6 rlc 947798452 ID 0x2ec1ed0 dest 2:
RFS[3]: Opened log for thread 1 sequence 7 dbid 48871109 branch 947798452
Archived Log entry 10 added for thread 1 sequence 7 rlc 947798452 ID 0x2ec1ed0 dest 2:
Media Recovery Log /u01/app/oracle/fast_recovery_area/ORA11GDG/archivelog/2017_03_24/o1_mf_1_13_9xmyh8cs_.arc
Media Recovery Log /u01/app/oracle/fast_recovery_area/ORA11GDG/archivelog/2017_03_24/o1_mf_1_14_9xmyh8dr_.arc
Media Recovery Log /u01/app/oracle/fast_recovery_area/ORA11GDG/archivelog/2017_03_24/o1_mf_1_1_9xmyh8fk_.arc
Media Recovery Log /u01/app/oracle/fast_recovery_area/ORA11GDG/archivelog/2017_03_24/o1_mf_1_2_9xmyh8fo_.arc
Media Recovery Log /u01/app/oracle/fast_recovery_area/ORA11GDG/archivelog/2017_03_24/o1_mf_1_3_9xmyh8fm_.arc
Media Recovery Log /u01/app/oracle/fast_recovery_area/ORA11GDG/archivelog/2017_03_24/o1_mf_1_4_9xmyh8g3_.arc
Media Recovery Log /u01/app/oracle/fast_recovery_area/ORA11GDG/archivelog/2017_03_24/o1_mf_1_5_9xmyh8g4_.arc
Media Recovery Log /u01/app/oracle/fast_recovery_area/ORA11GDG/archivelog/2017_03_24/o1_mf_1_6_9xmyh8gs_.arc
Media Recovery Log /u01/app/oracle/fast_recovery_area/ORA11GDG/archivelog/2017_03_24/o1_mf_1_7_9xmyh8hl_.arc
Media Recovery Log /u01/app/oracle/fast_recovery_area/ORA11GDG/archivelog/2017_03_24/o1_mf_1_8_9xmy5d3f_.arc
Media Recovery Waiting for thread 1 sequence 9 (in transit)
Recovery of Online Redo Log: Thread 1 Group 4 Seq 9 Reading mem 0
  Mem# 0: /u01/app/oracle/oradata/ora11g/std_redo10.log
SQL>  ALTER DATABASE RECOVER MANAGED STANDBY DATABASE cancel;
Database altered.
SQL> alter database open;
Database altered.
SQL>  ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT  LOGFILE DISCONNECT FROM SESSION;
Database altered.
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE cancel;
Database altered.
SQL> alter database flashback on;
Database altered.
SQL> select open_mode,database_role,(select HOST_NAME from v$instance) HOST_NAME,flashback_on from v$database;
OPEN_MODE                      DATABASE_ROLE                  HOST_NAME                      FLASHBACK_
------------------------------ ------------------------------ ------------------------------ ----------
READ ONLY                      PHYSICAL STANDBY               ora11g                         YES
SQL> select count(*) from xff.t_xifenfei;
  COUNT(*)
----------
         0
--Current primary
SQL> drop table xff.t_xifenfei ;
Table dropped.
SQL> select count(*) from xff.t_xifenfei;
select count(*) from xff.t_xifenfei
                         *
ERROR at line 1:
ORA-00942: table or view does not exist
--Current standby
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT  LOGFILE DISCONNECT FROM SESSION;
Database altered.
SQL> select open_mode,database_role,(select HOST_NAME from v$instance) HOST_NAME,flashback_on from v$database;
OPEN_MODE                      DATABASE_ROLE                  HOST_NAME                      FLASHBACK_
------------------------------ ------------------------------ ------------------------------ ----------
READ ONLY WITH APPLY           PHYSICAL STANDBY               ora11g                         YES
SQL> select count(*) from xff.t_xifenfei;
select count(*) from xff.t_xifenfei
                         *
ERROR at line 1:
ORA-00942: table or view does not exist

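After the steps above, the former primary that was assumed to have failed is now a standby of the primary activated by failover. In other words, the relationship changed from ora11g (primary) -> ora10g (standby) to ora10g (primary) -> ora11g (standby).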
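
The SCN for the flashback step does not have to be read from the alert log. As a sketch (assuming the new primary is reachable and the flashback logs on the old primary still cover that point), the value can be queried on the new primary, and the old primary flashed back to an SCN at or before it:

```sql
-- On the new primary (ora10g): SCN at which the standby was activated as primary.
-- In this walkthrough the alert log reported 1470497.
select to_char(standby_became_primary_scn) as became_primary_scn
  from v$database;

-- On the old primary (ora11g), in MOUNT state: flash back to an SCN
-- no greater than the value returned above (1470490 was used here).
flashback database to scn 1470490;
```

Flashing back slightly before the reported SCN, as done here, keeps the old primary from containing changes the new primary never saw.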

Swapping the primary and standby roles with switchover

---Current primary (ora10g)
SQL>  ALTER DATABASE COMMIT TO SWITCHOVER TO STANDBY WITH SESSION SHUTDOWN;
Database altered.
--Current standby (ora11g)
SQL> ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY WITH SESSION SHUTDOWN;
Database altered.
SQL>  select open_mode,database_role,(select HOST_NAME from v$instance) HOST_NAME,flashback_on from v$database;
OPEN_MODE                      DATABASE_ROLE                  HOST_NAME                      FLASHBACK_
------------------------------ ------------------------------ ------------------------------ ----------
MOUNTED                        PRIMARY                        ora11g                         YES
SQL> alter database open;
Database altered.
SQL> select open_mode,database_role,(select HOST_NAME from v$instance) HOST_NAME,flashback_on from v$database;
OPEN_MODE                      DATABASE_ROLE                  HOST_NAME                      FLASHBACK_
------------------------------ ------------------------------ ------------------------------ ----------
READ WRITE                     PRIMARY                        ora11g                         YES
--New standby (ora10g)
SQL> startup
ORACLE instance started.
Total System Global Area 3056513024 bytes
Fixed Size                  2257152 bytes
Variable Size             687869696 bytes
Database Buffers         2348810240 bytes
Redo Buffers               17575936 bytes
Database mounted.
Database opened.
SQL>  ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT  LOGFILE DISCONNECT FROM SESSION;
Database altered.
SQL> select open_mode,database_role,(select HOST_NAME from v$instance) HOST_NAME,flashback_on from v$database;
OPEN_MODE                      DATABASE_ROLE                  HOST_NAME                      FLASHBACK_
------------------------------ ------------------------------ ------------------------------ ----------
READ ONLY WITH APPLY           PHYSICAL STANDBY               ora10g                         YES
--New primary (ora11g)
SQL> alter system switch logfile;
System altered.
SQL> create table t_xifenfei as select * from dba_objects;
Table created.
SQL> select count(*) from t_xifenfei;
  COUNT(*)
----------
     86347
--New standby (ora10g)
SQL>  select count(*) from t_xifenfei;
  COUNT(*)
----------
     86347

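With this switchover the roles are swapped once more, from ora10g (primary) -> ora11g (standby) to ora11g (primary) -> ora10g (standby), which restores the original layout. This is not the only way to do it: rebuilding the Data Guard standby would achieve a similar result.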
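
Before issuing the switchover commands it can help to check whether each side is ready. A minimal check, assuming the classic SQL*Plus switchover used in this post, is to query SWITCHOVER_STATUS in v$database on both databases:

```sql
-- On the current primary: TO STANDBY or SESSIONS ACTIVE generally means
-- the switchover to standby can proceed.
select database_role, switchover_status from v$database;

-- On the current standby, after the primary has been switched:
-- TO PRIMARY or SESSIONS ACTIVE generally means it can become the primary.
select database_role, switchover_status from v$database;
```

When the status is SESSIONS ACTIVE, the WITH SESSION SHUTDOWN clause used in the commands above is what allows the statement to proceed.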

Oracle 12c active dataguard switchover

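Author: 惜分飞 © All rights reserved. [Reproduction in any form without the author's consent is prohibited; the author reserves the right to pursue legal liability.]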

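Starting with 12.1, the ADG switchover procedure changed. The switchover can be completed directly with alter database switchover to [target standby db_unique_name] verify; followed by alter database switchover to [target standby db_unique_name];. The steps below are from an actual switchover in a production environment.
Operations on the primary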

SQL> select database_role from v$database;
DATABASE_ROLE
----------------
PRIMARY
SQL> alter database switchover to xifenfei verify;
Database altered.
--alert log
Sun Jun 25 09:07:08 2017
diag_adl:SWITCHOVER VERIFY: Send VERIFY request to switchover target xifenfei
diag_adl:SWITCHOVER VERIFY COMPLETE
SQL> alter database switchover to xifenfei;
Database altered.
--alert log
Sun Jun 25 09:07:46 2017
diag_adl:Starting switchover [Process ID: 37024]
Sun Jun 25 09:07:46 2017
diag_adl:ALTER DATABASE COMMIT TO SWITCHOVER TO PHYSICAL STANDBY [Process Id: 37024] (xifenfei)
diag_adl:Waiting for target standby to receive all redo
Sun Jun 25 09:07:46 2017
diag_adl:Waiting for all non-current ORLs to be archived...
Sun Jun 25 09:07:46 2017
diag_adl:All non-current ORLs have been archived.
Sun Jun 25 09:07:46 2017
diag_adl:Waiting for all FAL entries to be archived...
Sun Jun 25 09:07:46 2017
diag_adl:All FAL entries have been archived.
Sun Jun 25 09:07:46 2017
diag_adl:Waiting for dest_id 2 to become synchronized...
Sun Jun 25 09:07:47 2017
diag_adl:Active, synchronized Physical Standby switchover target has been identified
diag_adl:Preventing updates and queries at the Primary
diag_adl:Generating and shipping final logs to target standby
diag_adl:Switchover End-Of-Redo Log thread 1 sequence 96534 has been fixed
diag_adl:Switchover: Primary highest seen SCN set to 0x3.0x109d7502
diag_adl:ARCH: Noswitch archival of thread 1, sequence 96534
diag_adl:ARCH: End-Of-Redo Branch archival of thread 1 sequence 96534
diag_adl:ARCH: LGWR is scheduled to archive destination LOG_ARCHIVE_DEST_2 after log switch
diag_adl:ARCH: Standby redo logfile selected for thread 1 sequence 96534 for destination LOG_ARCHIVE_DEST_2
diag_adl:ARCH: Archiving is disabled due to current logfile archival
diag_adl:Primary will check for some target standby to have received all redo
diag_adl:Waiting for target standby to apply all redo
diag_adl:Backup controlfile written to trace file
            /u01/app/oracle/diag/rdbms/xifenfeildg/xifenfei/trace/xifenfei_ora_37024.trc
diag_adl:Converting the primary database to a new standby database
diag_adl:Clearing standby activation ID 612004791 (0x247a73b7)
diag_adl:The primary database controlfile was created using the
diag_adl:'MAXLOGFILES 16' clause.
diag_adl:There is space for up to 11 standby redo logfiles
diag_adl:Use the following SQL commands on the standby database to create
diag_adl:standby redo logfiles that match the primary database:
diag_adl:ALTER DATABASE ADD STANDBY LOGFILE 'srl1.f' SIZE 1073741824;
diag_adl:ALTER DATABASE ADD STANDBY LOGFILE 'srl2.f' SIZE 1073741824;
diag_adl:ALTER DATABASE ADD STANDBY LOGFILE 'srl3.f' SIZE 1073741824;
diag_adl:ALTER DATABASE ADD STANDBY LOGFILE 'srl4.f' SIZE 1073741824;
diag_adl:ALTER DATABASE ADD STANDBY LOGFILE 'srl5.f' SIZE 1073741824;
diag_adl:ALTER DATABASE ADD STANDBY LOGFILE 'srl6.f' SIZE 1073741824;
diag_adl:Archivelog for thread 1 sequence 96534 required for standby recovery
diag_adl:Switchover: Primary controlfile converted to standby controlfile succesfully.
diag_adl:Switchover: Complete - Database shutdown required
diag_adl:Sending request(convert to primary database) to switchover target xifenfei
Sun Jun 25 09:07:58 2017
diag_adl:Switchover complete. Database shutdown required
diag_adl:USER (ospid: 37024): terminating the instance
Sun Jun 25 09:07:59 2017
diag_adl:Instance terminated by USER, pid = 37024
diag_adl:Shutting down instance (abort)
diag_adl:License high water mark = 527
Sun Jun 25 09:07:59 2017
Instance shutdown complete

Standby alert log

Sun Jun 25 09:05:54 2017
SWITCHOVER VERIFY BEGIN
SWITCHOVER VERIFY COMPLETE
Sun Jun 25 09:06:35 2017
RFS[107]: Assigned to RFS process (PID:7330)
RFS[107]: Selected log 12 for thread 1 sequence 96534 dbid 588725663 branch 916962073
Sat Jun 24 20:06:35 2017
Archived Log entry 100576 added for thread 1 sequence 96534 ID 0x247a73b7 dest 1:
Sat Jun 24 20:06:35 2017
Resetting standby activation ID 612004791 (0x247a73b7)
Sat Jun 24 20:06:35 2017
Media Recovery End-Of-Redo indicator encountered
Sat Jun 24 20:06:35 2017
Media Recovery Continuing
Media Recovery Waiting for thread 1 sequence 96535
Sun Jun 25 09:06:36 2017
SWITCHOVER: received request 'ALTER DTABASE COMMIT TO SWITCHOVER  TO PRIMARY' from primary database.
Sun Jun 25 09:06:36 2017
ALTER DATABASE SWITCHOVER TO PRIMARY (xifenfei)
Maximum wait for role transition is 15 minutes.
Switchover: Media recovery is still active
Role Change: Canceling MRP - no more redo to apply
Sat Jun 24 20:06:36 2017
MRP0: Background Media Recovery cancelled with status 16037
Sat Jun 24 20:06:36 2017
Errors in file /u01/app/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_pr00_4590.trc:
ORA-16037: user requested cancel of managed recovery operation
Managed Standby Recovery not using Real Time Apply
Recovery interrupted!
Sat Jun 24 20:06:36 2017
Errors in file /u01/app/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_pr00_4590.trc:
ORA-16037: user requested cancel of managed recovery operation
Sat Jun 24 20:06:37 2017
MRP0: Background Media Recovery process shutdown (xifenfei)
Sun Jun 25 09:06:38 2017
Role Change: Canceled MRP
Killing 2 processes(PIDS:7328,4704)(all RFS)
      in order to disallow current and future RFS connections.Requested by OS process 7334
Stopping Emon pool
All dispatchers and shared servers shutdown
CLOSE: killing server sessions.
Active process 5428 user 'oracle' program 'oracle@kage7.hk0620.com (TNS V1-V3)'
Active process 5161 user 'oracle' program 'oracle@kage7.hk0620.com'
…………
Active process 5428 user 'oracle' program 'oracle@kage7.hk0620.com (TNS V1-V3)'
Active process 5161 user 'oracle' program 'oracle@kage7.hk0620.com'
Active process 5178 user 'oracle' program 'oracle@kage7.hk0620.com'
CLOSE: all sessions shutdown successfully.
Stopping Emon pool
Sat Jun 24 20:06:43 2017
SMON: disabling cache recovery
Sat Jun 24 20:06:44 2017
Buffer Cache Full DB Caching mode changing from FULL CACHING DISABLED to FULL CACHING ENABLED
Sun Jun 25 09:06:44 2017
Backup controlfile written to trace file /u01/app/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_rmi_7334.trc
SwitchOver after complete recovery through change 13163656450
Online logfile pre-clearing operation disabled by switchover
Online log /u01/app/oracle/oradata/xifenfei/redo01n.log: Thread 1 Group 1 was previously cleared
Online log /u01/app/oracle/fast_recovery_area/xifenfei/redo01n.log: Thread 1 Group 1 was previously cleared
Online log /u01/app/oracle/oradata/xifenfei/redo02n.log: Thread 1 Group 2 was previously cleared
Online log /u01/app/oracle/fast_recovery_area/xifenfei/redo02n.log: Thread 1 Group 2 was previously cleared
Online log /u01/app/oracle/oradata/xifenfei/redo03n.log: Thread 1 Group 3 was previously cleared
Online log /u01/app/oracle/fast_recovery_area/xifenfei/redo03n.log: Thread 1 Group 3 was previously cleared
Online log /u01/app/oracle/oradata/xifenfei/redo04n.log: Thread 1 Group 4 was previously cleared
Online log /u01/app/oracle/fast_recovery_area/xifenfei/redo04n.log: Thread 1 Group 4 was previously cleared
Online log /u01/app/oracle/oradata/xifenfei/redo05n.log: Thread 1 Group 5 was previously cleared
Online log /u01/app/oracle/fast_recovery_area/xifenfei/redo05n.log: Thread 1 Group 5 was previously cleared
Standby became primary SCN: 13163656448
Switchover: Complete - Database mounted as primary
SWITCHOVER: completed request from primary database.
Sat Jun 24 20:07:12 2017
ARC0: Becoming the 'no SRL' ARCH

Operations on the original standby (now the primary)

SQL> conn / as sysdba
Connected.
SQL>  select database_role,open_mode from v$database;
DATABASE_ROLE    OPEN_MODE
---------------- --------------------
PRIMARY          MOUNTED
SQL> alter database open;
Database altered.

Operations on the original primary (now the standby)

[oracle@localhost scripts]$ ss
SQL*Plus: Release 12.1.0.2.0 Production on Sun Jun 25 09:09:40 2017
Copyright (c) 1982, 2014, Oracle.  All rights reserved.
Connected to an idle instance.
SQL> startup
ORACLE instance started.
Total System Global Area 8.5899E+10 bytes
Fixed Size                  7654304 bytes
Variable Size            1.2616E+10 bytes
Database Buffers         7.3014E+10 bytes
Redo Buffers              260780032 bytes
Database mounted.
Database opened.
SQL> alter database recover managed standby database disconnect;
Database altered.
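
After the old primary has been restarted as a standby, one way to confirm that redo is being received and applied is to query the standby itself. A sketch using standard views (the output will differ per environment):

```sql
-- On the new standby: is MRP running, and which sequence is it working on?
select process, status, thread#, sequence#
  from v$managed_standby
 where process like 'MRP%' or process = 'RFS';

-- Transport and apply lag as computed on the standby.
select name, value, time_computed
  from v$dataguard_stats
 where name in ('transport lag', 'apply lag');
```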

A mounted database can also have LOCAL=NO processes

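Author: 惜分飞 © All rights reserved. [Reproduction in any form without the author's consent is prohibited; the author reserves the right to pursue legal liability.]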

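I noticed by chance that a database in mount state also had LOCAL=NO processes. Analysis showed that they were caused by the primary's lns or arch processes connecting to the standby.
LOCAL=NO processes found on the mounted database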

[oracle@localhost ~]$ sqlplus / as sysdba
SQL*Plus: Release 10.2.0.5.0 - Production on Fri Jul 29 11:59:57 2016
Copyright (c) 1982, 2010, Oracle.  All Rights Reserved.
Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
SQL> select database_role ,open_mode from v$database;
DATABASE_ROLE    OPEN_MODE
---------------- ----------
PHYSICAL STANDBY MOUNTED
SQL> exit
Disconnected from Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
[oracle@localhost ~]$ ps -ef|grep LOCAL
oracle   11394     1  0 Apr27 ?        08:08:41 oracleorcl (LOCAL=NO)
oracle   11398     1  0 Apr27 ?        15:36:29 oracleorcl (LOCAL=NO)
oracle   18854 18752  0 12:00 pts/2    00:00:00 grep LOCAL
[oracle@localhost ~]$ ps -ef|grep pmon
oracle   14374     1  0  2015 ?        00:10:54 ora_pmon_orcl
oracle   18893 18752  0 12:01 pts/2    00:00:00 grep pmon
SQL>  select sid,status,username from v$session where paddr in
   2  (select addr from v$process where spid in(11394,11398));
       SID STATUS   USERNAME
---------- -------- ------------------------------
       510 INACTIVE PUBLIC
       507 INACTIVE PUBLIC

Check the network connections of these processes on the standby

[oracle@localhost ~]$ netstat -natp|grep -E '11394|11398'
(Not all processes could be identified, non-owned process info
 will not be shown, you would have to be root to see it all.)
tcp        0      0 192.168.160.22:1521         192.168.160.23:42783        ESTABLISHED 11394/oracleorcl
tcp        0      0 192.168.160.22:1521         192.168.160.23:42785        ESTABLISHED 11398/oracleorcl

Check on the primary to confirm that 192.168.160.22 is the standby

SQL> show parameter log_archive_dest_2;
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
log_archive_dest_2                   string      service=orcl lgwr async valid_
                                                 for=(online_logfiles,primary_r
                                                 ole) db_unique_name=orcl
SQL> !tnsping orcl
TNS Ping Utility for Linux: Version 10.2.0.5.0 - Production on 29-JUL-2016 12:20:01
Copyright (c) 1997,  2010, Oracle.  All rights reserved.
Used parameter files:
Used TNSNAMES adapter to resolve the alias
Attempting to contact (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST =192.168.160.22)(PORT = 1521))
 (CONNECT_DATA = (SERVER = DEDICATED) (SERVICE_NAME = orcl)))
OK (0 msec)

Check the connections on the primary

[oracle@localhost ~]$ netstat -natp|grep "192.168.160.22"
(Not all processes could be identified, non-owned process info
 will not be shown, you would have to be root to see it all.)
tcp        0      0 192.168.160.23:42785        192.168.160.22:1521         ESTABLISHED 12394/ora_arc1_orcl
tcp        0      0 192.168.160.23:42783        192.168.160.22:1521         ESTABLISHED 12400/ora_lns1_orcl

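The analysis confirms that a standby in mount state can have LOCAL=NO processes: they are the server processes that correspond to the primary's arch and lns processes.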
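
The mapping from OS process to session can be made more explicit by joining v$process and v$session and showing the client machine and program. A sketch using the two PIDs from the ps output above:

```sql
-- On the mounted standby: sessions behind the LOCAL=NO server processes.
-- spid is a character column, so the PIDs are quoted.
select p.spid, s.sid, s.status, s.username, s.machine, s.program
  from v$process p
  join v$session s on s.paddr = p.addr
 where p.spid in ('11394', '11398');
```

The machine and program columns would be expected to point back to the primary host, which should agree with the netstat output.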

Resolving Data Guard ORA-16157

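Author: 惜分飞 © All rights reserved. [Reproduction in any form without the author's consent is prohibited; the author reserves the right to pursue legal liability.]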

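Because of a hardware failure, a standby had to be activated. By mistake I logged in to the wrong server and ran the following operations on a standby that should not have been activated.
The STANDBY DATABASE FINISH operation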

[oracle@app73 ~]$ ss
SQL*Plus: Release 11.2.0.4.0 Production on Sun Jun 26 23:33:19 2016
Copyright (c) 1982, 2013, Oracle.  All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning and Data Mining options
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
Database altered.
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE FINISH;
Database altered.

Corresponding alert log output

Sun Jun 26 23:33:22 2016
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL
Sun Jun 26 23:33:22 2016
MRP0: Background Media Recovery cancelled with status 16037
Errors in file /Data/oracle/diag/rdbms/commentdbdg/commentdb/trace/commentdb_pr00_11871.trc:
ORA-16037: user requested cancel of managed recovery operation
Managed Standby Recovery not using Real Time Apply
Recovery interrupted!
Recovered data files to a consistent state at change 47440782676
Sun Jun 26 23:33:23 2016
MRP0: Background Media Recovery process shutdown (commentdb)
Managed Standby Recovery Canceled (commentdb)
Completed: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL
Sun Jun 26 23:33:44 2016
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE FINISH
Attempt to do a Terminal Recovery (commentdb)
Media Recovery Start: Managed Standby Recovery (commentdb)
 started logmerger process
Sun Jun 26 23:33:44 2016
Managed Standby Recovery not using Real Time Apply
Parallel Media Recovery started with 4 slaves
Media Recovery Waiting for thread 1 sequence 21196 (in transit)
Killing 4 processes with pids 11881,11867,11869,14314 (all RFS, wait for I/O)
in order to disallow current and future RFS connections. Requested by OS process 7372
Begin: Standby Redo Logfile archival
End: Standby Redo Logfile archival
Terminal Recovery timestamp is '06/26/2016 23:33:48'
Terminal Recovery: applying standby redo logs.
Terminal Recovery: thread 1 seq# 21196 redo required
Terminal Recovery:
Recovery of Online Redo Log: Thread 1 Group 10 Seq 21196 Reading mem 0
  Mem# 0: /Data/oracle/oradata/commentdb/std_redo10.log
Identified End-Of-Redo (failover) for thread 1 sequence 21196 at SCN 0xffff.ffffffff
Incomplete Recovery applied until change 47440782709 time 06/26/2016 23:29:51
Media Recovery Complete (commentdb)
Terminal Recovery: Enabled archive destination LOG_ARCHIVE_DEST_2
Terminal Recovery: successful completion
Forcing ARSCN to IRSCN for TR 11:196142453
Attempt to set limbo arscn 11:196142453 irscn 11:196142453
Resetting standby activation ID 3880004483 (0xe7442b83)
Sun Jun 26 23:33:48 2016
ARCH: Archival stopped, error occurred. Will continue retrying
ORACLE Instance commentdb - Archival Error
ORA-16014: log 10 sequence# 21196 not archived, no available destinations
ORA-00312: online log 10 thread 1: '/Data/oracle/oradata/commentdb/std_redo10.log'
Completed: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE FINISH
Sun Jun 26 23:39:24 2016
RFS[4]: Assigned to RFS process 7392
RFS[4]: No connections allowed during/after terminal recovery.

Clearly the database stopped MRP and successfully executed ALTER DATABASE RECOVER MANAGED STANDBY DATABASE FINISH.

Try restarting MRP to apply logs

SQL> shutdown immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup mount;
ORACLE instance started.
Total System Global Area 2.1379E+10 bytes
Fixed Size                  2262656 bytes
Variable Size            2684356992 bytes
Database Buffers         1.8656E+10 bytes
Redo Buffers               36073472 bytes
Database mounted.
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT  LOGFILE DISCONNECT FROM SESSION;
Database altered.

Alert log output

Sun Jun 26 23:40:39 2016
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT  LOGFILE DISCONNECT FROM SESSION
Attempt to start background Managed Standby Recovery process (commentdb)
Sun Jun 26 23:40:39 2016
MRP0 started with pid=24, OS id=7458
MRP0: Background Managed Standby Recovery process started (commentdb)
 started logmerger process
Sun Jun 26 23:40:44 2016
Managed Standby Recovery starting Real Time Apply
MRP0: Background Media Recovery terminated with error 16157
Errors in file /Data/oracle/diag/rdbms/commentdbdg/commentdb/trace/commentdb_pr00_7460.trc:
ORA-16157: media recovery not allowed following successful FINISH recovery
Managed Standby Recovery not using Real Time Apply
Completed: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT  LOGFILE DISCONNECT FROM SESSION
Recovery Slave PR00 previously exited with exception 16157
MRP0: Background Media Recovery process shutdown (commentdb)
[oracle@app73 trace]$ oerr ora 16157
16157, 00000, "media recovery not allowed following successful FINISH recovery"
// *Cause:  A RECOVER MANAGED STANDBY DATABASE FINISH command has previously
//          completed successfully.  Another media recovery is not allowed.
// *Action: Issue one of these operations following a FINISH recocvery:
//          ALTER DATABASE OPEN READ ONLY or
//          ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY.

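Although the foreground session reported that MRP started successfully, the alert log shows that the MRP process failed to start. The reason is "media recovery not allowed following successful FINISH recovery": because I had already executed ALTER DATABASE RECOVER MANAGED STANDBY DATABASE FINISH, MRP can no longer be started, and therefore redo can no longer be applied.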

Check MOS for a possible solution
[Screenshot: ora-16157]


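According to the official position, the only option is to rebuild the Data Guard standby. I was not willing to accept that. My feeling was that although ALTER DATABASE RECOVER MANAGED STANDBY DATABASE FINISH had completed, I had not run the failover command, so the data files should be unchanged and only the control file had changed.

Resolving the ORA-16157 problem
Generate a new standby control file on the primary, transfer it to the standby, and try starting MRP again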

--On the primary
SQL> ALTER DATABASE CREATE standby CONTROLFILE AS '/tmp/controlfs01.ctl';
Database altered.
--scp the file to the standby, then on the standby
SQL> shutdown immediate;
ORA-01109: database not open
Database dismounted.
ORACLE instance shut down.
SQL> startup nomount;
ORACLE instance started.
Total System Global Area 2.1379E+10 bytes
Fixed Size                  2262656 bytes
Variable Size            2684356992 bytes
Database Buffers         1.8656E+10 bytes
Redo Buffers               36073472 bytes
[oracle@app73 ~]$ cp /tmp/controlfs01.ctl /Data/oracle/oradata/commentdb/control01.ctl
[oracle@app73 ~]$ cp /tmp/controlfs01.ctl /Data/oracle/fast_recovery_area/commentdb/control02.ctl
SQL> alter database mount;
Database altered.
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT  LOGFILE DISCONNECT FROM SESSION;
Database altered.

Observe the alert log

ARC2: Becoming the active heartbeat ARCH
Completed: alter database mount
ARC3: Archival started
ARC0: STARTING ARCH PROCESSES COMPLETE
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT  LOGFILE DISCONNECT FROM SESSION
Attempt to start background Managed Standby Recovery process (commentdb)
Sun Jun 26 23:46:50 2016
MRP0 started with pid=24, OS id=7547
MRP0: Background Managed Standby Recovery process started (commentdb)
Sun Jun 26 23:46:54 2016
RFS[1]: Assigned to RFS process 7553
RFS[1]: Selected log 10 for thread 1 sequence 21197 dbid -414945661 branch 893285763
Sun Jun 26 23:46:54 2016
Primary database is in MAXIMUM PERFORMANCE mode
RFS[2]: Assigned to RFS process 7555
RFS[2]: Selected log 11 for thread 1 sequence 21198 dbid -414945661 branch 893285763
Archived Log entry 1 added for thread 1 sequence 21197 ID 0xe7442b83 dest 1:
 started logmerger process
Sun Jun 26 23:46:55 2016
Managed Standby Recovery starting Real Time Apply
Parallel Media Recovery started with 4 slaves
Waiting for all non-current ORLs to be archived...
All non-current ORLs have been archived.
Sun Jun 26 23:46:56 2016
Completed: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT  LOGFILE DISCONNECT FROM SESSION
Sun Jun 26 23:47:06 2016
Media Recovery Waiting for thread 1 sequence 21196
Fetching gap sequence in thread 1, gap sequence 21196-21196
Sun Jun 26 23:47:06 2016
RFS[3]: Assigned to RFS process 7567
RFS[3]: Opened log for thread 1 sequence 21196 dbid -414945661 branch 893285763
Archived Log entry 2 added for thread 1 sequence 21196 rlc 893285763 ID 0xe7442b83 dest 2:
Media Recovery Log /Data/oracle/fast_recovery_area/COMMENTDBDG/archivelog/2016_06_26/o1_mf_1_21196_cpzy7tjc_.arc
Media Recovery Log /Data/oracle/fast_recovery_area/COMMENTDBDG/archivelog/2016_06_26/o1_mf_1_21197_cpzy7gtl_.arc
Media Recovery Waiting for thread 1 sequence 21198 (in transit)
Recovery of Online Redo Log: Thread 1 Group 11 Seq 21198 Reading mem 0
  Mem# 0: /Data/oracle/oradata/commentdb/std_redo11.log

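This demonstrates that even after ALTER DATABASE RECOVER MANAGED STANDBY DATABASE FINISH has been executed (ORA-16157), re-creating the standby control file brings the standby back to a normal Data Guard state without rebuilding it.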
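
Besides the alert log, the state after swapping in the new control file can also be checked from SQL. A sketch; the destination number 2 on the primary is an assumption, since this post does not show the primary's log_archive_dest settings:

```sql
-- On the standby: any remaining gap (no rows means no gap is currently detected).
select thread#, low_sequence#, high_sequence#
  from v$archive_gap;

-- On the primary: the destination pointing at the standby should be VALID
-- with an empty ERROR column (dest_id = 2 is assumed here).
select dest_id, status, error
  from v$archive_dest
 where dest_id = 2;
```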

Oracle 12C Active Data Guard Far Sync Configuration

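Author: 惜分飞 © All rights reserved. [Reproduction in any form without the author's consent is prohibited; the author reserves the right to pursue legal liability.]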

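Active Data Guard Far Sync is a new feature in Oracle 12c (also called Far Sync Standby). A Far Sync instance is configured at a location relatively close to the primary database. The primary ships redo synchronously to the Far Sync instance, and the Far Sync instance then ships the redo asynchronously to the terminal standby database. This provides zero data loss while reducing the load on the primary. A Far Sync instance has only a password file, an init parameter file, and a control file; it has no data files.

If redo transport uses Maximum Availability mode, the Far Sync instance can be placed close to the production site (the primary database). The primary ships redo synchronously to the Far Sync instance, which guarantees zero data loss, and because the two are close together, network latency is low and the impact on primary performance is small. The Far Sync instance then sends the redo asynchronously to the terminal standby database.

If redo transport uses Maximum Performance mode, the Far Sync instance can likewise be placed close to the production site. The primary ships redo asynchronously to the Far Sync instance, and the Far Sync instance takes over shipping redo to multiple terminal standby databases. This offloads from the primary the work of shipping redo to several terminal standbys.

A Far Sync configuration is transparent to Data Guard role transitions: the switchover/failover commands are the same as before 12c.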
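
Because a Far Sync instance has no data files, its control file is created on the primary with a dedicated statement and then copied to the Far Sync host. A sketch; the output path is a placeholder chosen for illustration and is not taken from this post:

```sql
-- On the primary: create a control file for the Far Sync instance.
-- '/tmp/orcl12cfs.ctl' is a hypothetical path; copy the file to the
-- Far Sync host and reference it in that instance's control_files parameter.
alter database create far sync instance controlfile as '/tmp/orcl12cfs.ctl';
```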
1.主库配置fra

SQL> alter system set DB_RECOVERY_FILE_DEST_SIZE=200G;
System altered.
SQL> alter system set db_recovery_file_dest='/u01/app/oracle/fast_recovery_area';
System altered.

2. Enable archivelog mode and force logging

SQL> shutdown immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup mount
ORACLE instance started.
Total System Global Area 2516582400 bytes
Fixed Size                  4585912 bytes
Variable Size             671090248 bytes
Database Buffers         1828716544 bytes
Redo Buffers               12189696 bytes
Database mounted.
SQL> alter database archivelog;
Database altered.
SQL> alter database force logging;
Database altered.
SQL> alter database open;
Database altered.

3. Host layout

192.168.137.121  ora1221   ---> primary (ORCL12C)
192.168.137.122  ora1222   ---> Far Sync instance (ORCL12CFS)
192.168.137.123  ora1223   ---> standby (ORCL12CDG)

4. TNS configuration

ORCL12C =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = ora1221)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = orcl12c)
    )
  )
ORCL12CDG =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = ora1223)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = orcl12c)
    )
  )
ORCL12CFS =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = ora1222)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = orcl12c)
    )
  )

5. Parameter file configuration

--primary
db_unique_name='orcl12c'
service_names='orcl12c'
log_archive_dest_1='LOCATION=USE_DB_RECOVERY_FILE_DEST valid_for=(all_logfiles,all_roles) db_unique_name=orcl12c'
log_archive_dest_2='service=orcl12cfs lgwr  sync  AFFIRM MAX_FAILURE=1 ALTERNATE=LOG_ARCHIVE_DEST_3 valid_for=(online_logfiles,primary_role) db_unique_name=orcl12cfs'
LOG_ARCHIVE_DEST_3='SERVICE=orcl12cdg  lgwr ASYNC ALTERNATE=LOG_ARCHIVE_DEST_2 VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=orcl12cdg'
LOG_ARCHIVE_DEST_STATE_3=ALTERNATE
log_archive_config='dg_config=(orcl12c,orcl12cdg,orcl12cfs)'
standby_file_management=auto
db_file_name_convert='/u01/app/oracle/oradata/orcl12c/','/u01/app/oracle/oradata/orcl12c/'
log_file_name_convert='/u01/app/oracle/oradata/orcl12c/','/u01/app/oracle/oradata/orcl12c/'
fal_server='orcl12cdg','orcl12cfs'
--Far Sync instance
db_unique_name='orcl12cfs'
service_names='orcl12c'
log_archive_dest_1='LOCATION=USE_DB_RECOVERY_FILE_DEST valid_for=(all_logfiles,all_roles) db_unique_name=orcl12cfs'
log_archive_dest_2='service=orcl12cdg lgwr  async valid_for=(STANDBY_LOGFILES,STANDBY_ROLE) db_unique_name=orcl12cdg'
log_archive_config='dg_config=(orcl12c,orcl12cdg,orcl12cfs)'
standby_file_management=manual
fal_server='orcl12c'
--standby
db_unique_name='orcl12cdg'
service_names='orcl12c'
log_archive_dest_1='LOCATION=USE_DB_RECOVERY_FILE_DEST valid_for=(all_logfiles,all_roles) db_unique_name=orcl12cdg'
log_archive_dest_2='service=orcl12c lgwr  async   valid_for=(online_logfiles,primary_role) db_unique_name=orcl12c'
log_archive_config='dg_config=(orcl12c,orcl12cdg,orcl12cfs)'
standby_file_management=auto
db_file_name_convert='/u01/app/oracle/oradata/orcl12c/','/u01/app/oracle/oradata/orcl12c/'
log_file_name_convert='/u01/app/oracle/oradata/orcl12c/','/u01/app/oracle/oradata/orcl12c/'
fal_server='orcl12c','orcl12cfs'
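
With three sets of parameters it is easy to reference a DB_UNIQUE_NAME in a log_archive_dest_n that is missing from dg_config. The following is a minimal Python sketch of such a consistency check, not part of the original procedure. The function names are hypothetical, and the parsing assumes the simple `name=value` attribute layout used in the parameter values above.

```python
import re
from typing import List, Optional, Set


def dg_config_names(log_archive_config: str) -> Set[str]:
    """Return the lower-cased names listed in dg_config=(a,b,c)."""
    match = re.search(r"dg_config\s*=\s*\(([^)]*)\)", log_archive_config, re.IGNORECASE)
    if not match:
        return set()
    return {name.strip().lower() for name in match.group(1).split(",") if name.strip()}


def dest_unique_name(dest: str) -> Optional[str]:
    """Return the db_unique_name attribute of one log_archive_dest_n value, if present."""
    match = re.search(r"db_unique_name\s*=\s*(\w+)", dest, re.IGNORECASE)
    return match.group(1).lower() if match else None


def missing_from_dg_config(log_archive_config: str, dests: List[str]) -> List[str]:
    """List destination names that dg_config does not mention (hypothetical helper)."""
    known = dg_config_names(log_archive_config)
    names = {dest_unique_name(d) for d in dests}
    return sorted(n for n in names if n and n not in known)


# Primary-side values copied from the parameter list above.
primary_config = "dg_config=(orcl12c,orcl12cdg,orcl12cfs)"
primary_dests = [
    "LOCATION=USE_DB_RECOVERY_FILE_DEST valid_for=(all_logfiles,all_roles) db_unique_name=orcl12c",
    "service=orcl12cfs lgwr  sync  AFFIRM MAX_FAILURE=1 ALTERNATE=LOG_ARCHIVE_DEST_3 "
    "valid_for=(online_logfiles,primary_role) db_unique_name=orcl12cfs",
    "SERVICE=orcl12cdg  lgwr ASYNC ALTERNATE=LOG_ARCHIVE_DEST_2 "
    "VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=orcl12cdg",
]
print(missing_from_dg_config(primary_config, primary_dests))
```

An empty list means every destination is covered by dg_config; any name returned would need to be added before redo transport to that destination can work.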

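6. Password file
Copy the primary's password file to $ORACLE_HOME/dbs on the Far Sync instance and on the standby (on Windows, to %ORACLE_HOME%/database).

7. Create the Far Sync instance
Create the FRA and adump directories to match the parameter file.

--primary
ALTER DATABASE CREATE FAR SYNC INSTANCE CONTROLFILE AS '/tmp/controlfs01.ctl';
--copy it to the control file location configured on the Far Sync instance
--Far Sync instance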
startup pfile='/tmp/pfile' mount
ALTER DATABASE ADD STANDBY LOGFILE thread 1  GROUP 10 '/u01/app/oracle/oradata/orcl12c/std_redo10.log' size 50m reuse;
ALTER DATABASE ADD STANDBY LOGFILE thread 1  GROUP 11 '/u01/app/oracle/oradata/orcl12c/std_redo11.log' size 50m reuse;
ALTER DATABASE ADD STANDBY LOGFILE thread 1  GROUP 12 '/u01/app/oracle/oradata/orcl12c/std_redo12.log' size 50m reuse;
ALTER DATABASE ADD STANDBY LOGFILE thread 1  GROUP 13 '/u01/app/oracle/oradata/orcl12c/std_redo13.log' size 50m reuse;
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT  LOGFILE DISCONNECT FROM SESSION;

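8. Create the standby
Create the FRA and adump directories to match the parameter file.

--back up the primary (transfer the backup pieces to the standby)
backup database format '/tmp/ora12c_%U';
--create the standby controlfile on the primary (copy it to the standby, to the path named in the standby's parameter file)
alter database create standby controlfile as '/tmp/controlst.ctl';
--start the standby in mount state
startup pfile='/tmp/pfile' mount;
--catalog the backup pieces on the standby (use the directory the pieces were copied to)
catalog start with '/tmp/xifenfei/';
--restore and recover the datafiles on the standby
restore database;
recover database;
--add standby redo logs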
ALTER DATABASE ADD STANDBY LOGFILE thread 1  GROUP 10 '/u01/app/oracle/oradata/orcl12c/std_redo10.log' size 50m reuse;
ALTER DATABASE ADD STANDBY LOGFILE thread 1  GROUP 11 '/u01/app/oracle/oradata/orcl12c/std_redo11.log' size 50m reuse;
ALTER DATABASE ADD STANDBY LOGFILE thread 1  GROUP 12 '/u01/app/oracle/oradata/orcl12c/std_redo12.log' size 50m reuse;
ALTER DATABASE ADD STANDBY LOGFILE thread 1  GROUP 13 '/u01/app/oracle/oradata/orcl12c/std_redo13.log' size 50m reuse;
--start the MRP process on the standby
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT  LOGFILE DISCONNECT FROM SESSION;
--add standby redo logs on the primary
ALTER DATABASE ADD STANDBY LOGFILE thread 1  GROUP 10 '/u01/app/oracle/oradata/orcl12c/std_redo10.log' size 50m reuse;
ALTER DATABASE ADD STANDBY LOGFILE thread 1  GROUP 11 '/u01/app/oracle/oradata/orcl12c/std_redo11.log' size 50m reuse;
ALTER DATABASE ADD STANDBY LOGFILE thread 1  GROUP 12 '/u01/app/oracle/oradata/orcl12c/std_redo12.log' size 50m reuse;
ALTER DATABASE ADD STANDBY LOGFILE thread 1  GROUP 13 '/u01/app/oracle/oradata/orcl12c/std_redo13.log' size 50m reuse;

9. Resulting configuration

SQL> select * from V$DATAGUARD_CONFIG;
DB_UNIQUE_NAME                 PARENT_DBUN                    DEST_ROLE         CURRENT_SCN     CON_ID
------------------------------ ------------------------------ ----------------- ----------- ----------
orcl12c                        NONE                           PRIMARY DATABASE      1950551          0
orcl12cfs                      orcl12c                        FAR SYNC INSTANCE     1950390          0
orcl12cdg                      orcl12cfs                      PHYSICAL STANDBY      1950390          0
SQL> select group#, status, thread#, sequence#, first_change#, next_change# from v$standby_log;
    GROUP# STATUS        THREAD#  SEQUENCE# FIRST_CHANGE# NEXT_CHANGE#
---------- ---------- ---------- ---------- ------------- ------------
        10 ACTIVE              1         27       1863140
        11 UNASSIGNED          1          0
        12 UNASSIGNED          1          0
        13 UNASSIGNED          1          0
SQL> select PROTECTION_MODE,PROTECTION_LEVEL from v$database;
PROTECTION_MODE      PROTECTION_LEVEL
-------------------- --------------------
MAXIMUM PERFORMANCE  MAXIMUM PERFORMANCE
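
The PARENT_DBUN column in the V$DATAGUARD_CONFIG output above records which member feeds redo to which. As a rough illustration, the sketch below rebuilds the redo path from those rows. The function name is hypothetical, the rows are typed in by hand from the query output, and a parent of NONE is assumed to mark the primary.

```python
from typing import Dict, List, Tuple


def redo_paths(rows: List[Tuple[str, str]]) -> List[List[str]]:
    """Return every primary-to-terminal path implied by (db_unique_name, parent_dbun) rows."""
    children: Dict[str, List[str]] = {}
    roots: List[str] = []
    for name, parent in rows:
        if parent.upper() == "NONE":
            roots.append(name)          # no parent: this member is the redo source
        else:
            children.setdefault(parent, []).append(name)

    paths: List[List[str]] = []

    def walk(node: str, prefix: List[str]) -> None:
        path = prefix + [node]
        kids = children.get(node, [])
        if not kids:                    # nothing downstream: a terminal destination
            paths.append(path)
        for kid in kids:
            walk(kid, path)

    for root in roots:
        walk(root, [])
    return paths


# (DB_UNIQUE_NAME, PARENT_DBUN) pairs copied from the V$DATAGUARD_CONFIG output above.
rows = [
    ("orcl12c", "NONE"),
    ("orcl12cfs", "orcl12c"),
    ("orcl12cdg", "orcl12cfs"),
]
for path in redo_paths(rows):
    print(" -> ".join(path))
```

For this configuration the single path runs from the primary through the Far Sync instance to the physical standby, which is the intended topology.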

10. Change the protection mode
Change from MAXIMUM PERFORMANCE to MAXIMUM AVAILABILITY.

--primary
SQL> startup mount;
ORACLE instance started.
Total System Global Area 2516582400 bytes
Fixed Size                  4585912 bytes
Variable Size             671090248 bytes
Database Buffers         1828716544 bytes
Redo Buffers               12189696 bytes
Database mounted.
SQL> alter database set standby database to maximize availability;
Database altered.
SQL>  select PROTECTION_MODE,PROTECTION_LEVEL from v$database;
PROTECTION_MODE      PROTECTION_LEVEL
-------------------- --------------------
MAXIMUM AVAILABILITY UNPROTECTED
SQL> alter database open;
Database altered.
SQL>  select PROTECTION_MODE,PROTECTION_LEVEL from v$database;
PROTECTION_MODE      PROTECTION_LEVEL
-------------------- --------------------
MAXIMUM AVAILABILITY MAXIMUM AVAILABILITY
--Far Sync instance
SQL>  select PROTECTION_MODE,PROTECTION_LEVEL from v$database;
PROTECTION_MODE      PROTECTION_LEVEL
-------------------- --------------------
MAXIMUM AVAILABILITY MAXIMUM AVAILABILITY
--standby
SQL> select PROTECTION_MODE,PROTECTION_LEVEL from v$database;
PROTECTION_MODE      PROTECTION_LEVEL
-------------------- --------------------
MAXIMUM AVAILABILITY RESYNCHRONIZATION

A case of Data Guard disaster recovery failing because of logical block corruption on the primary

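Author: 惜分飞 © All rights reserved [No reproduction in any form without the author's consent; the author reserves the right to pursue legal action.]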

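A friend had a physical Data Guard configured, but an Oracle bug or some other cause produced a large number of logically corrupt blocks. As a result neither database worked (the primary crashed shortly after every startup, and the standby did the same after failover), and he asked for technical support.
After certain operations on the primary, the primary crashed outright. After a restart, the database came up and then crashed again shortly afterwards: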

Wed Jul 08 16:32:22 2015
Thread 1 advanced to log sequence 401531 (LGWR switch)
  Current log# 6 seq# 401531 mem# 0: /opt/oracle/database/fast_recovery_area/xifenfei/onlinelog/o1_mf_6_b9p7qhrm_.log
Archived Log entry 605194 added for thread 1 sequence 401530 ID 0xfbe74bc9 dest 1:
ARC3: Standby redo logfile selected for thread 1 sequence 401530 for destination LOG_ARCHIVE_DEST_2
******************************************************************
LGWR: Setting 'active' archival for destination LOG_ARCHIVE_DEST_2
******************************************************************
Wed Jul 08 16:33:02 2015
Errors in file /opt/oracle/database/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_p032_413611.trc  (incident=201910):
ORA-00600: internal error code, arguments: [17114], [0x7FB03BB6DBD0], [], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [17182], [0x7FB03BB6DBF8], [], [], [], [], [], [], [], [], [], []
Incident details in: /opt/oracle/database/diag/rdbms/xifenfei/xifenfei/incident/incdir_201910/xifenfei_p032_413611_i201910.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file /opt/oracle/database/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_p032_413611.trc  (incident=201911):
ORA-00600: internal error code, arguments: [kghfrempty:ds], [0x7FB03BB6DBE8], [], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [17114], [0x7FB03BB6DBD0], [], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [17182], [0x7FB03BB6DBF8], [], [], [], [], [], [], [], [], [], []
Incident details in: /opt/oracle/database/diag/rdbms/xifenfei/xifenfei/incident/incdir_201911/xifenfei_p032_413611_i201911.trc
Wed Jul 08 16:34:23 2015
SMON: slave died unexpectedly, downgrading to serial recovery
Errors in file /opt/oracle/database/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_smon_413207.trc  (incident=201547):
ORA-00600: internal error code, arguments: [17182], [0x7F24D680D7A0], [], [], [], [], [], [], [], [], [], []
Incident details in: /opt/oracle/database/diag/rdbms/xifenfei/xifenfei/incident/incdir_201547/xifenfei_smon_413207_i201547.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Block recovery from logseq 401530, block 119 to scn 73226510040
Recovery of Online Redo Log: Thread 1 Group 5 Seq 401530 Reading mem 0
  Mem# 0: /opt/oracle/database/fast_recovery_area/xifenfei/onlinelog/o1_mf_5_b9p7qhl0_.log
Recovery of Online Redo Log: Thread 1 Group 6 Seq 401531 Reading mem 0
  Mem# 0: /opt/oracle/database/fast_recovery_area/xifenfei/onlinelog/o1_mf_6_b9p7qhrm_.log
Block recovery completed at rba 401531.1882.16, scn 17.212066009
ORACLE Instance xifenfei (pid = 16) - Error 600 encountered while recovering transaction (13, 14) on object 135520.
Errors in file /opt/oracle/database/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_smon_413207.trc:
ORA-00600: internal error code, arguments: [17182], [0x7F24D680D7A0], [], [], [], [], [], [], [], [], [], []
Wed Jul 08 16:34:24 2015
Dumping diagnostic data in directory=[cdmp_20150708163424], requested by (instance=1, osid=413207 (SMON)), summary=[incident=201547].
Errors in file /opt/oracle/database/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_smon_413207.trc  (incident=201548):
ORA-00600: internal error code, arguments: [KSMFPG2], [0x7F24D680D000], [], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [17182], [0x7F24D680D7A0], [], [], [], [], [], [], [], [], [], []
Incident details in: /opt/oracle/database/diag/rdbms/xifenfei/xifenfei/incident/incdir_201548/xifenfei_smon_413207_i201548.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Fatal internal error happened while SMON was doing active transaction recovery.
Errors in file /opt/oracle/database/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_smon_413207.trc:
ORA-00600: internal error code, arguments: [KSMFPG2], [0x7F24D680D000], [], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [17182], [0x7F24D680D7A0], [], [], [], [], [], [], [], [], [], []
SMON (ospid: 413207): terminating the instance due to error 474
System state dump requested by (instance=1, osid=413207 (SMON)), summary=[abnormal instance termination].
System State dumped to trace file /opt/oracle/database/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_diag_413167_20150708163426.trc
Dumping diagnostic data in directory=[cdmp_20150708163426], requested by (instance=1, osid=413207 (SMON)), summary=[abnormal instance termination].
Instance terminated by SMON, pid = 413207

Because the primary could not be opened normally, the standby was activated directly by failover:

Wed Jul 08 17:56:41 2015
alter database recover managed standby database finish
Terminal Recovery: request posted (xffdb)
Wed Jul 08 17:56:45 2015
Begin: Standby Redo Logfile archival
End: Standby Redo Logfile archival
Terminal Recovery timestamp is '07/08/2015 17:56:45'
Terminal Recovery: applying standby redo logs.
Terminal Recovery: thread 1 seq# 401533 redo required
Terminal Recovery:
Recovery of Online Redo Log: Thread 1 Group 7 Seq 401533 Reading mem 0
  Mem# 0: /opt/oracle/database/fast_recovery_area/DBHLDB/onlinelog/o1_mf_7_bb7x31lk_.log
Identified End-Of-Redo (failover) for thread 1 sequence 401533 at SCN 0xffff.ffffffff
Incomplete Recovery applied until change 73226530800 time 07/08/2015 16:56:40
Terminal Recovery: successful completion
Wed Jul 08 17:56:45 2015
ARCH: Archival stopped, error occurred. Will continue retrying
Forcing ARSCN to IRSCN for TR 17:212086768
ORACLE Instance xffdb - Archival Error
Attempt to set limbo arscn 17:212086768 irscn 17:212086768 ORA-16014: log 7 sequence# 401533 not archived, no available destinations
ORA-00312: online log 7 thread 1: '/opt/oracle/database/fast_recovery_area/DBHLDB/onlinelog/o1_mf_7_bb7x31lk_.log'
Resetting standby activation ID 4226239433 (0xfbe74bc9)
Wed Jul 08 17:56:45 2015
MRP0: Media Recovery Complete (xffdb)
MRP0: Background Media Recovery process shutdown (xffdb)
Terminal Recovery: completion detected (xffdb)
Completed: alter database recover managed standby database finish
Wed Jul 08 17:56:58 2015
alter database commit to switchover to primary
ALTER DATABASE SWITCHOVER TO PRIMARY (xffdb)
Maximum wait for role transition is 15 minutes.
Backup controlfile written to trace file /opt/oracle/database/diag/rdbms/xffdb/xffdb/trace/xffdb_ora_485094.trc
Standby terminal recovery start SCN: 73226530482
RESETLOGS after incomplete recovery UNTIL CHANGE 73226530800
Online logfile pre-clearing operation disabled by switchover
Online log /opt/oracle/database/fast_recovery_area/DBHLDB/onlinelog/o1_mf_1_bb7x30gw_.log: Thread 1 Group 1 was previously cleared
Online log /opt/oracle/database/fast_recovery_area/DBHLDB/onlinelog/o1_mf_2_bb7x30js_.log: Thread 1 Group 2 was previously cleared
Online log /opt/oracle/database/fast_recovery_area/DBHLDB/onlinelog/o1_mf_3_bb7x310q_.log: Thread 1 Group 3 was previously cleared
Online log /opt/oracle/database/fast_recovery_area/DBHLDB/onlinelog/o1_mf_4_bb7x312r_.log: Thread 1 Group 4 was previously cleared
Online log /opt/oracle/database/fast_recovery_area/DBHLDB/onlinelog/o1_mf_5_bb7x317f_.log: Thread 1 Group 5 was previously cleared
Online log /opt/oracle/database/fast_recovery_area/DBHLDB/onlinelog/o1_mf_6_bb7x31cz_.log: Thread 1 Group 6 was previously cleared
Standby became primary SCN: 73226530481
Wed Jul 08 17:56:58 2015
Setting recovery target incarnation to 3
Switchover: Complete - Database mounted as primary
Completed: alter database commit to switchover to primary
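
The alert log above prints the same SCN in two notations: "arscn 17:212086768" and "UNTIL CHANGE 73226530800". The sketch below shows the conversion between them, assuming the wrap.base layout with a 32-bit base, which is consistent with the values in this log. The function names are hypothetical.

```python
import re
from typing import Tuple

BASE_BITS = 32                      # assumed width of the SCN base part
BASE_MASK = (1 << BASE_BITS) - 1


def scn_to_number(wrap: int, base: int) -> int:
    """Combine an SCN wrap and base into the single decimal SCN."""
    return (wrap << BASE_BITS) | base


def parse_scn(text: str) -> int:
    """Parse 'wrap.base' or 'wrap:base' as printed in alert logs and dbv output."""
    wrap, base = re.split(r"[.:]", text.strip())
    return scn_to_number(int(wrap), int(base))


def number_to_scn(scn: int) -> Tuple[int, int]:
    """Split a decimal SCN back into (wrap, base)."""
    return scn >> BASE_BITS, scn & BASE_MASK


# Values taken from the failover alert log above.
print(parse_scn("17:212086768"))
print(number_to_scn(73226530800))
```

Converting both forms to one notation makes it easier to line up the terminal recovery SCN with the RESETLOGS SCN when reading the log.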

The activation succeeded, but after the newly activated database was restarted it showed the same behaviour as the primary:

Wed Jul 08 17:57:25 2015
ALTER DATABASE   MOUNT
Successful mount of redo thread 1, with mount id 4243462021
Database mounted in Exclusive Mode
Lost write protection disabled
Completed: ALTER DATABASE   MOUNT
Wed Jul 08 17:57:30 2015
ALTER DATABASE OPEN
Assigning activation ID 4243462021 (0xfcee1785)
LGWR: STARTING ARCH PROCESSES
Wed Jul 08 17:57:30 2015
ARC0 started with pid=23, OS id=485230
ARC0: Archival started
LGWR: STARTING ARCH PROCESSES COMPLETE
ARC0: STARTING ARCH PROCESSES
Thread 1 advanced to log sequence 2 (thread open)
Wed Jul 08 17:57:31 2015
ARC1 started with pid=24, OS id=485236
Wed Jul 08 17:57:31 2015
ARC2 started with pid=25, OS id=485240
Wed Jul 08 17:57:31 2015
ARC3 started with pid=26, OS id=485244
ARC1: Archival started
ARC2: Archival started
ARC1: Becoming the 'no FAL' ARCH
ARC1: Becoming the 'no SRL' ARCH
ARC2: Becoming the heartbeat ARCH
Thread 1 opened at log sequence 2
  Current log# 2 seq# 2 mem# 0: /opt/oracle/database/fast_recovery_area/DBHLDB/onlinelog/o1_mf_2_bb7x30js_.log
Successful open of redo thread 1
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
SMON: enabling cache recovery
Wed Jul 08 17:57:31 2015
NSA2 started with pid=27, OS id=485248
[485226] Successfully onlined Undo Tablespace 2.
Undo initialization finished serial:0 start:3453451748 end:3453452018 diff:270 (2 seconds)
Dictionary check beginning
Dictionary check complete
Verifying file header compatibility for 11g tablespace encryption..
Verifying 11g file header compatibility for tablespace encryption completed
SMON: enabling tx recovery
Database Characterset is AL32UTF8
No Resource Manager plan active
replication_dependency_tracking turned off (no async multimaster replication found)
Wed Jul 08 17:57:32 2015
Errors in file /opt/oracle/database/diag/rdbms/xffdb/xffdb/trace/xffdb_p032_485380.trc  (incident=64481):
ORA-00600: internal error code, arguments: [17182], [0x7FE96B50DBF8], [], [], [], [], [], [], [], [], [], []
Incident details in: /opt/oracle/database/diag/rdbms/xffdb/xffdb/incident/incdir_64481/xffdb_p032_485380_i64481.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Starting background process QMNC
Wed Jul 08 17:57:32 2015
QMNC started with pid=92, OS id=485512
LOGSTDBY: Validating controlfile with logical metadata
LOGSTDBY: Validation complete
Dumping diagnostic data in directory=[cdmp_20150708175733], requested by (instance=1, osid=485380 (P032)), summary=[incident=64481].
Thread 1 advanced to log sequence 3 (LGWR switch)
  Current log# 3 seq# 3 mem# 0: /opt/oracle/database/fast_recovery_area/DBHLDB/onlinelog/o1_mf_3_bb7x310q_.log
ARC3: STARTING ARCH PROCESSES
Wed Jul 08 17:57:34 2015
ARC4 started with pid=93, OS id=485516
Wed Jul 08 17:57:35 2015
db_recovery_file_dest_size of 204800 MB is 0.41% used. This is a
user-specified limit on the amount of space that will be used by this
database for recovery-related files, and does not reflect the amount of
space available in the underlying filesystem or ASM diskgroup.
ARC4: Archival started
ARC3: STARTING ARCH PROCESSES COMPLETE
krsk_srl_archive_int: Enabling archival of deferred physical standby SRLs
Archived Log entry 273963 added for thread 1 sequence 2 ID 0xfcee1785 dest 1:
Archived Log entry 273964 added for thread 1 sequence 401533 ID 0xfbe74bc9 dest 1:
Completed: ALTER DATABASE OPEN
Wed Jul 08 17:57:36 2015
Starting background process CJQ0
Wed Jul 08 17:57:36 2015
CJQ0 started with pid=95, OS id=485554
Shutting down archive processes
ARCH shutting down
ARC4: Archival stopped
Wed Jul 08 17:57:41 2015
Thread 1 advanced to log sequence 4 (LGWR switch)
  Current log# 4 seq# 4 mem# 0: /opt/oracle/database/fast_recovery_area/DBHLDB/onlinelog/o1_mf_4_bb7x312r_.log
Wed Jul 08 17:57:41 2015
Archived Log entry 273965 added for thread 1 sequence 3 ID 0xfcee1785 dest 1:
Wed Jul 08 17:58:30 2015
Sweep [inc][64481]: completed
Sweep [inc2][64481]: completed
Wed Jul 08 17:58:31 2015
Errors in file /opt/oracle/database/diag/rdbms/xffdb/xffdb/trace/xffdb_p032_485380.trc  (incident=64482):
ORA-00600: internal error code, arguments: [17114], [0x7FE96B50DBD0], [], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [17182], [0x7FE96B50DBF8], [], [], [], [], [], [], [], [], [], []
Incident details in: /opt/oracle/database/diag/rdbms/xffdb/xffdb/incident/incdir_64482/xffdb_p032_485380_i64482.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Wed Jul 08 17:58:32 2015
Dumping diagnostic data in directory=[cdmp_20150708175832], requested by (instance=1, osid=485380 (P032)), summary=[incident=64482].
Wed Jul 08 17:58:36 2015
Thread 1 advanced to log sequence 5 (LGWR switch)
  Current log# 5 seq# 5 mem# 0: /opt/oracle/database/fast_recovery_area/DBHLDB/onlinelog/o1_mf_5_bb7x317f_.log
Wed Jul 08 17:59:02 2015
Errors in file /opt/oracle/database/diag/rdbms/xffdb/xffdb/trace/xffdb_p032_485380.trc  (incident=64483):
ORA-00600: internal error code, arguments: [17114], [0x7FE96B50DBD0], [], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [17114], [0x7FE96B50DBD0], [], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [17182], [0x7FE96B50DBF8], [], [], [], [], [], [], [], [], [], []
Incident details in: /opt/oracle/database/diag/rdbms/xffdb/xffdb/incident/incdir_64483/xffdb_p032_485380_i64483.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Wed Jul 08 17:59:04 2015
Dumping diagnostic data in directory=[cdmp_20150708175904], requested by (instance=1, osid=485380 (P032)), summary=[incident=64483].
Wed Jul 08 17:59:29 2015
Sweep [inc][64483]: completed
Sweep [inc][64482]: completed
Sweep [inc2][64483]: completed
Sweep [inc2][64482]: completed
Wed Jul 08 17:59:30 2015
Block recovery from logseq 2, block 104 to scn 73226531646
Recovery of Online Redo Log: Thread 1 Group 2 Seq 2 Reading mem 0
  Mem# 0: /opt/oracle/database/fast_recovery_area/DBHLDB/onlinelog/o1_mf_2_bb7x30js_.log
Recovery of Online Redo Log: Thread 1 Group 3 Seq 3 Reading mem 0
  Mem# 0: /opt/oracle/database/fast_recovery_area/DBHLDB/onlinelog/o1_mf_3_bb7x310q_.log
Recovery of Online Redo Log: Thread 1 Group 4 Seq 4 Reading mem 0
  Mem# 0: /opt/oracle/database/fast_recovery_area/DBHLDB/onlinelog/o1_mf_4_bb7x312r_.log
Recovery of Online Redo Log: Thread 1 Group 5 Seq 5 Reading mem 0
  Mem# 0: /opt/oracle/database/fast_recovery_area/DBHLDB/onlinelog/o1_mf_5_bb7x317f_.log
Block recovery stopped at EOT rba 5.765.16
Block recovery completed at rba 5.765.16, scn 17.212087614
Errors in file /opt/oracle/database/diag/rdbms/xffdb/xffdb/trace/xffdb_p032_485380.trc  (incident=64484):
ORA-00600: internal error code, arguments: [KSMFPG2], [0x7FE96B50D000], [], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [17114], [0x7FE96B50DBD0], [], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [17114], [0x7FE96B50DBD0], [], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [17182], [0x7FE96B50DBF8], [], [], [], [], [], [], [], [], [], []
Incident details in: /opt/oracle/database/diag/rdbms/xffdb/xffdb/incident/incdir_64484/xffdb_p032_485380_i64484.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Dumping diagnostic data in directory=[cdmp_20150708175934], requested by (instance=1, osid=485380 (P032)), summary=[incident=64487].
Wed Jul 08 17:59:36 2015
SMON: slave died unexpectedly, downgrading to serial recovery
Errors in file /opt/oracle/database/diag/rdbms/xffdb/xffdb/trace/xffdb_smon_485190.trc  (incident=64129):
ORA-00600: internal error code, arguments: [17182], [0x7F5AED10D7A0], [], [], [], [], [], [], [], [], [], []
Incident details in: /opt/oracle/database/diag/rdbms/xffdb/xffdb/incident/incdir_64129/xffdb_smon_485190_i64129.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Block recovery from logseq 2, block 104 to scn 73226531646
Recovery of Online Redo Log: Thread 1 Group 2 Seq 2 Reading mem 0
  Mem# 0: /opt/oracle/database/fast_recovery_area/DBHLDB/onlinelog/o1_mf_2_bb7x30js_.log
Recovery of Online Redo Log: Thread 1 Group 3 Seq 3 Reading mem 0
  Mem# 0: /opt/oracle/database/fast_recovery_area/DBHLDB/onlinelog/o1_mf_3_bb7x310q_.log
Recovery of Online Redo Log: Thread 1 Group 4 Seq 4 Reading mem 0
  Mem# 0: /opt/oracle/database/fast_recovery_area/DBHLDB/onlinelog/o1_mf_4_bb7x312r_.log
Recovery of Online Redo Log: Thread 1 Group 5 Seq 5 Reading mem 0
  Mem# 0: /opt/oracle/database/fast_recovery_area/DBHLDB/onlinelog/o1_mf_5_bb7x317f_.log
Block recovery completed at rba 5.765.16, scn 17.212087615
ORACLE Instance xffdb (pid = 16) - Error 600 encountered while recovering transaction (13, 14) on object 135520.
Errors in file /opt/oracle/database/diag/rdbms/xffdb/xffdb/trace/xffdb_smon_485190.trc:
ORA-00600: internal error code, arguments: [17182], [0x7F5AED10D7A0], [], [], [], [], [], [], [], [], [], []
Dumping diagnostic data in directory=[cdmp_20150708175937], requested by (instance=1, osid=485190 (SMON)), summary=[incident=64129].
Errors in file /opt/oracle/database/diag/rdbms/xffdb/xffdb/trace/xffdb_smon_485190.trc  (incident=64130):
ORA-00600: internal error code, arguments: [KSMFPG2], [0x7F5AED10D000], [], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [17182], [0x7F5AED10D7A0], [], [], [], [], [], [], [], [], [], []
Incident details in: /opt/oracle/database/diag/rdbms/xffdb/xffdb/incident/incdir_64130/xffdb_smon_485190_i64130.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Fatal internal error happened while SMON was doing active transaction recovery.
Errors in file /opt/oracle/database/diag/rdbms/xffdb/xffdb/trace/xffdb_smon_485190.trc:
ORA-00600: internal error code, arguments: [KSMFPG2], [0x7F5AED10D000], [], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [17182], [0x7F5AED10D7A0], [], [], [], [], [], [], [], [], [], []
SMON (ospid: 485190): terminating the instance due to error 474
System state dump requested by (instance=1, osid=485190 (SMON)), summary=[abnormal instance termination].
System State dumped to trace file /opt/oracle/database/diag/rdbms/xffdb/xffdb/trace/xffdb_diag_485150_20150708175939.trc
Dumping diagnostic data in directory=[cdmp_20150708175939], requested by (instance=1, osid=485190 (SMON)), summary=[abnormal instance termination].
Instance terminated by SMON, pid = 485190

Further analysis showed that datafile 67 was the problem. Checking that file with dbv showed:

DBVERIFY: Release 11.2.0.4.0 - Production on Thu Jul 9 11:08:27 2015
Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
DBVERIFY - Verification starting : FILE = /opt/oracle/database/oradata/xffdb/dk_adv_ts_main_25.dbf
Block Checking: DBA = 283035270, Block Type = KTB-managed data block
data header at 0x7f87813ea064
kdbchk: the amount of space used is not equal to block size
        used=7383 fsc=0 avsp=873 dtl=8088
Page 2016902 failed with check code 6110
Block Checking: DBA = 283035271, Block Type = KTB-managed data block
data header at 0x7f87813ec064
kdbchk: the amount of space used is not equal to block size
        used=8676 fsc=0 avsp=832 dtl=8088
Page 2016903 failed with check code 6110
Block Checking: DBA = 283035272, Block Type = KTB-managed data block
data header at 0x7f87813ee064
kdbchk: avsp(1306) > tosp(894)
Page 2016904 failed with check code 6128
Block Checking: DBA = 283035273, Block Type = KTB-managed data block
data header at 0x7f87813f0064
kdbchk: the amount of space used is not equal to block size
        used=7506 fsc=0 avsp=815 dtl=8088
Page 2016905 failed with check code 6110
Block Checking: DBA = 283035274, Block Type = KTB-managed data block
data header at 0x7f87813f2064
kdbchk: the amount of space used is not equal to block size
        used=7892 fsc=0 avsp=884 dtl=8088
Page 2016906 failed with check code 6110
…………
Block Checking: DBA = 283035368, Block Type = KTB-managed data block
data header at 0x7f87814ae064
kdbchk: the amount of space used is not equal to block size
        used=7934 fsc=0 avsp=837 dtl=8088
Page 2017000 failed with check code 6110
Block Checking: DBA = 283035369, Block Type = KTB-managed data block
data header at 0x7f87814b0064
kdbchk: the amount of space used is not equal to block size
        used=7683 fsc=0 avsp=883 dtl=8088
Page 2017001 failed with check code 6110
Block Checking: DBA = 283035370, Block Type = KTB-managed data block
data header at 0x7f87814b2064
kdbchk: the amount of space used is not equal to block size
        used=8556 fsc=0 avsp=841 dtl=8088
Page 2017002 failed with check code 6110
Block Checking: DBA = 283035371, Block Type = KTB-managed data block
data header at 0x7f87814b4064
kdbchk: the amount of space used is not equal to block size
        used=7460 fsc=0 avsp=822 dtl=8088
Page 2017003 failed with check code 6110
DBVERIFY - Verification complete
Total Pages Examined         : 4063232
Total Pages Processed (Data) : 2724435
Total Pages Failing   (Data) : 102
Total Pages Processed (Index): 1064839
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 273957
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 1
Total Pages Marked Corrupt   : 0
Total Pages Influx           : 0
Total Pages Encrypted        : 0
Highest block SCN            : 212065913 (17.212065913)
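
The DBA values in the dbv output above encode both the file number and the block number. The sketch below decodes them, assuming the usual layout of a 10-bit relative file number followed by a 22-bit block number. The result agrees with the page numbers dbv prints and with datafile 67 identified above. The function name is hypothetical; inside the database, DBMS_UTILITY.DATA_BLOCK_ADDRESS_FILE and DBMS_UTILITY.DATA_BLOCK_ADDRESS_BLOCK return the same two parts.

```python
from typing import Tuple

BLOCK_BITS = 22                         # assumed: low 22 bits hold the block number
BLOCK_MASK = (1 << BLOCK_BITS) - 1


def decode_dba(dba: int) -> Tuple[int, int]:
    """Split a data block address into (relative file number, block number)."""
    return dba >> BLOCK_BITS, dba & BLOCK_MASK


# First and last DBA reported in the dbv output above.
for dba in (283035270, 283035371):
    file_no, block_no = decode_dba(dba)
    print(dba, "-> file", file_no, "block", block_no)
```

The decoded block numbers match the "Page ... failed" lines, so the 102 failing pages all sit in one contiguous area of file 67.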

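After datafile 67 was taken offline, the database opened normally and no longer crashed. The picture is clear: the primary and the standby failed in the same way. The primary crashed, and crashed again shortly after each restart; the standby, once activated and restarted, also crashed shortly afterwards. dbv finally traced the cause to one file containing a large number of logically corrupt blocks, which crashed the instance when transaction rollback ran after open. For this kind of failure, transaction rollback can be suppressed, the affected object can be identified from the alert log and trace files, and dropping that object resolves the problem temporarily.
The lesson from this case: when an Oracle bug or some other cause logically corrupts a block, the corruption is shipped to the standby and the standby fails as well. Backup and disaster recovery cannot rely solely on physical Data Guard. Where conditions allow, add physical backups and logical replication (for example OGG).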
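
As an illustration of locating the affected object from the alert log, the sketch below pulls the object id out of the "Error 600 encountered while recovering transaction" line shown in both crash logs. The function name and field names are hypothetical, and reading the pair in parentheses as undo segment number and slot is an assumption rather than something the log states.

```python
import re
from typing import Dict, Iterable, List

# Matches the SMON message printed when recovery of one transaction fails.
TX_ERROR = re.compile(
    r"Error (\d+) encountered while recovering transaction \((\d+), (\d+)\) on object (\d+)"
)


def failed_tx_recoveries(lines: Iterable[str]) -> List[Dict[str, int]]:
    """Collect error number, transaction pair and object id from alert log lines."""
    found = []
    for line in lines:
        match = TX_ERROR.search(line)
        if match:
            found.append({
                "error": int(match.group(1)),
                "usn": int(match.group(2)),      # assumed: undo segment number
                "slot": int(match.group(3)),     # assumed: transaction slot
                "object_id": int(match.group(4)),
            })
    return found


# Line copied from the standby alert log above.
sample = [
    "Block recovery completed at rba 5.765.16, scn 17.212087615",
    "ORACLE Instance xffdb (pid = 16) - Error 600 encountered while recovering "
    "transaction (13, 14) on object 135520.",
]
print(failed_tx_recoveries(sample))
```

The object id found this way can then be looked up in DBA_OBJECTS (by OBJECT_ID) to get the owner and name of the segment involved.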