ORACLE 12C ASM 新特性:共享密码文件

在ORACLE 12C之前大家都知道密码文件是存放在?/dbs或者?/database中,如果要修改修改sysdba权限的用户密码时候,会去修改密码文件,而在rac数据库的sys密码文件是存在各个节点中,这个时候修改sysdba权限的密码就需要在两个节点都要做同样的操作,而对于数据库来说本身是只要在一个节点上修改即可,因为密码是记录在user$中,就是因为密码文件非共享且在各个节点中都有,因此需要在各个节点均要执行修改密码命令,确保密码文件被正常修改。因为rac 密码文件非共享的机制存在,导致修改sysdba权限密码繁琐,有些时候甚至有节点忘记修改,导致需要使用密码文件操作数据库的时候不能正常进行,DG传输日志异常等故障。在ORACLE 12C中为了解决这个问题,引入了密码文件可以存入ASM新特性,从而使得密码文件存储在ASM中实现所有节点共享,从而解决该问题.
ASM存储密码文件前提条件 COMPATIBLE.ASM>= 12.1
查询ASM信息

SQL>  select * from v$version;
BANNER                                                                               CON_ID
-------------------------------------------------------------------------------- ----------
Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production              0
PL/SQL Release 12.1.0.1.0 - Production                                                    0
CORE    12.1.0.1.0      Production                                                        0
TNS for Linux: Version 12.1.0.1.0 - Production                                            0
NLSRTL Version 12.1.0.1.0 - Production                                                    0
SQL> select NAME,COMPATIBILITY from v$asm_diskgroup;
NAME                           COMPATIBILITY
------------------------------ ------------------------------------------------------------
DATA                           12.1.0.0.0

查询crs中关于db配置

[grid@xifenfei ~]$  srvctl config database -d cdb
Database unique name: cdb
Database name: cdb
Oracle home: /u01/app/oracle/product/12.1/db_1
Oracle user: oracle
Spfile: +DATA/cdb/spfilecdb.ora
Password file:
Domain:
Start options: open
Stop options: immediate
Database role: PRIMARY
Management policy: MANUAL
Database instance: cdb
Disk Groups: DATA
Services:

这里db的password file为空,即表示使用默认值,也就是为$ORACLE_HOME/dbs/orapwxifenfei

创建密码文件存储在ASM中

--创建db新密码文件
[oracle@xifenfei ~]$ orapwd file='+data/CDB/orapwdxifenfei' dbuniquename='cdb'
Enter password for SYS:
----输入sys用户密码
--创建asm新密码文件
orapwd file='+data/ASM/orapwasm' asm=y
----asm=y 表示创建的密码文件为asm的
--使用老密码文件创建db/asm新密码文件
orapwd input_file='/oraclegrid/dbs/orapwasm' file='+data/ASM/orapwasm' [asm=y]
----input_file 表示使用老的密码文件创建新的存储在ASM中的密码文件

查看ASM中密码文件

ASMCMD> showversion
ASM version         : 12.1.0.1.0
ASMCMD> pwd
+data/cdb
ASMCMD>  ls -l orapwdxifenfei
Type      Redund  Striped  Time             Sys  Name
PASSWORD  UNPROT  COARSE   MAY 31 19:00:00  N    orapwdxifenfei => +DATA/CDB/PASSWORD/pwdcdb.290.816897265

配置crs中password file项

[grid@xifenfei ~]$ srvctl modify database -db cdb -pwfile  +data/CDB/orapwdxifenfei

查询crs中关于db配置

[grid@xifenfei ~]$  srvctl config database -d cdb
Database unique name: cdb
Database name: cdb
Oracle home: /u01/app/oracle/product/12.1/db_1
Oracle user: oracle
Spfile: +DATA/cdb/spfilecdb.ora
Password file: +data/CDB/orapwdxifenfei
Domain:
Start options: open
Stop options: immediate
Database role: PRIMARY
Management policy: MANUAL
Database instance: cdb
Disk Groups: DATA
Services:

至此数据库启动使用密码ASM中的密码文件完成,补充说明,该方式配置在ASM中的密码文件,只能是通过crs方式启动db才会生效,如果手工使用sqlplus启动数据库不会使用该密码文件,还是使用默认密码文件。这里也就提醒大家操作规范:在RAC环境(包含单节点的GI环境)中,对数据库的启动关闭操作强烈建议使用crs相关命令来完成,而不推荐使用sqlplus命令

跳过rman坏块恢复

在有些情况下,我们仅有一份rman备份,而这个时候rman 备份有出现坏块,使得我们的还原/恢复工作无法继续下去,导致数据大量丢失.我们可以通过设置event 19548/19549来跳过坏块,最大程度抢救数据
rman备份数据文件

C:\Users\XIFENFEI>rman target /
Recovery Manager: Release 11.2.0.3.0 - Production on Thu Jun 6 20:31:19 2013
Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
connected to target database: XIFENFEI (DBID=1422012639)
RMAN> backup tablespace users format 'f:/users_bak.rman';
Starting backup at 06-JUN-13
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=197 device type=DISK
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00004 name=E:\ORACLE\ORADATA\XIFENFEI\USERS01.DBF
channel ORA_DISK_1: starting piece 1 at 06-JUN-13
channel ORA_DISK_1: finished piece 1 at 06-JUN-13
piece handle=F:\USERS_BAK.RMAN tag=TAG20130606T203154 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:03
Finished backup at 06-JUN-13

切换归档日志

SQL> alter system switch logfile;
System altered.
SQL> /
System altered.
SQL> /
System altered.
SQL> archive log list;
Database log mode              Archive Mode
Automatic archival             Enabled
Archive destination            E:\oracle\product\11.2.0\dbhome_1\RDBMS
Oldest online log sequence     95
Next log sequence to archive   97
Current log sequence           97

重命名数据文件

SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.
--------------------------------------
e:\oracle\oradata\XIFENFEI>move USERS01.DBF USERS01_bak.DBF
移动了         1 个文件。
--------------------------------------
SQL> startup
ORACLE instance started.
Total System Global Area  418484224 bytes
Fixed Size                  1385052 bytes
Variable Size             327159204 bytes
Database Buffers           83886080 bytes
Redo Buffers                6053888 bytes
Database mounted.
ORA-01157: cannot identify/lock data file 4 - see DBWR trace file
ORA-01110: data file 4: 'E:\ORACLE\ORADATA\XIFENFEI\USERS01.DBF'

破坏备份集
破坏前


破坏后


这里很明显,我通过ue把rman备份集中的T修改为了A,肯定破坏了文件,使之出现坏块

rman还原数据文件

C:\Users\XIFENFEI>rman target /
Recovery Manager: Release 11.2.0.3.0 - Production on Thu Jun 6 21:02:41 2013
Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
connected to target database: XIFENFEI (DBID=1422012639, not open)
RMAN> restore datafile 4;
Starting restore at 06-JUN-13
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=63 device type=DISK
channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring datafile 00004 to E:\ORACLE\ORADATA\XIFENFEI\USERS
01.DBF
channel ORA_DISK_1: reading from backup piece F:\USERS_BAK.RMAN
channel ORA_DISK_1: ORA-19870: error while restoring backup piece F:\USERS_BAK.R
MAN
ORA-19612: datafile 4 not restored due to missing or corrupt data
failover to previous backup
creating datafile file number=4 name=E:\ORACLE\ORADATA\XIFENFEI\USERS01.DBF
Finished restore at 06-JUN-13

这里可以清晰的看到rman报ORA-19612错误,restore 失败,alert日志为:

Thu Jun 06 21:02:31 2013
ALTER DATABASE OPEN
Errors in file E:\ORACLE\diag\rdbms\xifenfei\xff\trace\xff_dbw0_7400.trc:
ORA-01157: ????/?????? 4 - ??? DBWR ????
ORA-01110: ???? 4: 'E:\ORACLE\ORADATA\XIFENFEI\USERS01.DBF'
ORA-27041: ??????
OSD-04002: unable to open file
O/S-Error: (OS 2) 系统找不到指定的文件。
Errors in file E:\ORACLE\diag\rdbms\xifenfei\xff\trace\xff_ora_4272.trc:
ORA-01157: cannot identify/lock data file 4 - see DBWR trace file
ORA-01110: data file 4: 'E:\ORACLE\ORADATA\XIFENFEI\USERS01.DBF'
ORA-1157 signalled during: ALTER DATABASE OPEN...
Thu Jun 06 21:02:33 2013
Checker run found 1 new persistent data failures
Thu Jun 06 21:03:23 2013
Corrupt block 101 found during reading backup piece, file=F:\USERS_BAK.RMAN, corr_type=3
Reread of blocknum=101, file=F:\USERS_BAK.RMAN, found same corrupt data
Reread of blocknum=101, file=F:\USERS_BAK.RMAN, found same corrupt data
Reread of blocknum=101, file=F:\USERS_BAK.RMAN, found same corrupt data
Reread of blocknum=101, file=F:\USERS_BAK.RMAN, found same corrupt data
Reread of blocknum=101, file=F:\USERS_BAK.RMAN, found same corrupt data
Continuing reading piece F:\USERS_BAK.RMAN, no other copies available.

rman备份集有坏块,导致rman还原无法正常进行下去,还原后的数据文件大小


观察已经正常还原出来数据文件情况

SQL> select CHECKPOINT_CHANGE#,file# from v$datafile_header;
CHECKPOINT_CHANGE#      FILE#
------------------ ----------
           1571582          1
           1571582          2
           1571582          3
             18379          4
           1571582          5
           1571582          6
           1571582          7
SQL> recover database datafile 4 ;
ORA-00274: illegal recovery option DATAFILE
SQL> recover datafile 4;
ORA-00279: change 18379 generated at 01/20/2013 17:13:56 needed for thread 1
ORA-00289: suggestion :
E:\ORACLE\PRODUCT\11.2.0\DBHOME_1\RDBMS\ARC0000000001_0805223583.0001
ORA-00280: change 18379 for thread 1 is in sequence #1
Specify log: {<RET>=suggested | filename | AUTO | CANCEL}

rman只是还原了很小的一部分文件,做恢复提示需要从归档日志seq 1开始(某些情况可能需要其他归档,总之不是正常情况),证明rman还原异常

设置event事件还原

SQL> shutdown abort;
ORACLE instance shut down.
SQL> startup pfile='e:/pfile.txt' mount;
ORACLE instance started.
Total System Global Area  418484224 bytes
Fixed Size                  1385052 bytes
Variable Size             327159204 bytes
Database Buffers           83886080 bytes
Redo Buffers                6053888 bytes
Database mounted.
SQL> show parameter event;
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
event                                string      19548 trace name context forev
                                                 er, 19549 trace name context f
                                                 orever
Event 19548:This will attempt to restore content of the corrupted block if it is possible.
Event 19549:This will suppress erroring out during restore

rman还原数据文件

RMAN> restore datafile 4;
Starting restore at 06-JUN-13
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=63 device type=DISK
channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring datafile 00004 to E:\ORACLE\ORADATA\XIFENFEI\USERS
01.DBF
channel ORA_DISK_1: reading from backup piece F:\USERS_BAK.RMAN
channel ORA_DISK_1: piece handle=F:\USERS_BAK.RMAN tag=TAG20130606T203154
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:35
Finished restore at 06-JUN-13

这里证明数据库rman有坏块通过rman还原成功,alert日志提示如下

Thu Jun 06 21:29:53 2013
WARNING: The block that appears to be block number 100
         in file 4 is corrupt in backup piece F:\USERS_BAK.RMAN.
         Such blocks would usually be formatted as empty
         in the restored file, but event 19548 has been
         set to include the block as-is in the restored
         file.
Corrupt block 102 found during reading backup piece, file=F:\USERS_BAK.RMAN, corr_type=-2
Reread of blocknum=102, file=F:\USERS_BAK.RMAN, found same corrupt data
Reread of blocknum=102, file=F:\USERS_BAK.RMAN, found same corrupt data
Reread of blocknum=102, file=F:\USERS_BAK.RMAN, found same corrupt data
Reread of blocknum=102, file=F:\USERS_BAK.RMAN, found same corrupt data
Reread of blocknum=102, file=F:\USERS_BAK.RMAN, found same corrupt data
Continuing reading piece F:\USERS_BAK.RMAN, no other copies available.
…………
Corrupt block 258 found during reading backup piece, file=F:\USERS_BAK.RMAN, corr_type=-2
Reread of blocknum=258, file=F:\USERS_BAK.RMAN, found same corrupt data
Reread of blocknum=258, file=F:\USERS_BAK.RMAN, found same corrupt data
Reread of blocknum=258, file=F:\USERS_BAK.RMAN, found same corrupt data
Reread of blocknum=258, file=F:\USERS_BAK.RMAN, found same corrupt data
Reread of blocknum=258, file=F:\USERS_BAK.RMAN, found same corrupt data
Continuing reading piece F:\USERS_BAK.RMAN, no other copies available.
WARNING: some data in the backup of file 4 was missing
         or corrupt.  Event 19549 has been set to allow
         the file to be restored anyway.
           backup header block count: 5369
           backup actual block count: 5212
              backup header checksum: -218250743
              backup actual checksum: 1442665538
Full restore complete of datafile 4 E:\ORACLE\ORADATA\XIFENFEI\USERS01.DBF.  Elapsed time: 0:00:25
  checkpoint is 1570136
  last deallocation scn is 1508457

这里rman还原依然遇到很多坏块,但是均跳过坏块,还是完整的恢复出来的数据文件(大小)


rman还原数据文件

RMAN> recover datafile 4;
Starting recover at 06-JUN-13
using channel ORA_DISK_1
starting media recovery
archived log for thread 1 with sequence 94 is already on disk as file E:\ORACLE\
PRODUCT\11.2.0\DBHOME_1\RDBMS\ARC0000000094_0805223583.0001
archived log for thread 1 with sequence 95 is already on disk as file E:\ORACLE\
PRODUCT\11.2.0\DBHOME_1\RDBMS\ARC0000000095_0805223583.0001
archived log for thread 1 with sequence 96 is already on disk as file E:\ORACLE\
PRODUCT\11.2.0\DBHOME_1\RDBMS\ARC0000000096_0805223583.0001
archived log file name=E:\ORACLE\PRODUCT\11.2.0\DBHOME_1\RDBMS\ARC0000000094_080
5223583.0001 thread=1 sequence=94
media recovery complete, elapsed time: 00:00:00
Finished recover at 06-JUN-13

这里可以明显的看到在recover过程中数据库应用的是备份后的所有归档,数据文件是正常被还原出来(坏块除外)

查询对象

SQL> alter database open;
Database altered.
SQL> conn test/test
Connected.
SQL> select * from tab;
TNAME                          TABTYPE  CLUSTERID
------------------------------ ------- ----------
STB101                         TABLE
SQL> select count(*) from stb101;
select count(*) from stb101
                     *
ERROR at line 1:
ORA-08103: object no longer exists

dbv检查坏块

e:\oracle\oradata\XIFENFEI>dbv file=USERS01.DBF
DBVERIFY: Release 11.2.0.3.0 - Production on Thu Jun 6 23:59:49 2013
Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
DBVERIFY - Verification starting : FILE = E:\ORACLE\ORADATA\XIFENFEI\USERS01.DBF
Page 100 is marked corrupt
Corrupt block relative dba: 0x01000064 (file 4, block 100)
Bad check value found during dbv:
Data in bad block:
 type: 30 format: 2 rdba: 0x01000064
 last change scn: 0x0000.00004890 seq: 0x1 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x48901e01
 check value in block header: 0x8311
 computed block checksum: 0x20
DBVERIFY - Verification complete
Total Pages Examined         : 12320
Total Pages Processed (Data) : 4952
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 0
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 7069
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 298

证明设置了event之后,rman确实跳过了备份集中的坏块,而且是直接还原了坏块内容,证明了event 19548和19549作用

补充说明
在非特殊情况下强烈不建议设置相关event跳过rman中的坏块来还原/恢复数据库,这样将对数据的丢失,甚至数据库是否可以正常open不好评估,rman备份重要,确保rman备份可用也很重要.

记录因磁盘头被重写,抢救redo恢复经历

客户使用赛门铁克做同城异地容灾部署extent rac,因某种情况导致主备容灾不同步,然后在主库中进行了若干操作,导致主库所有裸设备丢失,然后进行了一些列的操作,主库识别了裸设备,但是oracle出现异常
数据库裸设备异常

Fri May 31 22:07:39 2013
ORA-00202: control file: '/dev/rcontrol2'
ORA-27047: unable to read the header block of file
Additional information: 2
ORA-205 signalled during: ALTER DATABASE   MOUNT...

使用备份还原控制文件后,查询数据文件头v$datafile_header.error全部为”WRONG FILE TYPE”,使用bbed去查看,10个block以内全部是0,证明数据库文件头也损坏。因为客户的数据库虽然有rman备份,但涉及到memory,对redo的信息也很敏感,所以希望能够在他们当前的情况下评估redo是否可以应用,确保他们的数据不丢失.验证文件头

DATA FILE #1:
  (name #4) /dev/rsystem
creation size=128000 block size=8192 status=0xe head=4 tail=4 dup=1
 tablespace 0, index=1 krfil=1 prev_file=0
 unrecoverable scn: 0x0000.00000000 01/01/1988 00:00:00
 Checkpoint cnt:22469 scn: 0x0000.7b4f9d86 05/29/2013 22:09:50
 Stop scn: 0xffff.ffffffff 05/15/2013 00:08:31
 Creation Checkpointed at scn:  0x0000.00000009 05/20/2007 21:52:41
 thread:1 rba:(0x1.3.10)
 enabled  threads:  01000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000
 Offline scn: 0x0000.00000000 prev_range: 0
 Online Checkpointed at scn:  0x0000.00000000
 thread:0 rba:(0x0.0.0)
 enabled  threads:  00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000
 Hot Backup end marker scn: 0x0000.00000000
 aux_file is NOT DEFINED
File header version cannot be determined due to corruption
Dump may be suspect
 V10 STYLE FILE HEADER:
	Software vsn=0=0x0, Compatibility Vsn=0=0x0
	Db ID=0=0x0, Db Name=''
	Activation ID=0=0x0
	Control Seq=0=0x0, File size=0=0x0
	File Number=0, Blksiz=0, File Type=0 UNKNOWN
Tablespace #0 -   rel_fn:0
Creation   at   scn: 0x0000.00000000 01/01/1988 00:00:00
Backup taken at scn: 0x0000.00000000 01/01/1988 00:00:00 thread:0
 reset logs count:0x0 scn: 0x0000.00000000 reset logs terminal rcv data:0x0 scn: 0x0000.00000000
 prev reset logs count:0x0 scn: 0x0000.00000000 prev reset logs terminal rcv data:0x0 scn: 0x0000.00000000
 recovered at 01/01/1988 00:00:00
 status:0x0 root dba:0x00000000 chkpt cnt: 0 ctl cnt:0
begin-hot-backup file size: 0
Checkpointed at scn:  0x0000.00000000
 thread:0 rba:(0x0.0.0)
 enabled  threads:  00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000
Backup Checkpointed at scn:  0x0000.00000000
 thread:0 rba:(0x0.0.0)
 enabled  threads:  00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000
External cache id: 0x0 0x0 0x0 0x0
Absolute fuzzy scn: 0x0000.00000000
Recovery fuzzy scn: 0x0000.00000000 01/01/1988 00:00:00
Terminal Recovery Stamp  01/01/1988 00:00:00

这里可以看出来文件头完全损坏,dump redo header

LOG FILE #1:
  (name #1) /dev/rredo11
 Thread 1 redo log links: forward: 2 backward: 0
 siz: 0x3c000 seq: 0x00003f9c hws: 0x2 bsz: 512 nab: 0x1763c flg: 0x1 dup: 1
 Archive links: fwrd: 0 back: 0 Prev scn: 0x0000.7b458379
 Low scn: 0x0000.7b4583ad 05/29/2013 01:32:39
 Next scn: 0x0000.7b4fa0e5 05/29/2013 22:17:59
 FILE HEADER:
	Software vsn=0=0x0, Compatibility Vsn=0=0x0
	Db ID=0=0x0, Db Name=''
	Activation ID=0=0x0
	Control Seq=0=0x0, File size=0=0x0
	File Number=0, Blksiz=0, File Type=0 UNKNOWN
 descrip:""
 thread: 0 nab: 0x0 seq: 0x00000000 hws: 0x0 eot: 0 dis: 0
 reset logs count: 0x0 scn: 0x0000.00000000
 Low scn: 0x0000.00000000 01/01/1988 00:00:00
 Next scn: 0x0000.00000000 01/01/1988 00:00:00
 Enabled scn: 0x0000.00000000 01/01/1988 00:00:00
 Thread closed scn: 0x0000.00000000 01/01/1988 00:00:00
 Log format vsn: 0x0 Disk cksum: 0x0 Calc cksum: 0x0
 Terminal Recovery Stop scn: 0x0000.00000000
 Terminal Recovery Stamp  01/01/1988 00:00:00
 Most recent redo scn: 0x0000.00000000
 Largest LWN: 0 blocks
 Miscellaneous flags: 0x0
 Thread internal enable indicator: thr: 0, seq: 0 scn: 0x0000.00000000

验证redo header已经异常,dump redo logfile全部提示文件头错误


现在情况已经很明显,客户的库因为使用了裸设备,online 磁盘的过程中,所有的裸设备卷已经重写了文件头,oracle的datafile header信息不在了,无法正常操作完成.我们决定使用rman备份来恢复该数据库,然后想办法处理redo

注册带库备份集
在rman恢复过程中,我们遇到一个问题,客户的库rman备份策略有问题,一周一个全备,每天一个增量备份,一次归档备份,最后一次增量备份后备份了控制文件,但是最后一次归档备份之后无控制文件,而且是归档的备份发生在增量备份之后,因为是使用了带库无catalog库,我们增量恢复之后,数据不一致需要归档,但是归档,而归档的备份未记录在还原出来的控制文件中,需要人工注册带库的备份集到控制文件中

--涉及到3个节点都有配置不同的NB_ORA_CLIENT=pysa,如果在同一个节点中还原归档日志,需要配置如下
CONFIGURE CHANNEL DEVICE TYPE 'SBT_TAPE' PARMS  'ENV=(NB_ORA_CLIENT=pysa)';
--如果在默认节点直接分配sbt通道即可
configure default device type to sbt_tape;
--注册带库备份集
catalog device type 'sbt_tape' backuppiece 'al_21395_1_816744765';

分析redo
通过ue打开dd出来的redo文件,我们分析得到20000h(10*8192)全部为0,应该是和赛门铁克存储管理系统有关系,后面开始是aix的设备头信息

正常redo文件信息

该库redo信息对比(获得aix偏移量)

对比正常redo起点信息和经验我们定位到aix的设备头偏移量为1000h(4096),整体偏移量为21000h(10*8192+4096)
该库的redo起点为21000h,也就是说,我们需要执行的dd语句为类似语句(redo 大小为120M)

dd if=/dev/vx/rdsk/dg/redo31 bs=512 skip=264 count=245761 of=/arch/xifenfei/redo31

dd出来所有数据后,对先要已经应用了归档的库,继续尝试recover redo

SQL> recover database using backup controlfile;
ORA-00279: change 2069348436 generated at 05/30/2013 16:18:08 needed for thread
1
ORA-00289: suggestion : /arch/1_16289_623109141.dbf
ORA-00280: change 2069348436 for thread 1 is in sequence #16289
Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
/arch/xifenfei/redo12
ORA-00279: change 2069348436 generated at 05/30/2013 16:18:08 needed for thread
2
Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
/arch/xifenfei/redo23
ORA-00279: change 2069348436 generated at 05/30/2013 16:18:07 needed for thread
3
ORA-00289: suggestion : /arch/3_3898_623109141.dbf
ORA-00280: change 2069348436 for thread 3 is in sequence #3898
Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
/arch/xifenfei/redo31
ORA-00279: change 2069455373 generated at 05/30/2013 20:03:26 needed for thread
1
ORA-00289: suggestion : /arch/1_16290_623109141.dbf
ORA-00280: change 2069455373 for thread 1 is in sequence #16290
ORA-00278: log file '/arch/xifenfei/redo12' no longer needed for this recovery
Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
/arch/xifenfei/redo11
ORA-00279: change 2069475956 generated at 05/30/2013 20:04:09 needed for thread
3
ORA-00289: suggestion : /arch/3_3899_623109141.dbf
ORA-00280: change 2069475956 for thread 3 is in sequence #3899
ORA-00278: log file '/arch/xifenfei/redo31' no longer needed for this recovery
Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
/arch/xifenfei/redo33
Log applied.
Media recovery complete.
SQL> alter database open resetlogs;
Database altered.

到这一步我们完整的通过dd跳过了由于赛门铁克管理磁盘导致的磁盘头损坏的块,从裸设备中复制出redo到文件系统,然后进行恢复,完整的抢救了客户的数据,减少了客户的损坏.这里温馨提示对于非常重要的系统(涉及钱),强烈建议redo多路冗余,光依靠存储容灾,备份,dg依然不够

通过基表获取segment header block

数据库不能open的时候,可以通过dul挖取相关基表(user$,obj$,ts$,tab$,seg$,file$),从而来获得segment header信息,然后通过dump该block,结合shell脚本获得extents分布脚本来获得extent分布

   SELECT NVL (u.name, 'SYS'),
          o.name,
          o.subname,
          so.object_type,
          s.type#,
          DECODE (BITAND (s.spare1, 2097408),
                  2097152, 'SECUREFILE',
                  256, 'ASSM',
                  'MSSM'),
          ts.ts#,
          ts.name,
          ts.blocksize,
          f.file#,
          s.block#,
          s.blocks * ts.blocksize,
          s.blocks,
          s.extents,
          s.iniexts * ts.blocksize,
          s.extsize * ts.blocksize,
          s.minexts,
          s.maxexts,
          DECODE (BITAND (s.spare1, 4194304), 4194304, bitmapranges, NULL),
          TO_CHAR (
             DECODE (
                BITAND (s.spare1, 2097152),
                2097152, DECODE (s.lists,
                                 0, 'NONE',
                                 1, 'AUTO',
                                 2, 'MIN',
                                 3, 'MAX',
                                 4, 'DEFAULT',
                                 'INVALID'),
                NULL)),
          DECODE (BITAND (s.spare1, 2097152), 2097152, s.groups, NULL),
          DECODE (BITAND (ts.flags, 3), 1, TO_NUMBER (NULL), s.extpct),
          DECODE (BITAND (ts.flags, 32),
                  32, TO_NUMBER (NULL),
                  DECODE (s.lists, 0, 1, s.lists)),
          DECODE (BITAND (ts.flags, 32),
                  32, TO_NUMBER (NULL),
                  DECODE (s.groups, 0, 1, s.groups)),
          s.file#,
          BITAND (s.cachehint, 3),
          BITAND (s.cachehint, 12) / 4,
          BITAND (s.cachehint, 48) / 16,
          NVL (s.spare1, 0),
          o.dataobj#
     FROM chf.user$ u,
          chf.obj$ o,
          chf.ts$ ts,
          ( SELECT DECODE (BITAND (t.property, 8192), 8192, 'NESTED TABLE', 'TABLE') OBJECT_TYPE,
          2 OBJECT_TYPE_ID,
          5 SEGMENT_TYPE_ID,
          t.obj# OBJECT_ID,
          t.file# HEADER_FILE,
          t.block# HEADER_BLOCK,
          t.ts# TS_NUMBER
     FROM chf.tab$ t) so,
          chf.seg$ s,
          chf.file$ f
    WHERE     s.file# = so.header_file
          AND s.block# = so.header_block
          AND s.ts# = so.ts_number
          AND s.ts# = ts.ts#
          AND o.obj# = so.object_id
          AND o.owner# = u.user#(+)
          AND s.type# = so.segment_type_id
          AND o.type# = so.object_type_id
          AND s.ts# = f.ts#
          AND s.file# = f.relfile#
          and o.name in('XIFENFEI','T_XIFENFEI');

shell脚本获得extents分布

比较深入看过dba_extents视图的朋友都知道,它得到extent的信息不是通过普通的存储在数据库中的基表获得,而是x$相关的表获得(x$表是数据库启动时候在内存中创建,不存在数据文件中),因为当数据库未正常启动,我们无法直接确定某个block是否在某个对象中.其实关于extent的信息都已经记录在了segment header的block中,通过dump该block记录的rdba信息,未转化为file_id和block_id,这里写shell脚本实现把segment header dump 内容转化为类似dba_extents记录,方便在某些不能open的库中分析某个异常block是否属于某个表

#! /bin/bash
dec2bin(){
  val_16=$1
  ((num=$val_16));
  val=`echo $num`
  local base=$2
  [ $val -eq 0 ] && bin=0
if [ $val -ge $base ]; then
    dec2bin $val $((base*2))
    if [ $val -ge $base ]; then
      val=$(($val-$base))
      bin=${bin}1
    else
      bin=${bin}0
    fi
  fi
  [ $base -eq 1 ] && printf  $bin
}
for i in `grep "length:" $1 |awk '{print $1 $3}'`;
do
rdba=`echo ${i:0:10}`
blocks=`echo ${i:10}`
echo -n "rdba:"$rdba"    "
bin2=`dec2bin $rdba  1`
len=`expr length $bin2`
len_gd=22
len_jg=`expr $len - $len_gd`
file_no_2=`echo ${bin2:0:$len_jg}`
((file_no=2#$file_no_2))
echo -n "file_id:"$file_no"    "
block_no_2=`echo ${bin2:$len_jg}`
((block_no=2#$block_no_2))
echo -n "block_id:"$block_no"    "
echo  "blocks:"$blocks
done;

trace文件中部分信息

  -----------------------------------------------------------------
   0x00400901  length: 7
   0x00402e10  length: 8
   0x00402e60  length: 8
   0x00402e68  length: 8
   0x00402ea0  length: 8
   0x00402f20  length: 8
   0x00402f48  length: 8
   0x00403050  length: 8
   0x00403180  length: 8
   0x00403b38  length: 8
   0x00404c48  length: 8
   0x00404c78  length: 8
   0x00404cf8  length: 8
   0x00404da8  length: 8
   0x00404db8  length: 8
   0x00404de8  length: 8
   0x00404e80  length: 128
   0x00405900  length: 128
   0x00406500  length: 128
   0x00406980  length: 128
   0x00407480  length: 128
   0x00407500  length: 128
   0x00407680  length: 128
   0x00407800  length: 128
   0x00407880  length: 128
   0x00407a00  length: 128
   0x00407a80  length: 128
   0x00407c80  length: 128
…………

执行结果

[oracle@xifenfei tmp]$ ./get_extent.sh /u01/app/oracle/diag/rdbms/cdb/cdb/trace/cdb_ora_29565.trc
rdba:0x00400901    file_id:1    block_id:2305     blocks:7
rdba:0x00402e10    file_id:1    block_id:11792     blocks:8
rdba:0x00402e60    file_id:1    block_id:11872     blocks:8
rdba:0x00402e68    file_id:1    block_id:11880     blocks:8
rdba:0x00402ea0    file_id:1    block_id:11936     blocks:8
rdba:0x00402f20    file_id:1    block_id:12064     blocks:8
rdba:0x00402f48    file_id:1    block_id:12104     blocks:8
rdba:0x00403050    file_id:1    block_id:12368     blocks:8
rdba:0x00403180    file_id:1    block_id:12672     blocks:8
rdba:0x00403b38    file_id:1    block_id:15160     blocks:8
rdba:0x00404c48    file_id:1    block_id:19528     blocks:8
rdba:0x00404c78    file_id:1    block_id:19576     blocks:8
rdba:0x00404cf8    file_id:1    block_id:19704     blocks:8
rdba:0x00404da8    file_id:1    block_id:19880     blocks:8
rdba:0x00404db8    file_id:1    block_id:19896     blocks:8
rdba:0x00404de8    file_id:1    block_id:19944     blocks:8
rdba:0x00404e80    file_id:1    block_id:20096     blocks:128
rdba:0x00405900    file_id:1    block_id:22784     blocks:128
rdba:0x00406500    file_id:1    block_id:25856     blocks:128
rdba:0x00406980    file_id:1    block_id:27008     blocks:128
rdba:0x00407480    file_id:1    block_id:29824     blocks:128
rdba:0x00407500    file_id:1    block_id:29952     blocks:128
rdba:0x00407680    file_id:1    block_id:30336     blocks:128
rdba:0x00407800    file_id:1    block_id:30720     blocks:128
…………

记录一次ORA-600 4000数据库故障恢复

ORA-600[4000]错误
一朋友数据库因为当前redo丢失,在恢复的过程中启动报ORA-600[4000]错误

SMON: enabling cache recovery
Thu May 30 16:24:17 2013
Errors in file /u02/oracle/app/oracle/admin/xifenfei/udump/xifenfei1_ora_1458370.trc:
ORA-00600: internal error code, arguments: [4000], [83], [], [], [], [], [], []
Thu May 30 16:24:19 2013
Errors in file /u02/oracle/app/oracle/admin/xifenfei/udump/xifenfei1_ora_1458370.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00600: internal error code, arguments: [4000], [83], [], [], [], [], [], []
Thu May 30 16:24:19 2013
Error 704 happened during db open, shutting down database
USER: terminating instance due to error 704
Instance terminated by USER, pid = 1458370
ORA-1092 signalled during: alter database open resetlogs...

分析trace文件

*** 2013-05-30 16:24:17.979
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [4000], [83], [], [], [], [], [], []
Current SQL statement for this session:
select ctime, mtime, stime from obj$ where obj# = :1
--确定是obj$对象异常,通过某种手段找到obj$的objid和dataobjid均为16,对应16进制为12
Block header dump:  0x0040007a
 Object id on Block? Y
 seg/obj: 0x12  csc: 0xc1e.a329e76f  itc: 1  flg: -  typ: 1 - DATA
     fsl: 0  fnx: 0x0 ver: 0x01
 Itl           Xid                  Uba         Flag  Lck        Scn/Fsc
0x01   0x0053.02a.000598bd  0x0d407e46.4f52.2f  --U-    1  fsc 0x0000.a329e772

这里比较明显obj$对象在rdba为0040007a的block上,scn为0c1e.a329e76f(13325725984623)且未提交的事务,这样的现象就决定了处理的特殊性(不是因为块延迟清理导致访问undo现象,该现象直接推进scn解决,而该情况不行)

数据文件头scn

SQL> SELECT DISTINCT CHECKPOINT# FROM V$DATAFILE_HEADER;
    CHECKPOINT_CHANGE#
-------------------------
           13324676536960

bbed查看文件头scn

   struct kcvfhckp, 160 bytes               @484
      struct kcvcpscn, 8 bytes              @484
         ub4 kscnbas                        @484      0x649c9a80
         ub2 kscnwrp                        @488      0x0c1e

这里看到的文件头scn也是为13324676536960(0c1e.649c9a80)和sql查询结果一致,也就是说数据库中的obj$的某个对象含有事务,且scn大于文件头scn(因为当前redo丢失,无法前滚,所以出现该情况),当数据库访问obj$的时候,为了事务的一致性,就需要访问undo(这里提示为83 回滚段),而undo异常,所以smon进程回滚失败,数据库无法正常启动

使用bbed提交事务

BBED> map
 File: /oradata/sys/xifenfei/system01.dbf (1)
 Block: 122                                   Dba:0x0040007a
------------------------------------------------------------
 KTB Data Block (Table/Cluster)
 struct kcbh, 20 bytes                      @0
 struct ktbbh, 48 bytes                     @20
 struct kdbh, 14 bytes                      @68
 struct kdbt[1], 4 bytes                    @82
 sb2 kdbr[108]                              @86
 ub1 freespace[802]                         @302
 ub1 rowdata[7084]                          @1104
 ub4 tailchk                                @8188
BBED> p ktbbh
struct ktbbh, 48 bytes                      @20
   ub1 ktbbhtyp                             @20       0x01 (KDDBTDATA)
   union ktbbhsid, 4 bytes                  @24
      ub4 ktbbhsg1                          @24       0x00000012
      ub4 ktbbhod1                          @24       0x00000012
   struct ktbbhcsc, 8 bytes                 @28
      ub4 kscnbas                           @28       0xa329e76f
      ub2 kscnwrp                           @32       0x0c1e
   b2 ktbbhict                              @36       1
   ub1 ktbbhflg                             @38       0x02 (NONE)
   ub1 ktbbhfsl                             @39       0x00
   ub4 ktbbhfnx                             @40       0x00000000
   struct ktbbhitl[0], 24 bytes             @44
      struct ktbitxid, 8 bytes              @44
         ub2 kxidusn                        @44       0x0053
         ub2 kxidslt                        @46       0x002a
         ub4 kxidsqn                        @48       0x000598bd
      struct ktbituba, 8 bytes              @52
         ub4 kubadba                        @52       0x0d407e46
         ub2 kubaseq                        @56       0x4f52
         ub1 kubarec                        @58       0x2f
      ub2 ktbitflg                          @60       0x2001 (KTBFUPB)<--需要提交
      union _ktbitun, 2 bytes               @62
         b2 _ktbitfsc                       @62       0
         ub2 _ktbitwrp                      @62       0x0000
      ub4 ktbitbas                          @64       0xa329e772
BBED> set count 32
        COUNT           32
BBED> set offset 60
        OFFSET          60
BBED> d
 File: /oradata/sys/xifenfei/system01.dbf (1)
 Block: 122              Offsets:   60 to   91           Dba:0x0040007a
------------------------------------------------------------------------
 20010000 a329e772 0001006c ffff00ea 040c0368 03680000 006c1f7c 1f3c1efb
 <32 bytes per line>
BBED> m /x 8001
 File: /oradata/sys/xifenfei/system01.dbf (1)
 Block: 122              Offsets:   60 to   91           Dba:0x0040007a
------------------------------------------------------------------------
 80010000 a329e772 0001006c ffff00ea 040c0368 03680000 006c1f7c 1f3c1efb
 <32 bytes per line>
BBED> sum apply
Check value for File 1, Block 122:
current = 0xafd6, required = 0xafd6

尝试open数据库ORA-600[2662]解决

Thu May 30 21:16:00 2013
Errors in file /u02/oracle/app/oracle/admin/xifenfei/bdump/xifenfei1_smon_819664.trc:
ORA-00600: internal error code, arguments:[2662],[3102],[2737532996],[3102],[2745973074],[4194397],[],[]
Non-fatal internal error happenned while SMON was doing non-existent object cleanup.
SMON encountered 1 out of maximum 100 non-fatal internal errors.
Thu May 30 21:16:01 2013
Trace dumping is performing id=[cdmp_20130530211601]
Thu May 30 21:16:02 2013
Errors in file /u02/oracle/app/oracle/admin/xifenfei/bdump/xifenfei1_smon_819664.trc:
ORA-00600: internal error code, arguments:[2662],[3102],[2737532997],[3102],[2745973074],[4194397],[],[]
Thu May 30 21:16:03 2013
Non-fatal internal error happenned while SMON was doing logging scn->time mapping.
SMON encountered 2 out of maximum 100 non-fatal internal errors.
Thu May 30 21:16:05 2013
Errors in file /u02/oracle/app/oracle/admin/xifenfei/bdump/xifenfei1_smon_819664.trc:
ORA-00600: internal error code, arguments:[2662],[3102],[2737532997],[3102],[2745973074],[4194397],[],[]
Thu May 30 21:16:08 2013
Errors in file /u02/oracle/app/oracle/admin/xifenfei/bdump/xifenfei1_pmon_958764.trc:
ORA-00474: SMON process terminated with error
Thu May 30 21:16:08 2013
PMON: terminating instance due to error 474

数据库在open过程中遇到大量ORA-00600[2662],这个是因为数据库中文件头的scn小于访问的数据块scn导致该问题,解决方法推荐scn,如果数据库的scn本身就很大(和时间理论scn较接近),推进过程中可能遇到如下错误,这个时候就需要选择合适的方法/合适的值来推进scn

SQL> startup pfile=/home/oracle/pfile force
ORACLE instance started.
Total System Global Area 5.5835E+10 bytes
Fixed Size                  2177056 bytes
Variable Size            3.2867E+10 bytes
Database Buffers         2.2951E+10 bytes
Redo Buffers               14598144 bytes
Database mounted.
ORA-01052: required destination LOG_ARCHIVE_DUPLEX_DEST is not specified

后面的工作因为没有redo前滚,而且该库故障时有大量事务在跑,现在无法前滚,导致大量的undo回滚段异常,index和data不一致等故障,需要做的就是屏蔽undo seg,重建undo,重建库

Viewing Information About CDBs and PDBs

在ORACLE 12C中引入了CDB和PDB的概念,实现了ORACLE数据库的可插拔,在一个CDB数据库中,有多个PDB,而每一个PDB又可以理解为一个独立的传统ORACLE 数据库,那为了能够通过一个sql查询获得整个CDB数据库的信息,ORALCE 引入了CDB_*开头的视图,该视图就是在传统的DBA_*视图基础之上增加了CON_ID,用来区分不同的PDB,从而实现了一个简单sql查询在有足够权限的情况下,可以查询所有PDB中信息
ORACLE 12C版本

SQL> select * from v$version;
BANNER                                                                               CON_ID
-------------------------------------------------------------------------------- ----------
Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production              0
PL/SQL Release 12.1.0.1.0 - Production                                                    0
CORE    12.1.0.1.0      Production                                                        0
TNS for Linux: Version 12.1.0.1.0 - Production                                            0
NLSRTL Version 12.1.0.1.0 - Production                                                    0

确定是否是CDB

SQL> SELECT CDB FROM V$DATABASE;
CDB
---
YES

YES表示该数据库是CDB,如果是NO表示是NO-CDB(普通数据库)

CDB中各容器信息

SQL> COLUMN NAME FORMAT A8
SQL> SELECT NAME, CON_ID, DBID, CON_UID, GUID FROM V$CONTAINERS ORDER BY CON_ID;
NAME         CON_ID       DBID    CON_UID GUID
-------- ---------- ---------- ---------- --------------------------------
CDB$ROOT          1 1922813718          1 DB85A3D39F8E7703E0431CAAE80A8C44
PDB$SEED          2 4048821679 4048821679 D49B80694E4A449BE0430100007F906F
PDB1              3 3313918585 3313918585 D49BF37938FB4C10E0430100007F6CBD
PDB2              4 3872456618 3872456618 D49BFA33332F4C3EE0430100007FA059
SQL> select CON_NAME_TO_ID('PDB1') FROM DUAL;
CON_NAME_TO_ID('PDB1')
----------------------
                     3
SQL> SELECT CON_DBID_TO_ID(3313918585) FROM DUAL;
CON_DBID_TO_ID(3313918585)
--------------------------
                         3
SQL> SELECT CON_UID_TO_ID(3313918585) FROM DUAL;
CON_UID_TO_ID(3313918585)
-------------------------
                        3

PDB部分信息

SQL> COLUMN PDB_NAME FORMAT A15
SQL> SELECT PDB_ID, PDB_NAME, STATUS FROM DBA_PDBS ORDER BY PDB_ID;
    PDB_ID PDB_NAME        STATUS
---------- --------------- -------------
         2 PDB$SEED        NORMAL
         3 PDB1            NORMAL
         4 PDB2            NORMAL
SQL> COLUMN NAME FORMAT A15
SQL> COLUMN RESTRICTED FORMAT A10
SQL> COLUMN OPEN_TIME FORMAT A30
SQL> SELECT NAME, OPEN_MODE, RESTRICTED, OPEN_TIME FROM V$PDBS;
NAME            OPEN_MODE  RESTRICTED OPEN_TIME
--------------- ---------- ---------- ------------------------------
PDB$SEED        READ ONLY  NO         12-MAY-13 08.51.53.177 AM
PDB1            READ WRITE NO         12-MAY-13 09.35.48.646 AM
PDB2            MOUNTED               12-MAY-13 08.56.59.859 AM

CDB中查询对象信息

SQL> COLUMN PDB_NAME FORMAT A15
SQL> COLUMN OWNER FORMAT A15
SQL> COLUMN TABLE_NAME FORMAT A30
SQL> SELECT p.PDB_ID, p.PDB_NAME, t.OWNER, t.TABLE_NAME
  2  FROM DBA_PDBS p, CDB_TABLES t
  3  WHERE p.PDB_ID > 2
  4  AND T.TABLE_NAME='COL$'
  5  AND p.PDB_ID = t.CON_ID
  6  ORDER BY p.PDB_ID;
    PDB_ID PDB_NAME        OWNER           TABLE_NAME
---------- --------------- --------------- ------------------------------
         3 PDB1            SYS             COL$
         4 PDB2            SYS             COL$

查询在CDB中的PDB数据/临时文件信息

SQL> COLUMN PDB_ID FORMAT 999
SQL> COLUMN PDB_NAME FORMAT A8
SQL> COLUMN FILE_ID FORMAT 9999
SQL> COLUMN TABLESPACE_NAME FORMAT A10
SQL> COLUMN FILE_NAME FORMAT A45
SQL> SELECT p.PDB_ID, p.PDB_NAME, d.FILE_ID, d.TABLESPACE_NAME, d.FILE_NAME
  2  FROM DBA_PDBS p, CDB_DATA_FILES d
  3  WHERE p.PDB_ID = d.CON_ID
  4  ORDER BY p.PDB_ID;
PDB_ID PDB_NAME FILE_ID TABLESPACE FILE_NAME
------ -------- ------- ---------- ---------------------------------------------
     2 PDB$SEED       5 SYSTEM     +DATA/pdb$seed_system01.dbf
     2 PDB$SEED       7 SYSAUX     +DATA/pdb$seed_sysaux01.dbf
     3 PDB1           9 SYSAUX     +DATA/pdb1_pdb$seed_sysaux01.dbf
     3 PDB1          10 USERS      +DATA/cdb/pdb1_users01.dbf
     3 PDB1           8 SYSTEM     +DATA/pdb1_pdb$seed_system01.dbf
     4 PDB2          13 USERS      +DATA/cdb/pdb2_users01.dbf
     4 PDB2          12 SYSAUX     +DATA/pdb2_pdb$seed_sysaux01.dbf
     4 PDB2          11 SYSTEM     +DATA/pdb2_pdb$seed_system01.dbf
8 rows selected.
SQL> COLUMN CON_ID FORMAT 999
SQL> COLUMN FILE_ID FORMAT 9999
SQL> COLUMN TABLESPACE_NAME FORMAT A15
SQL> COLUMN FILE_NAME FORMAT A45
SQL> SELECT CON_ID, FILE_ID, TABLESPACE_NAME, FILE_NAME
  2  FROM CDB_TEMP_FILES
  3  ORDER BY CON_ID;
CON_ID FILE_ID TABLESPACE_NAME FILE_NAME
------ ------- --------------- ---------------------------------------------
     1       1 TEMP            +DATA/cdb/temp01.dbf
     2       2 TEMP            +DATA/pdbseed_temp01.dbf
     3       3 TEMP            +DATA/pdb1_temp01.dbf
     4       4 TEMP            +DATA/pdb2_temp01.dbf

查询PDB的service信息

SQL> COLUMN NETWORK_NAME FORMAT A30
SQL> COLUMN PDB FORMAT A15
SQL> COLUMN CON_ID FORMAT 999
SQL> SELECT PDB, NETWORK_NAME, CON_ID FROM CDB_SERVICES
  2  WHERE PDB IS NOT NULL AND
  3  CON_ID > 2
  4  ORDER BY PDB;
PDB             NETWORK_NAME                   CON_ID
--------------- ------------------------------ ------
PDB1            pdb1                                3
PDB2            pdb2                                4

PDB中可以修改参数

SELECT NAME FROM V$SYSTEM_PARAMETER
WHERE ISPDB_MODIFIABLE = 'TRUE'
ORDER BY NAME;

查看PDB历史信息

SQL> COLUMN DB_NAME FORMAT A10
SQL> COLUMN CON_ID FORMAT 999
SQL> COLUMN PDB_NAME FORMAT A15
SQL> COLUMN OPERATION FORMAT A16
SQL> COLUMN OP_TIMESTAMP FORMAT A10
SQL> COLUMN CLONED_FROM_PDB_NAME FORMAT A15
SQL> SELECT DB_NAME, CON_ID, PDB_NAME, OPERATION, OP_TIMESTAMP, CLONED_FROM_PDB_NAME
  2  FROM CDB_PDB_HISTORY
  3  WHERE CON_ID > 2
  4  ORDER BY CON_ID;
DB_NAME    CON_ID PDB_NAME        OPERATION        OP_TIMESTA CLONED_FROM_PDB
---------- ------ --------------- ---------------- ---------- ---------------
SEEDDATA        3 PDB$SEED        UNPLUG           29-APR-13
CDB             3 PDB1            CREATE           01-FEB-13  PDB$SEED
CDB             3 PDB$SEED        PLUG             01-FEB-13  PDB$SEED
SEEDDATA        4 PDB$SEED        UNPLUG           29-APR-13
CDB             4 PDB2            CREATE           01-FEB-13  PDB$SEED
CDB             4 PDB$SEED        PLUG             01-FEB-13  PDB$SEED
6 rows selected.

ORACLE 12C PDB 维护基础介绍

CDB和PDB是ORACLE 12C一个很亮的新特性,由于他们的引入导致传统的ORACLE数据库管理理念不少发生了改变,这里列举了部分最基本的cdb和pdb管理方式
cdb和pdb关系图

ORACLE 12C版本

SQL> select * from v$version;
BANNER                                                                               CON_ID
-------------------------------------------------------------------------------- ----------
Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production              0
PL/SQL Release 12.1.0.1.0 - Production                                                    0
CORE    12.1.0.1.0      Production                                                        0
TNS for Linux: Version 12.1.0.1.0 - Production                                            0
NLSRTL Version 12.1.0.1.0 - Production                                                    0

启动关闭pdb

SQL> startup
ORACLE instance started.
Total System Global Area  597098496 bytes
Fixed Size                  2291072 bytes
Variable Size             272632448 bytes
Database Buffers          314572800 bytes
Redo Buffers                7602176 bytes
Database mounted.
Database opened.
SQL> select con_id,dbid,NAME,OPEN_MODE from v$pdbs;
    CON_ID       DBID NAME                           OPEN_MODE
---------- ---------- ------------------------------ ----------
         2 4048821679 PDB$SEED                       READ ONLY
         3 3313918585 PDB1                           MOUNTED
         4 3872456618 PDB2                           MOUNTED
SQL> alter PLUGGABLE database pdb1 open;
Pluggable database altered.
SQL> select con_id,dbid,NAME,OPEN_MODE from v$pdbs;
    CON_ID       DBID NAME                           OPEN_MODE
---------- ---------- ------------------------------ ----------
         2 4048821679 PDB$SEED                       READ ONLY
         3 3313918585 PDB1                           READ WRITE
         4 3872456618 PDB2                           MOUNTED
SQL> alter PLUGGABLE database pdb1 close;
Pluggable database altered.
SQL> select con_id,dbid,NAME,OPEN_MODE from v$pdbs;
    CON_ID       DBID NAME                           OPEN_MODE
---------- ---------- ------------------------------ ----------
         2 4048821679 PDB$SEED                       READ ONLY
         3 3313918585 PDB1                           MOUNTED
         4 3872456618 PDB2                           MOUNTED
SQL> alter PLUGGABLE database all open;
Pluggable database altered.
SQL> select con_id,dbid,NAME,OPEN_MODE from v$pdbs;
    CON_ID       DBID NAME                           OPEN_MODE
---------- ---------- ------------------------------ ----------
         2 4048821679 PDB$SEED                       READ ONLY
         3 3313918585 PDB1                           READ WRITE
         4 3872456618 PDB2                           READ WRITE
SQL> alter PLUGGABLE database all close;
Pluggable database altered.
SQL> select con_id,dbid,NAME,OPEN_MODE from v$pdbs;
    CON_ID       DBID NAME                           OPEN_MODE
---------- ---------- ------------------------------ ----------
         2 4048821679 PDB$SEED                       READ ONLY
         3 3313918585 PDB1                           MOUNTED
         4 3872456618 PDB2                           MOUNTED
SQL> alter session set container=pdb1;
Session altered.
SQL> startup
Pluggable Database opened.
SQL>  select con_id,dbid,NAME,OPEN_MODE from v$pdbs;
    CON_ID       DBID NAME                           OPEN_MODE
---------- ---------- ------------------------------ ----------
         3 3313918585 PDB1                           READ WRITE

pdb的管理可以在cdb中进行也可以在pdb中进行,如果是cdb中进行,需要PLUGGABLE关键字,如果是pdb中直接和普通数据库一样

登录pdb

[oracle@xifenfei ~]$ lsnrctl status
LSNRCTL for Linux: Version 12.1.0.1.0 - Production on 12-MAY-2013 08:07:02
Copyright (c) 1991, 2013, Oracle.  All rights reserved.
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=xifenfei)(PORT=1521)))
STATUS of the LISTENER
------------------------
Alias                     LISTENER
Version                   TNSLSNR for Linux: Version 12.1.0.1.0 - Production
Start Date                11-MAY-2013 18:30:54
Uptime                    0 days 13 hr. 36 min. 8 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
Listener Parameter File   /u01/app/grid/product/12.1/network/admin/listener.ora
Listener Log File         /u01/app/grid/diag/tnslsnr/xifenfei/listener/alert/log.xml
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=EXTPROC1521)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=xifenfei)(PORT=1521)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcps)(HOST=xifenfei)(PORT=5500))
(Security=(my_wallet_directory=/u01/oracle/12.1/db_1/admin/cdb/xdb_wallet))(Presentation=HTTP)(Session=RAW))
Services Summary...
Service "+ASM" has 1 instance(s).
  Instance "+ASM", status READY, has 1 handler(s) for this service...
Service "cdb" has 1 instance(s).
  Instance "cdb", status READY, has 1 handler(s) for this service...
Service "cdbXDB" has 1 instance(s).
  Instance "cdb", status READY, has 1 handler(s) for this service...
Service "pdb1" has 1 instance(s).
  Instance "cdb", status READY, has 1 handler(s) for this service...
Service "pdb2" has 1 instance(s).
  Instance "cdb", status READY, has 1 handler(s) for this service...
The command completed successfully
[oracle@xifenfei ~]$ tnsping pdb1
TNS Ping Utility for Linux: Version 12.1.0.1.0 - Production on 12-MAY-2013 08:07:09
Copyright (c) 1997, 2013, Oracle.  All rights reserved.
Used parameter files:
Used TNSNAMES adapter to resolve the alias
Attempting to contact (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = xifenfei)
(PORT = 1521)) (CONNECT_DATA = (SERVER = DEDICATED) (SERVICE_NAME = pdb1)))
OK (20 msec)
[oracle@xifenfei ~]$ sqlplus sys/xifenfei@pdb1 as sysdba
SQL*Plus: Release 12.1.0.1.0 Production on Sun May 12 08:08:02 2013
Copyright (c) 1982, 2013, Oracle.  All rights reserved.
Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production
With the Partitioning, Automatic Storage Management, OLAP, Advanced Analytics
and Real Application Testing options
SQL> show con_name;
CON_NAME
------------------------------
PDB1
[oracle@xifenfei ~]$ sqlplus / as sysdba
SQL*Plus: Release 12.1.0.1.0 Production on Sun May 12 08:09:14 2013
Copyright (c) 1982, 2013, Oracle.  All rights reserved.
Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production
With the Partitioning, Automatic Storage Management, OLAP, Advanced Analytics
and Real Application Testing options
SQL> alter session set container=pdb1;
Session altered.
SQL> show con_name;
CON_NAME
------------------------------
PDB1

pdb可以通过alter session container进入也可以直接通过tns方式登录

创建用户

SQL> show con_name;
CON_NAME
------------------------------
CDB$ROOT
SQL> select con_id,dbid,NAME,OPEN_MODE from v$pdbs;
    CON_ID       DBID NAME                           OPEN_MODE
---------- ---------- ------------------------------ ----------
         2 4048821679 PDB$SEED                       READ ONLY
         3 3313918585 PDB1                           READ WRITE
         4 3872456618 PDB2                           MOUNTED
SQL> create user xff identified by xifenfei;
create user xff identified by xifenfei
            *
ERROR at line 1:
ORA-65096: invalid common user or role name
SQL> !oerr ora 65096
65096, 00000, "invalid common user or role name"
// *Cause:  An attempt was made to create a common user or role with a name
//          that wass not valid for common users or roles.  In addition to
//          the usual rules for user and role names, common user and role
//          names must start with C## or c## and consist only of ASCII
//          characters.
// *Action: Specify a valid common user or role name.
//
SQL> create user c##xff identified by xifenfei;
User created.
SQL> SELECT USERNAME,CON_ID,USER_ID FROM CDB_USERS WHERE USERNAME='C##XFF';
USERNAME       CON_ID    USER_ID
---------- ---------- ----------
C##XFF              1        103
C##XFF              3        104
SQL> alter session set container=pdb1;
Session altered.
SQL> show con_name
CON_NAME
------------------------------
PDB1
SQL> create user xff identified by xifenfei;
User created.
SQL> create user c##abc identified by xifenfei;
create user c##abc identified by xifenfei
            *
ERROR at line 1:
ORA-65094: invalid local user or role name

创建用户默认的是container=all,在cdb中只能创建全局用户(c##开头),会在cdb和所有的pdb中创建该用户(但是pdb中的全局用户需要另外授权才能够在pdb中访问)。在pdb中只能创建的用户为本地用户

用户授权

SQL> grant connect to c##xff;
Grant succeeded.
SQL> select GRANTEE,con_id from cdb_ROLE_PRIVS where GRANTED_ROLE='CONNECT' AND GRANTEE='C##XFF';
GRANTEE                            CON_ID
------------------------------ ----------
C##XFF                                  1
SQL> grant resource to c##xff container=all;
Grant succeeded.
SQL>  select GRANTEE,con_id from cdb_ROLE_PRIVS where GRANTED_ROLE='RESOURCE' AND  GRANTEE='C##XFF';
GRANTEE                            CON_ID
------------------------------ ----------
C##XFF                                  1
C##XFF                                  3

用户授权默认情况下是只会给当前container,在cdb中也可以指定container=all,对所有open的pdb且存在该用户都进行授权

修改参数

SQL> alter system set open_cursors=500 container=all;
System altered.
SQL> conn sys/xifenfei@pdb1 as sysdba
Connected.
SQL> show parameter open_cursors;
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
open_cursors                         integer     500
SQL>  alter system set open_cursors=100;
 alter system set open_cursors=100
*
ERROR at line 1:
ORA-01219: database or pluggable database not open: queries allowed on fixed
tables or views only
SQL> alter database open;
Database altered.
SQL> alter system set open_cursors=100;
System altered.
SQL>  show parameter open_cursors;
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
open_cursors                         integer     100
SQL> conn / as sysdba
Connected.
SQL> show parameter open_cursors;
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
open_cursors                         integer     500

这里可以看到在cdb中修改,pdb会继承进去;如果在pdb中修改会覆盖pdb从cdb中继承的参数含义

乱用_allow_resetlogs_corruption参数导致悲剧

一个朋友11.2.0.1的数据库因为断电,出现不能正常open问题,自己尝试恢复,折腾了几天,最后让我帮忙的时候错误如下

SQL> startup
ORACLE 例程已经启动。
Total System Global Area  778387456 bytes
Fixed Size                  1374808 bytes
Variable Size             545260968 bytes
Database Buffers          226492416 bytes
Redo Buffers                5259264 bytes
数据库装载完毕。
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00604: error occurred at recursive SQL level 1
ORA-01578: ORACLE data block corrupted (file # 1, block # 225)
ORA-01110: data file 1: 'F:\APP\ADMINISTRATOR\ORADATA\YFCLOUD\SYSTEM01.DBF'
进程 ID: 5964
会话 ID: 1144 序列号: 5

从启动的日志提示看初步判断就是悲剧了,因为根据经验值在11gr2版本中,该错误就是undo$(分析trace文件进步一确定是undo$),该block出现异常,数据库在启动的时候要扫描该表,把相关的回滚段给online起来,现在他异常了,数据库肯定无法正常启动
dbv检查数据库文件

F:\>dbv file='F:\APP\ADMINISTRATOR\ORADATA\YFCLOUD\SYSTEM01.DBF'
DBVERIFY: Release 11.2.0.1.0 - Production on 星期三 5月 22 11:06:00 2013
Copyright (c) 1982, 2009, Oracle and/or its affiliates.  All rights reserved.
DBVERIFY - 开始验证: FILE = F:\APP\ADMINISTRATOR\ORADATA\YFCLOUD\SYSTEM01.DBF
页 225 流入 - 很可能是介质损坏
Corrupt block relative dba: 0x004000e1 (file 1, block 225)
Fractured block found during dbv:
Data in bad block:
 type: 6 format: 2 rdba: 0x004000e1
 last change scn: 0x0000.00d65120 seq: 0x1 flg: 0x06
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0xb98e0601
 check value in block header: 0xb307
 computed block checksum: 0xe8ae
DBVERIFY - 验证完成
检查的页总数: 134400
处理的页总数 (数据): 98226
失败的页总数 (数据): 0
处理的页总数 (索引): 14189
失败的页总数 (索引): 0
处理的页总数 (其他): 4178
处理的总页数 (段)  : 1
失败的总页数 (段)  : 0
空的页总数: 17806
标记为损坏的总页数: 1
流入的页总数: 1
加密的总页数        : 0
最高块 SCN            : 14045769 (0.14045769)

看到这里,可以确定坏块的存在,根据上面的提示,我们发现tailchk值不正确,应该是5120+06+01,而不该是b98e0601,通过bbed查看

BBED> p kcbh
struct kcbh, 20 bytes                       @0
   ub1 type_kcbh                            @0        0x06
   ub1 frmt_kcbh                            @1        0xa2
   ub1 spare1_kcbh                          @2        0x00
   ub1 spare2_kcbh                          @3        0x00
   ub4 rdba_kcbh                            @4        0x004000e1
   ub4 bas_kcbh                             @8        0x00d65120
   ub2 wrp_kcbh                             @12       0x0000
   ub1 seq_kcbh                             @14       0x01
   ub1 flg_kcbh                             @15       0x06 (KCBHFDLC, KCBHFCKV)
   ub2 chkval_kcbh                          @16       0x5ba9
   ub2 spare3_kcbh                          @18       0x0000
BBED> p tailchk
ub4 tailchk                                 @8188     0xb98e0601

进一步证明是tailchk异常导致,分析alert日志,数据库异常断电,然后启动的时候发现如下错误

Recovery of Online Redo Log: Thread 1 Group 2 Seq 431 Reading mem 0
  Mem# 0: F:\APP\ADMINISTRATOR\ORADATA\YFCLOUD\REDO02.LOG
RECOVERY OF THREAD 1 STUCK AT BLOCK 451 OF FILE 3
Aborting crash recovery due to error 1172
Errors in file f:\app\administrator\diag\rdbms\yfcloud\yfcloud\trace\yfcloud_ora_772.trc:
ORA-01172: 线程 1 的恢复停止在块 451 (在文件 3 中)
ORA-01151: 如果需要, 请使用介质恢复以恢复块和还原备份
Errors in file f:\app\administrator\diag\rdbms\yfcloud\yfcloud\trace\yfcloud_ora_772.trc:
ORA-01172: 线程 1 的恢复停止在块 451 (在文件 3 中)
ORA-01151: 如果需要, 请使用介质恢复以恢复块和还原备份
ORA-1172 signalled during: alter database open...
Tue May 21 14:27:29 2013
ALTER DATABASE RECOVER  datafile 3
Media Recovery Start
Serial Media Recovery started
Recovery of Online Redo Log: Thread 1 Group 2 Seq 431 Reading mem 0
  Mem# 0: F:\APP\ADMINISTRATOR\ORADATA\YFCLOUD\REDO02.LOG
Errors in file f:\app\administrator\diag\rdbms\yfcloud\yfcloud\trace\yfcloud_ora_772.trc  (incident=112164):
ORA-00600: 内部错误代码, 参数: [3020], [3], [451], [12583363], [], [], [], [], [], [], [], []
ORA-10567: Redo is inconsistent with data block (file# 3, block# 451, file offset is 3694592 bytes)
ORA-10564: tablespace UNDOTBS1
ORA-01110: 数据文件 3: 'F:\APP\ADMINISTRATOR\ORADATA\YFCLOUD\UNDOTBS01.DBF'
ORA-10560: block type 'KTU UNDO BLOCK'
Media Recovery failed with error 600
ORA-283 signalled during: ALTER DATABASE RECOVER  datafile 3  ...

因为file# 3, block# 451和redo信息不一致,出现ora-600[3020]错误,而file# 3为undo文件,朋友从而设置undo_management=’manual’并设置了_allow_resetlogs_corruption=true,然后进行不完全恢复,从而出现了如下错误提示

Tue May 21 14:41:23 2013
SMON: enabling cache recovery
Corrupt block relative dba: 0x004000e1 (file 1, block 225)
Fractured block found during buffer read
Data in bad block:
 type: 6 format: 2 rdba: 0x004000e1
 last change scn: 0x0000.00d65120 seq: 0x1 flg: 0x06
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0xb98e0601
 check value in block header: 0xb307
 computed block checksum: 0xe8ae
Reading datafile 'F:\APP\ADMINISTRATOR\ORADATA\YFCLOUD\SYSTEM01.DBF'
for corruption at rdba: 0x004000e1 (file 1, block 225)
Reread (file 1, block 225) found same corrupt data
Errors in file f:\app\administrator\diag\rdbms\yfcloud\yfcloud\trace\yfcloud_ora_4892.trc  (incident=120165):
ORA-01578: ORACLE 数据块损坏 (文件号 1, 块号 225)
ORA-01110: 数据文件 1: 'F:\APP\ADMINISTRATOR\ORADATA\YFCLOUD\SYSTEM01.DBF'
Errors in file f:\app\administrator\diag\rdbms\yfcloud\yfcloud\trace\yfcloud_ora_4892.trc:
ORA-00604: 递归 SQL 级别 1 出现错误
ORA-01578: ORACLE 数据块损坏 (文件号 1, 块号 225)
ORA-01110: 数据文件 1: 'F:\APP\ADMINISTRATOR\ORADATA\YFCLOUD\SYSTEM01.DBF'
Errors in file f:\app\administrator\diag\rdbms\yfcloud\yfcloud\trace\yfcloud_ora_4892.trc:
ORA-00604: 递归 SQL 级别 1 出现错误
ORA-01578: ORACLE 数据块损坏 (文件号 1, 块号 225)
ORA-01110: 数据文件 1: 'F:\APP\ADMINISTRATOR\ORADATA\YFCLOUD\SYSTEM01.DBF'
Error 604 happened during db open, shutting down database
USER (ospid: 4892): terminating the instance due to error 604

从而的原因基本上可以从操作过程中了解到:数据库是因为file# 3 block# 451和redo不一致导致问题,而恢复的操作人员冲动的使用了_allow_resetlogs_corruption参数,从而使得数据库出现了不一致性,也就是导致file# 1 block# 225坏块的根本原因,针对这样的情况,完全没有到使用_allow_resetlogs_corruption隐含参数地步

使用bbed修改tailchk

BBED> p tailchk
ub4 tailchk                                 @8188     0xb98e0601
BBED> verify
DBVERIFY - Verification starting
FILE = system01.dbf
BLOCK = 225
Block 225 is corrupt
***
Corrupt block relative dba: 0x004000e1 (file 0, block 225)
Fractured block found during verification
Data in bad block -
 type: 6 format: 2 rdba: 0x004000e1
 last change scn: 0x0000.00d65120 seq: 0x1 flg: 0x06
 consistency value in tail: 0xb98e0601
 check value in block header: 0x5ba9, computed block checksum: 0x0
 spare1: 0x0, spare2: 0x0, spare3: 0x0
***
DBVERIFY - Verification complete
Total Blocks Examined         : 1
Total Blocks Processed (Data) : 0
Total Blocks Failing   (Data) : 0
Total Blocks Processed (Index): 0
Total Blocks Failing   (Index): 0
Total Blocks Empty            : 0
Total Blocks Marked Corrupt   : 1
Total Blocks Influx           : 2
BBED> m /x 01062051
 File: system01.dbf (0)
 Block: 226              Offsets: 8188 to 8191           Dba:0x00000000
------------------------------------------------------------------------
 01062051
 <32 bytes per line>
BBED> p tailchk
ub4 tailchk                                 @8188     0x51200601
BBED> sum apply
Check value for File 0, Block 226:
current = 0xb307, required = 0xb307
BBED> verify
DBVERIFY - Verification starting
FILE = system01.dbf
BLOCK = 225
DBVERIFY - Verification complete
Total Blocks Examined         : 1
Total Blocks Processed (Data) : 1
Total Blocks Failing   (Data) : 0
Total Blocks Processed (Index): 0
Total Blocks Failing   (Index): 0
Total Blocks Empty            : 0
Total Blocks Marked Corrupt   : 0
Total Blocks Influx           : 0

bbed修改block之后,数据库直接正常打开,完成数据库恢复任务,在这里很明显是因为错误的使用了_allow_resetlogs_corruption参数,屏蔽了redo前滚导致了相关的坏块,所以大家在数据库异常恢复的时候,需要知道各个参数的意义,而不要乱使用,很可能导致不可控结果

ksuapc : ORA-1033 foreground process starts before PMON

在11.2.0.1数据库中启动出现ksuapc : ORA-1033 foreground process starts before PMON错误

Starting ORACLE instance (normal)
LICENSE_MAX_SESSION = 0
LICENSE_SESSIONS_WARNING = 0
Picked latch-free SCN scheme 3
Using LOG_ARCHIVE_DEST_1 parameter default value as USE_DB_RECOVERY_FILE_DEST
Autotune of undo retention is turned on.
IMODE=BR
ILAT =27
LICENSE_MAX_USERS = 0
SYS auditing is disabled
Starting up:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options.
Using parameter settings in server-side spfile /opt/oracle/product/11.2.0/dbhome_1/dbs/spfileora11bak.ora
System parameters with non-default values:
  processes                = 150
  nls_language             = "SIMPLIFIED CHINESE"
  nls_territory            = "CHINA"
  sga_target               = 1536M
  control_files            = "/opt/oracle/oradata/ora11bak/control01.ctl"
  control_files            = "/opt/oracle/flash_recovery_area/ora11bak/control02.ctl"
  db_block_size            = 8192
  compatible               = "11.2.0.0.0"
  db_recovery_file_dest    = "/opt/oracle/flash_recovery_area"
  db_recovery_file_dest_size= 3882M
  undo_tablespace          = "UNDOTBS1"
  remote_login_passwordfile= "EXCLUSIVE"
  db_domain                = ""
  dispatchers              = "(PROTOCOL=TCP) (SERVICE=ora11bakXDB)"
  audit_file_dest          = "/opt/oracle/admin/ora11bak/adump"
  audit_trail              = "DB"
  db_name                  = "ora11bak"
  open_cursors             = 300
  pga_aggregate_target     = 1595M
  diagnostic_dest          = "/opt/oracle"
Fri May 17 05:03:15 2013
ksuapc : ORA-1033 foreground process starts before PMON
Fri May 17 05:03:15 2013
ksuapc : ORA-1033 foreground process starts before PMON
Fri May 17 05:03:16 2013
ksuapc : ORA-1033 foreground process starts before PMON
Fri May 17 05:03:16 2013
ksuapc : ORA-1033 foreground process starts before PMON
Fri May 17 05:03:16 2013
ksuapc : ORA-1033 foreground process starts before PMON
Fri May 17 05:03:16 2013
ksuapc : ORA-1033 foreground process starts before PMON
…………

该错误的原因是数据库在启动过程中有前台进程连接数据库导致,该现象是数据库bug 8991997,该bug影响版本为:11.2.0.1/11.1.0.7,在11.2.0.1.1开始修复