asm disk误设置pvid导致asm diskgroup无法mount恢复

有朋友找到我说他们把以前存储到AIX直连的存储切换为含光纤交换机的存储网络后,RAC无法启动,让我给予支持.通过分析是由于换链路之后开始磁盘顺序不对,维护人员对其asm disk 设置了pvid,导致asm 磁盘组无法正常mount,从而使得含votedisk的dg的asm disk无法正常访问,从而RAC的cssd进程无法启动,同样数据文件的磁盘组也无法mount,通过kfed修复成功,实现数据0丢失.
平台版本信息(2节点RAC)

$ sqlplus -v
SQL*Plus: Release 11.2.0.4.0 Production
$ uname -a
AIX db2 1 7 00F9733E4C00

GI日志报错信息

2014-12-20 16:44:08.769:
[ohasd(6946818)]CRS-2769:Unable to failover resource 'ora.diskmon'.
2014-12-20 16:44:11.775:
[cssd(9502756)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds;
Details at (:CSSNM00070:) in /u01/app/11.2.0/grid/log/db1/cssd/ocssd.log
2014-12-20 16:44:26.791:
[cssd(9502756)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds;
、Details at (:CSSNM00070:) in /u01/app/11.2.0/grid/log/db1/cssd/ocssd.log
2014-12-20 16:44:41.812:
[cssd(9502756)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds;
Details at (:CSSNM00070:) in /u01/app/11.2.0/grid/log/db1/cssd/ocssd.log

从这里可以看出来是由于RAC启动过程中无法获得votedisk使得其无法正常启动,通过分析日志找出来votedisk相关磁盘

2014-12-15 17:36:15.424:
[cssd(10027070)]CRS-1605:CSSD voting file is online: /dev/rhdisk4; details in /u01/app/11.2.0/grid/log/db1/cssd/ocssd.log
2014-12-15 17:36:15.433:
[cssd(10027070)]CRS-1605:CSSD voting file is online: /dev/rhdisk5; details in /u01/app/11.2.0/grid/log/db1/cssd/ocssd.log
2014-12-15 17:36:15.445:
[cssd(10027070)]CRS-1605:CSSD voting file is online: /dev/rhdisk6; details in /u01/app/11.2.0/grid/log/db1/cssd/ocssd.log

从这里可以知道rhdisk4,5,6为votedisk对应磁盘,使用kfed查看磁盘头信息

$kfed read /dev/rhdisk4
kfbh.endian:                        201 ; 0x000: 0xc9
kfbh.hard:                          194 ; 0x001: 0xc2
kfbh.type:                          212 ; 0x002: *** Unknown Enum ***
kfbh.datfmt:                        193 ; 0x003: 0xc1
kfbh.block.blk:                       0 ; 0x004: blk=0
kfbh.block.obj:                       0 ; 0x008: file=0
kfbh.check:                           0 ; 0x00c: 0x00000000
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
1102BEE00 C9C2D4C1 00000000 00000000 00000000  [................]
1102BEE10 00000000 00000000 00000000 00000000  [................]
        Repeat 6 times
1102BEE80 00F9733D 67553E0A 00000000 00000000  [..s=gU>.........]
1102BEE90 00000000 00000000 00000000 00000000  [................]
  Repeat 246 times
KFED-00322: Invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][212]
$kfed read /dev/rhdisk4 blkn=1
kfbh.endian:                          0 ; 0x000: 0x00
kfbh.hard:                          130 ; 0x001: 0x82
kfbh.type:                            2 ; 0x002: KFBTYP_FREESPC
kfbh.datfmt:                          2 ; 0x003: 0x02
kfbh.block.blk:                       1 ; 0x004: blk=1
kfbh.block.obj:              2147483648 ; 0x008: disk=0
kfbh.check:                  3883664132 ; 0x00c: 0xe77c0304
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
kfdfsb.aunum:                         0 ; 0x000: 0x00000000
kfdfsb.max:                         254 ; 0x004: 0x00fe
kfdfsb.cnt:                          23 ; 0x006: 0x0017
kfdfsb.bound:                         0 ; 0x008: 0x0000
kfdfsb.flag:                          1 ; 0x00a: B=1
kfdfsb.ub1spare:                      0 ; 0x00b: 0x00
kfdfsb.spare[0]:                      0 ; 0x00c: 0x00000000
kfdfsb.spare[1]:                      0 ; 0x010: 0x00000000
kfdfsb.spare[2]:                      0 ; 0x014: 0x00000000
kfdfse[0].fse:                      119 ; 0x018: FREE=0x7 FRAG=0x7
kfdfse[1].fse:                       16 ; 0x019: FREE=0x0 FRAG=0x1
…………
$kfed read /dev/rhdisk4 blkn=510
kfbh.endian:                          0 ; 0x000: 0x00
kfbh.hard:                          130 ; 0x001: 0x82
kfbh.type:                            1 ; 0x002: KFBTYP_DISKHEAD
kfbh.datfmt:                          1 ; 0x003: 0x01
kfbh.block.blk:                     254 ; 0x004: blk=254
kfbh.block.obj:              2147483648 ; 0x008: disk=0
kfbh.check:                  3460116983 ; 0x00c: 0xce3d31f7
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
kfdhdb.driver.provstr:         ORCLDISK ; 0x000: length=8
kfdhdb.driver.reserved[0]:            0 ; 0x008: 0x00000000
kfdhdb.driver.reserved[1]:            0 ; 0x00c: 0x00000000
kfdhdb.driver.reserved[2]:            0 ; 0x010: 0x00000000
kfdhdb.driver.reserved[3]:            0 ; 0x014: 0x00000000
kfdhdb.driver.reserved[4]:            0 ; 0x018: 0x00000000
kfdhdb.driver.reserved[5]:            0 ; 0x01c: 0x00000000
kfdhdb.compat:                186646528 ; 0x020: 0x0b200000
kfdhdb.dsknum:                        0 ; 0x024: 0x0000
kfdhdb.grptyp:                        2 ; 0x026: KFDGTP_NORMAL
kfdhdb.hdrsts:                        3 ; 0x027: KFDHDR_MEMBER
kfdhdb.dskname:                CRS_0000 ; 0x028: length=8
kfdhdb.grpname:                     CRS ; 0x048: length=3
kfdhdb.fgname:                 CRS_0000 ; 0x068: length=8
…………

由上述分析可以基本上确定是asm disk header 被破坏,进一步分析破坏原因

[db2/dev#]lspv
hdisk0          00f9733ef7cf27e9                    rootvg          active
hdisk1          00f9733e21b953e6                    rootvg          active
hdisk2          00f9733e21b97a83                    appvg           active
hdisk3          00f9733e21b98434                    appvg           active
hdisk4          00f9733d67553e0a                    None
hdisk5          00f9733d67553f31                    None
hdisk6          00f9733d67554011                    None
hdisk7          00f9733d67554165                    None
hdisk8          00f9733d675541e5                    None
hdisk9          00f9733d675542e4                    None
hdisk10         none                                None
[db2/dev#]ls -l rhdisk*
crw-------    2 root     system       24,  1 Oct 18 11:45 rhdisk0
crw-------    1 root     system       24,  3 Oct 18 13:27 rhdisk1
crw-------    1 root     system       24,  5 Dec 20 20:02 rhdisk10
crw-------    1 root     system       24,  2 Oct 18 13:32 rhdisk2
crw-------    1 root     system       24,  0 Oct 18 13:32 rhdisk3
crw-rw----    1 grid     asmadmin     24,  8 Dec 20 20:02 rhdisk4
crw-rw----    1 grid     asmadmin     24,  9 Dec 20 20:02 rhdisk5
crw-rw----    1 grid     asmadmin     24, 10 Dec 20 20:02 rhdisk6
crw-rw----    1 grid     asmadmin     24,  4 Dec 20 20:02 rhdisk7
crw-rw----    1 grid     asmadmin     24,  6 Dec 20 20:02 rhdisk8
crw-rw----    1 grid     asmadmin     24,  7 Dec 20 20:02 rhdisk9

从这里基本上可以看出来,是由于磁盘头被重写了pvid,导致asm disk header 被破坏.进一步分析asm log,确定哪些磁盘被用作asm disk

SQL> CREATE DISKGROUP CRS NORMAL REDUNDANCY  DISK '/dev/rhdisk4',
'/dev/rhdisk5',
'/dev/rhdisk6' ATTRIBUTE 'compatible.asm'='11.2.0.0.0','au_size'='1M' /* ASMCA */
NOTE: Assigning number (1,0) to disk (/dev/rhdisk4)
NOTE: Assigning number (1,1) to disk (/dev/rhdisk5)
NOTE: Assigning number (1,2) to disk (/dev/rhdisk6)
NOTE: initializing header on grp 1 disk CRS_0000
NOTE: initializing header on grp 1 disk CRS_0001
NOTE: initializing header on grp 1 disk CRS_0002
SQL> CREATE DISKGROUP DATA EXTERNAL REDUNDANCY  DISK
'/dev/rhdisk9' SIZE 614400M  ATTRIBUTE 'compatible.asm'='11.2.0.0.0','au_size'='1M' /* ASMCA */
NOTE: Assigning number (2,0) to disk (/dev/rhdisk9)
NOTE: initializing header on grp 2 disk DATA_0000
SQL> CREATE DISKGROUP FBA EXTERNAL REDUNDANCY  DISK
'/dev/rhdisk8' SIZE 204800M  ATTRIBUTE 'compatible.asm'='11.2.0.0.0','au_size'='1M' /* ASMCA */
NOTE: Assigning number (3,0) to disk (/dev/rhdisk8)
NOTE: initializing header on grp 3 disk FBA_0000
SQL> CREATE DISKGROUP ARCH EXTERNAL REDUNDANCY  DISK
'/dev/rhdisk7' SIZE 102400M  ATTRIBUTE 'compatible.asm'='11.2.0.0.0','au_size'='1M' /* ASMCA */
NOTE: Assigning number (4,0) to disk (/dev/rhdisk7)
NOTE: initializing header on grp 4 disk ARCH_0000

这里可以确定asm disk为rhdisk[4-9],通过kfed分析全部和rhdisk4一样的问题,也符合lspv查询出来的结果,使用kfed repair修复asm disk header后

SQL> alter diskgroup data mount;
Diskgroup altered.
SQL> alter diskgroup fba mount;
Diskgroup altered.
SQL> alter diskgroup arch mount;
Diskgroup altered.
SQL> alter diskgroup crs mount;
Diskgroup altered.
SQL> select group_number,disk_number,path from v$asm_disk;
GROUP_NUMBER DISK_NUMBER PATH
------------ ----------- --------------------------------------------------
           2           0 /dev/rhdisk4
           2           1 /dev/rhdisk5
           2           2 /dev/rhdisk6
           1           0 /dev/rhdisk7
           4           0 /dev/rhdisk8
           3           0 /dev/rhdisk9
6 rows selected.
SQL> select group_number,name from v$asm_diskgroup;
GROUP_NUMBER NAME
------------ ------------------------------------------------------------
           1 ARCH
           2 CRS
           3 DATA
           4 FBA

这里证明通过kfed对磁盘头的修复,asm磁盘组已经全部mount成功,GI状态也恢复正常

[db2/#]crsctl status res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.ARCH.dg
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
ora.CRS.dg
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
ora.DATA.dg
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
ora.FBA.dg
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
ora.LISTENER.lsnr
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
ora.asm
               ONLINE  ONLINE       db1                      Started
               ONLINE  ONLINE       db2                      Started
ora.gsd
               OFFLINE OFFLINE      db1
               OFFLINE OFFLINE      db2
ora.net1.network
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
ora.ons
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
ora.registry.acfs
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       db1
ora.cvu
      1        ONLINE  ONLINE       db1
ora.db1.vip
      1        ONLINE  ONLINE       db1
ora.db2.vip
      1        ONLINE  ONLINE       db2
ora.nkora.db
      1        ONLINE  ONLINE       db1                      Open
      2        ONLINE  ONLINE       db2                      Open
ora.oc4j
      1        ONLINE  ONLINE       db1
ora.scan1.vip
      1        ONLINE  ONLINE       db1

这里忽略了一个问题,在修复磁盘头之前没有清除pvid,导致磁盘头修复后,pvid依然存储在odm中

[db2/dev#]lspv
hdisk0          00f9733ef7cf27e9                    rootvg          active
hdisk1          00f9733e21b953e6                    rootvg          active
hdisk2          00f9733e21b97a83                    appvg           active
hdisk3          00f9733e21b98434                    appvg           active
hdisk4          00f9733d67553e0a                    None
hdisk5          00f9733d67553f31                    None
hdisk6          00f9733d67554011                    None
hdisk7          00f9733d67554165                    None
hdisk8          00f9733d675541e5                    None
hdisk9          00f9733d675542e4                    None
hdisk10         none                                None

通过分析发现fba磁盘组中无任何记录,使用该磁盘组进行直接清除pvid测试

$ sqlplus / as sysasm
SQL*Plus: Release 11.2.0.4.0 Production on Sun Dec 21 03:13:31 2014
Copyright (c) 1982, 2013, Oracle.  All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options
SQL> alter diskgroup fba dismount;
Diskgroup altered.
SQL> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options
$ exit
You have mail in /usr/spool/mail/root
[db2/#]chdev -l hdisk8 -a pv=clear
hdisk8 changed
[db2/#]lspv
hdisk0          00f9733ef7cf27e9                    rootvg          active
hdisk1          00f9733e21b953e6                    rootvg          active
hdisk2          00f9733e21b97a83                    appvg           active
hdisk3          00f9733e21b98434                    appvg           active
hdisk4          00f9733d67553e0a                    None
hdisk5          00f9733d67553f31                    None
hdisk6          00f9733d67554011                    None
hdisk7          00f9733d67554165                    None
hdisk8          none                                None
hdisk9          00f9733d675542e4                    None
hdisk10         none                                None
[db2/#]su - grid
$ sqlplus / as sysasm
SQL*Plus: Release 11.2.0.4.0 Production on Sun Dec 21 03:15:19 2014
Copyright (c) 1982, 2013, Oracle.  All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options
SQL> alter diskgroup fba mount;
Diskgroup altered.
SQL> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options

通过测试直接清除pvid asm 磁盘头依然工作正常,关闭GI,使用chdev清除hdisk[4-9]所有pvid,启动GI一切正常

[db1/#]crsctl status res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.ARCH.dg
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
ora.CRS.dg
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
ora.DATA.dg
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
ora.FBA.dg
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
ora.LISTENER.lsnr
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
ora.asm
               ONLINE  ONLINE       db1                      Started
               ONLINE  ONLINE       db2                      Started
ora.gsd
               OFFLINE OFFLINE      db1
               OFFLINE OFFLINE      db2
ora.net1.network
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
ora.ons
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
ora.registry.acfs
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       db1
ora.cvu
      1        ONLINE  ONLINE       db1
ora.db1.vip
      1        ONLINE  ONLINE       db1
ora.db2.vip
      1        ONLINE  ONLINE       db2
ora.nkora.db
      1        ONLINE  ONLINE       db1                      Open
      2        ONLINE  ONLINE       db2                      Open
ora.oc4j
      1        ONLINE  ONLINE       db1
ora.scan1.vip
      1        ONLINE  ONLINE       db1
[db1/#]lspv
hdisk0          00f9733df7c7a9db                    rootvg          active
hdisk1          00f9733d21dad8fe                    rootvg          active
hdisk2          00f9733d21dbd08b                    appvg           active
hdisk3          00f9733d21dbd2ab                    appvg           active
hdisk4          none                                None
hdisk5          none                                None
hdisk6          none                                None
hdisk7          none                                None
hdisk8          none                                None
hdisk9          none                                None
hdisk10         none                                None

至此设置pvid导致asm disk header损坏的asm 恢复正常,实现数据0丢失。
温馨提示:aix asm disk磁盘中不能设置pvid,否则将会导致asm disk header 损坏,无法正常mount
如果您遇到此类情况,无法解决请联系我们,提供专业ORACLE数据库恢复技术支持
Phone:17813235971    Q Q:107644445QQ咨询惜分飞    E-Mail:dba@xifenfei.com

hosts无效内容未注释导致RAC安装OUI的Network Interface Usage报INS-41112

安装AIX RAC过程中,在OUI的Network Interface Usage报INS-41112,从而使得安装无法继续
ins-41112
2


错误记录摘要

Cause - Installer has detected that network interface en6 does not maintain connectivity on all cluster nodes.
Action - Ensure that the chosen interface has been configured across all cluster nodes.  Additional Information:
Summary of the failed nodes xifenfei01  
- PRVF-4190 : Verification of the hosts config file failed

hosts解析配置

10.70.89.68     xifenfei01
10.70.89.69     xifenfei01-vip
10.70.89.100    xifenfei01-priv
10.70.89.71     xifenfei02
10.70.89.72     xifenfei02-vip
10.70.89.101    xifenfei02-priv
10.70.89.79     xifenfei-scan

网卡配置

xifenfei01:/u01/soft/grid> ifconfig -a
en7: flags=1e084863,c0<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST,GROUPRT,64BIT,CHECKSUM_OFFLOAD(ACTIVE),LARGESEND,CHAIN>
        inet 10.70.89.100 netmask 0xffffffe0 broadcast 10.70.89.127
         tcp_sendspace 131072 tcp_recvspace 65536 rfc1323 0
en6: flags=1e084863,c0<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST,GROUPRT,64BIT,CHECKSUM_OFFLOAD(ACTIVE),LARGESEND,CHAIN>
        inet 10.70.89.68 netmask 0xffffffe0 broadcast 10.70.89.95
         tcp_sendspace 131072 tcp_recvspace 65536 rfc1323 0
lo0: flags=e08084b,c0<UP,BROADCAST,LOOPBACK,RUNNING,SIMPLEX,MULTICAST,GROUPRT,64BIT,LARGESEND,CHAIN>
        inet 127.0.0.1 netmask 0xff000000 broadcast 127.255.255.255
        inet6 ::1%1/0
         tcp_sendspace 131072 tcp_recvspace 131072 rfc1323 1
xifenfei02/#ifconfig -a
en6: flags=1e084863,c0<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST,GROUPRT,64BIT,CHECKSUM_OFFLOAD(ACTIVE),LARGESEND,CHAIN>
        inet 10.70.89.71 netmask 0xffffffe0 broadcast 10.70.89.95
         tcp_sendspace 131072 tcp_recvspace 65536 rfc1323 0
en7: flags=1e084863,c0<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST,GROUPRT,64BIT,CHECKSUM_OFFLOAD(ACTIVE),LARGESEND,CHAIN>
        inet 10.70.89.101 netmask 0xffffffe0 broadcast 10.70.89.127
         tcp_sendspace 131072 tcp_recvspace 65536 rfc1323 0
lo0: flags=e08084b,c0<UP,BROADCAST,LOOPBACK,RUNNING,SIMPLEX,MULTICAST,GROUPRT,64BIT,LARGESEND,CHAIN>
        inet 127.0.0.1 netmask 0xff000000 broadcast 127.255.255.255
        inet6 ::1%1/0
         tcp_sendspace 131072 tcp_recvspace 131072 rfc1323 1
xifenfei01/asmdisks#netstat -in
Name Mtu Network Address Ipkts Ierrs Opkts Oerrs Coll
en7 1500 link#2 0.11.25.bd.b8.7a 107 0 132 2 0
en7 1500 10.70.89.96 10.70.89.100 107 0 132 2 0
en6 1500 link#3 0.11.25.bd.a8.93 50015 0 36963 2 0
en6 1500 10.70.89.64 10.70.89.68 50015 0 36963 2 0
lo0 16896 link#1 1589 0 1588 0 0
lo0 16896 127 127.0.0.1 1589 0 1588 0 0
lo0 16896 ::1%1 1589 0 1588 0 0
xifenfei02/asmdisks#netstat -in
Name Mtu Network Address Ipkts Ierrs Opkts Oerrs Coll
en6 1500 link#2 0.11.25.bd.a8.a9 5401 0 3660 2 0
en6 1500 10.70.89.64 10.70.89.71 5401 0 3660 2 0
en7 1500 link#3 0.11.25.bd.51.d2 129 0 123 2 0
en7 1500 10.70.89.96 10.70.89.101 129 0 123 2 0
lo0 16896 link#1 1249 0 1249 0 0
lo0 16896 127 127.0.0.1 1249 0 1249 0 0
lo0 16896 ::1%1 1249 0 1249 0 0

这里可以看到网卡和IP配置是匹配,但是两台主机网卡显示顺序不一样.查看EtherChannel配置,也是正常的
4
3


网络ping测试

xifenfei01:/u01/soft/grid> ping xifenfei01-priv
PING xifenfei01-priv: (10.70.89.100): 56 data bytes
64 bytes from 10.70.89.100: icmp_seq=0 ttl=255 time=0 ms
64 bytes from 10.70.89.100: icmp_seq=1 ttl=255 time=0 ms
xifenfei02/#ping xifenfei01-priv
PING xifenfei01-priv: (10.70.89.100): 56 data bytes
64 bytes from 10.70.89.100: icmp_seq=0 ttl=255 time=0 ms
64 bytes from 10.70.89.100: icmp_seq=1 ttl=255 time=0 ms

使用runcluvfy.sh检测网络配置

./runcluvfy.sh comp nodecon -i en7 -n xifenfei01-priv,xifenfei02-priv -verbose
xifenfei01:/u01/soft/grid> ./runcluvfy.sh comp nodecon -i en7 -n xifenfei01-priv,xifenfei02-priv -verbose
Verifying node connectivity
Checking node connectivity...
Checking hosts config file...
  Node Name     Status                    Comment
  ------------  ------------------------  ------------------------
  xifenfei02-priv  passed                    successful
  xifenfei01-priv  failed                    Invalid Entry
ERROR:
PRVF-4190 : Verification of the hosts config file failed
Interface information for node "xifenfei02-priv"
 Name   IP Address      Subnet          Gateway         Def. Gateway    HW Address        MTU
 ------ --------------- --------------- --------------- --------------- ----------------- ------
 en6    10.70.89.71     10.70.89.64     10.70.89.71     10.70.89.65     00:11:25:BD:A8:A9 1500
 en7    10.70.89.101    10.70.89.96     10.70.89.101    10.70.89.65     00:11:25:BD:51:D2 1500
Interface information for node "xifenfei01-priv"
 Name   IP Address      Subnet          Gateway         Def. Gateway    HW Address        MTU
 ------ --------------- --------------- --------------- --------------- ----------------- ------
 en7    10.70.89.100    10.70.89.96     10.70.89.100    10.70.89.65     00:11:25:BD:B8:7A 1500
 en6    10.70.89.68     10.70.89.64     10.70.89.68     10.70.89.65     00:11:25:BD:A8:93 1500
Check: Node connectivity for interface "en7"
  Source                          Destination                     Connected?
  ------------------------------  ------------------------------  ----------------
  xifenfei02-priv[10.70.89.101]     xifenfei01-priv[10.70.89.100]     yes
Result: Node connectivity passed for interface "en7"
Check: TCP connectivity of subnet "10.70.89.96"
  Source                          Destination                     Connected?
  ------------------------------  ------------------------------  ----------------
  xifenfei01:10.70.89.68            xifenfei02-priv:10.70.89.101      passed
  xifenfei01:10.70.89.68            xifenfei01-priv:10.70.89.100      passed
Result: TCP connectivity check passed for subnet "10.70.89.96"
Checking subnet mask consistency...
Subnet mask consistency check passed for subnet "10.70.89.64".
Subnet mask consistency check passed for subnet "10.70.89.96".
Subnet mask consistency check passed.
Result: Node connectivity check failed
Verification of node connectivity was unsuccessful.
Checks did not pass for the following node(s):
        xifenfei01-priv

这里显示xifenfei01-priv主机检测失败,报错为PRVF-4190,检查xifenfei01主机的hosts文件发现一处错误记录

xifenfei01/#vi /etc/hosts
"/etc/hosts" 113 lines, 3556 characters
# @(#)47        1.1  src/bos/usr/sbin/netstart/hosts, cmdnet, bos530 7/24/91 10:
00:46
# IBM_PROLOG_BEGIN_TAG
# This is an automatically generated prolog.
#
# bos530 src/bos/usr/sbin/netstart/hosts 1.1
#
# Licensed Materials - Property of IBM
#
# (C) COPYRIGHT International Business Machines Corp. 1985,1989
# All Rights Reserved
#
# US Government Users Restricted Rights - Use, duplication or
# disclosure restricted by GSA ADP Schedule Contract with IBM Corp.

这里发现”00:46″是一个新行,而且是无效记录,除掉该行记录,继续runcluvfy.sh测试

xifenfei01:/u01/soft/grid> ./runcluvfy.sh comp nodecon -i en7 -n xifenfei01-priv,xifenfei02-priv -verbose
Verifying node connectivity
Checking node connectivity...
Checking hosts config file...
  Node Name                             Status
  ------------------------------------  ------------------------
  xifenfei02-priv                         passed
  xifenfei01-priv                         passed
Verification of the hosts config file successful
Interface information for node "xifenfei02-priv"
 Name   IP Address      Subnet          Gateway         Def. Gateway    HW Address        MTU
 ------ --------------- --------------- --------------- --------------- ----------------- ------
 en6    10.70.89.71     10.70.89.64     10.70.89.71     10.70.89.65     00:11:25:BD:A8:A9 1500
 en7    10.70.89.101    10.70.89.96     10.70.89.101    10.70.89.65     00:11:25:BD:51:D2 1500
Interface information for node "xifenfei01-priv"
 Name   IP Address      Subnet          Gateway         Def. Gateway    HW Address        MTU
 ------ --------------- --------------- --------------- --------------- ----------------- ------
 en7    10.70.89.100    10.70.89.96     10.70.89.100    10.70.89.65     00:11:25:BD:B8:7A 1500
 en6    10.70.89.68     10.70.89.64     10.70.89.68     10.70.89.65     00:11:25:BD:A8:93 1500
Check: Node connectivity for interface "en7"
  Source                          Destination                     Connected?
  ------------------------------  ------------------------------  ----------------
  xifenfei02-priv[10.70.89.101]     xifenfei01-priv[10.70.89.100]     yes
Result: Node connectivity passed for interface "en7"
Check: TCP connectivity of subnet "10.70.89.96"
  Source                          Destination                     Connected?
  ------------------------------  ------------------------------  ----------------
  xifenfei01:10.70.89.68            xifenfei02-priv:10.70.89.101      passed
  xifenfei01:10.70.89.68            xifenfei01-priv:10.70.89.100      passed
Result: TCP connectivity check passed for subnet "10.70.89.96"
Checking subnet mask consistency...
Subnet mask consistency check passed for subnet "10.70.89.64".
Subnet mask consistency check passed for subnet "10.70.89.96".
Subnet mask consistency check passed.
Result: Node connectivity check passed
Verification of node connectivity was successful.

除掉无效记录后,runcluvfy检查通过.OUI继续安装一切正常.
果然是由于/etc/hosts中出现无效记录,从而使得RAC安装检查无法通过,再次提醒各位安装RAC需要小心hosts文件
参考文档:PRVF-4190 Verification of the Hosts Config File Failed (Doc ID 1056025.1)
[INS-41112] Specified network interface doesnt maintain connectivity across cluster nodes. (Doc ID 1427202.1)

[MySQL异常恢复]使用工具直接抽取MySQL数据字典

熟悉Oracle的朋友都知道,当Oracle数据库无法正常启动之时,可以通过dul或者其他三方工具直接读取数据文件中数据,从而来抢救数据,减少损失,在mysql中如果使用了innodb引擎也可以在mysql数据库不启动启动下抽取相关记录.本文为抽取数据字典篇章,后续将继续提供drop恢复,truncate 恢复,delete恢复等mysql非常规恢复篇章.
创建一张get_dict测试表

mysql> use xifenfei;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
mysql> show tables;
+--------------------+
| Tables_in_xifenfei |
+--------------------+
| t_delete           |
+--------------------+
1 row in set (0.00 sec)
mysql> create table get_dict(id int not null primary key,name varchar(100));
Query OK, 0 rows affected (0.01 sec)
mysql> insert into get_dict value(1,'www.xifenfei.com');
Query OK, 1 row affected (0.00 sec)
mysql> insert into get_dict value(2,'www.xifenfei.com-xifenfei');
Query OK, 1 row affected (0.00 sec)
mysql> insert into get_dict value(3,'xifenfei-www.xifenfei.com');
Query OK, 1 row affected (0.00 sec)
mysql> show tables;
+--------------------+
| Tables_in_xifenfei |
+--------------------+
| get_dict           |
| t_delete           |
+--------------------+
2 rows in set (0.00 sec
mysql> select TABLE_NAME,TABLE_SCHEMA,TABLE_TYPE from information_schema.tables
    -> where table_name='get_dict';
+------------+--------------+------------+
| TABLE_NAME | TABLE_SCHEMA | TABLE_TYPE |
+------------+--------------+------------+
| get_dict   | xifenfei     | BASE TABLE |
+------------+--------------+------------+
1 row in set (0.01 sec)
mysql> select TABLE_NAME,NON_UNIQUE,TABLE_SCHEMA,INDEX_SCHEMA,INDEX_NAME,COLUMN_NAME from
    -> INFORMATION_SCHEMA.STATISTICS where  TABLE_NAME='get_dict';
+------------+------------+--------------+--------------+------------+-------------+
| TABLE_NAME | NON_UNIQUE | TABLE_SCHEMA | INDEX_SCHEMA | INDEX_NAME | COLUMN_NAME |
+------------+------------+--------------+--------------+------------+-------------+
| get_dict   |          0 | xifenfei     | xifenfei     | PRIMARY    | id          |
+------------+------------+--------------+--------------+------------+-------------+
1 row in set (0.00 sec)
mysql> select TABLE_SCHEMA,TABLE_NAME,COLUMN_NAME,ORDINAL_POSITION from
    -> information_schema.COLUMNS  where table_name='get_dict';
+--------------+------------+-------------+------------------+
| TABLE_SCHEMA | TABLE_NAME | COLUMN_NAME | ORDINAL_POSITION |
+--------------+------------+-------------+------------------+
| xifenfei     | get_dict   | id          |                1 |
| xifenfei     | get_dict   | name        |                2 |
+--------------+------------+-------------+------------------+
2 rows in set (0.01 sec)

关闭mysql数据库

[root@localhost recovery_mysql]# service mysql stop
Shutting down MySQL..[  OK  ]
[root@localhost recovery_mysql]# ps -ef|grep mysql
root     18876 15827  0 18:05 pts/1    00:00:00 grep mysql

使用工具解析innodb文件

[root@localhost recovery_mysql]# ./stream_parser -f /var/lib/mysql/ibdata1
Opening file: /var/lib/mysql/ibdata1
File information:
Opening file: /var/lib/mysql/ibdata1
File information:
ID of device containing file:         2054
inode number:                      1782889
ID of device containing file:         2054
protection:                         100660 inode number:                      1782889
(regular file)
protection:                         100660 number of hard links:                    1
(regular file)
user ID of owner:                      101
number of hard links:                    1
group ID of owner:                     102
user ID of owner:                      101
device ID (if special file):             0
group ID of owner:                     102
blocksize for filesystem I/O:         4096
device ID (if special file):             0
number of blocks allocated:          24616
blocksize for filesystem I/O:         4096
Opening file: /var/lib/mysql/ibdata1
number of blocks allocated:          24616
File information:
………………
user ID of owner:                      101
group ID of owner:                     102
device ID (if special file):             0
blocksize for filesystem I/O:         4096
ID of device containing file:         2054
number of blocks allocated:          24616
inode number:                      1782889
protection:                         100660 (regular file)
number of hard links:                    1
user ID of owner:                      101
group ID of owner:                     102
device ID (if special file):             0
blocksize for filesystem I/O:         4096
number of blocks allocated:          24616
time of last access:            1417922668 Sun Dec  7 11:24:28 2014
time of last modification:      1418294104 Thu Dec 11 18:35:04 2014
time of last status change:     1418294104 Thu Dec 11 18:35:04 2014
time of last access:            1417922668 Sun Dec  7 11:24:28 2014
total size, in bytes:             12582912 (12.000 MiB)
time of last modification:      1418294104 Thu Dec 11 18:35:04 2014
time of last status change:     1418294104 Thu Dec 11 18:35:04 2014
Size to process:                  12582912 (12.000 MiB)
total size, in bytes:             12582912 (12.000 MiB)
Size to process:                  12582912 (12.000 MiB)
All workers finished in 0 sec

主要文件介绍

[root@localhost recovery_mysql]# ls -l  pages-ibdata1/FIL_PAGE_INDEX/
total 1388
-rw-r--r-- 1 root root  16384 Dec 11 18:51 0000000000000001.page
-rw-r--r-- 1 root root  16384 Dec 11 18:51 0000000000000002.page
-rw-r--r-- 1 root root  49152 Dec 11 18:51 0000000000000003.page
-rw-r--r-- 1 root root  49152 Dec 11 18:51 0000000000000004.page
-rw-r--r-- 1 root root  16384 Dec 11 18:51 0000000000000005.page
-rw-r--r-- 1 root root 114688 Dec 11 18:51 0000000000000011.page
-rw-r--r-- 1 root root 114688 Dec 11 18:51 0000000000000012.page
-rw-r--r-- 1 root root 114688 Dec 11 18:51 0000000000000013.page
-rw-r--r-- 1 root root 114688 Dec 11 18:51 0000000000000014.page
-rw-r--r-- 1 root root 114688 Dec 11 18:51 0000000000000015.page
-rw-r--r-- 1 root root 147456 Dec 11 18:51 0000000000000016.page
-rw-r--r-- 1 root root  98304 Dec 11 18:51 0000000000000017.page
-rw-r--r-- 1 root root 114688 Dec 11 18:51 0000000000000018.page
-rw-r--r-- 1 root root  49152 Dec 11 18:51 0000000000000019.page
-rw-r--r-- 1 root root  49152 Dec 11 18:51 0000000000000020.page
-rw-r--r-- 1 root root  49152 Dec 11 18:51 0000000000000021.page
-rw-r--r-- 1 root root  65536 Dec 11 18:51 0000000000000025.page
-rw-r--r-- 1 root root  16384 Dec 11 18:51 18446744069414584320.page

0000000000000001.page主要是记录mysql中表信息文件
0000000000000002.page主要是记录mysql中的表的列的信息文件
0000000000000003.page主要是记录mysql中表的index信息文件

抽取table数据

[root@localhost recovery_mysql]# ./c_parser -4f pages-ibdata1/FIL_PAGE_INDEX/0000000000000001.page -t
dictionary/SYS_TABLES.sql > dumps/default/SYS_TABLES 2> dumps/default/SYS_TABLES.sql
[root@localhost recovery_mysql]# grep get dumps/default/SYS_TABLES | head -5
000000000D1D    95000001510110  SYS_TABLES      "xifenfei/get\_dict"    23      2       1       0       80      ""      9
[root@localhost recovery_mysql]# cat dumps/default/SYS_TABLES.sql
SET FOREIGN_KEY_CHECKS=0;
LOAD DATA LOCAL INFILE '/tmp/recovery_mysql/dumps/default/SYS_TABLES' REPLACE INTO TABLE `SYS_TABLES`
FIELDS TERMINATED BY '\t' OPTIONALLY ENCLOSED BY '"' LINES STARTING BY 'SYS_TABLES\t'
 (`NAME`, `ID`, `N_COLS`, `TYPE`, `MIX_ID`, `MIX_LEN`, `CLUSTER_NAME`, `SPACE`);

抽取column数据

[root@localhost recovery_mysql]#  ./c_parser -4f pages-ibdata1/FIL_PAGE_INDEX/0000000000000002.page
 -t dictionary/SYS_COLUMNS.sql > dumps/default/SYS_COLUMNS 2> dumps/default/SYS_COLUMNS.sql
[root@localhost recovery_mysql]# cat dumps/default/SYS_COLUMNS
-- Page id: 10, Format: REDUNDANT, Records list: Valid, Expected records: (115 115)
000000000300    800000012D0123  SYS_COLUMNS     11      0       "ID"    1       4       0       0
000000000300    800000012D0138  SYS_COLUMNS     11      1       "FOR\_NAME"     1       4       0       0
…………
000000000D1D    95000001510129  SYS_COLUMNS     23      0       "id"    6       1283    4       0
000000000D1D    9500000151013E  SYS_COLUMNS     23      1       "name"  1       524303  100     0
-- Page id: 10, Found records: 115, Lost records: NO, Leaf page: YES
[root@localhost recovery_mysql]# more dumps/default/SYS_COLUMNS.sql
SET FOREIGN_KEY_CHECKS=0;
LOAD DATA LOCAL INFILE '/tmp/recovery_mysql/dumps/default/SYS_COLUMNS' REPLACE INTO TABLE
 `SYS_COLUMNS` FIELDS TERMINATED BY '\t' OPTIONALLY ENCLOSED BY
'"' LINES STARTING BY 'SYS_COLUMNS\t' (`TABLE_ID`, `POS`, `NAME`, `MTYPE`, `PRTYPE`, `LEN`, `PREC`);

抽取index数据

[root@localhost recovery_mysql]# ./c_parser -4f pages-ibdata1/FIL_PAGE_INDEX/0000000000000003.page
 -t dictionary/SYS_INDEXES.sql > dumps/default/SYS_INDEXES 2> dumps/default/SYS_INDEXES.sql
[root@localhost recovery_mysql]# more dumps/default/SYS_INDEXES.sql
SET FOREIGN_KEY_CHECKS=0;
LOAD DATA LOCAL INFILE '/tmp/recovery_mysql/dumps/default/SYS_INDEXES' REPLACE INTO TABLE
 `SYS_INDEXES` FIELDS TERMINATED BY '\t' OPTIONALLY ENCLOSED BY '"' LINES STARTING BY 'SYS_INDEXES\t'
(`TABLE_ID`, `ID`, `NAME`, `N_FIELDS`, `TYPE`, `SPACE`, `PAGE_NO`);
[root@localhost recovery_mysql]# more dumps/default/SYS_INDEXES
-- Page id: 11, Format: REDUNDANT, Records list: Valid, Expected records: (13 13)
000000000300    800000012D0177  SYS_INDEXES     11      11      "ID\_IND"       1       3       0       302
…………
000000000B02    820000013504C8  SYS_INDEXES     20      22      "GEN\_CLUST\_INDEX"     0       1       6       3
000000000D1D    9500000151016B  SYS_INDEXES     23      25      "PRIMARY"       1       3       9       3

启动mysql数据库

[root@localhost recovery_mysql]# service mysql start
Starting MySQL..[  OK  ]
[root@localhost recovery_mysql]# ps -ef|grep mysql
root     18948     1  0 19:57 pts/1    00:00:00 /bin/sh /usr/bin/mysqld_safe --datadir=/var/lib/mysql
--pid-file=/var/lib/mysql/localhost.localdomain.pid
mysql    19049 18948 14 19:57 pts/1    00:00:00 /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql
--plugin-dir=/usr/lib64/mysql/plugin --user=mysql
--log-error=/var/lib/mysql/localhost.localdomain.err --pid-file=/var/lib/mysql/localhost.localdomain.pid
root     19078 15827  0 19:58 pts/1    00:00:00 grep mysql

创建抽取数据字典表

mysql> source dictionary/SYS_TABLES.sql
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.00 sec)
…………
mysql> source dictionary/SYS_INDEXES.sql
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.00 sec)
…………
mysql> source  dictionary/SYS_COLUMNS.sql
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.00 sec)
…………
mysql> show tables;
+----------------+
| Tables_in_test |
+----------------+
| SYS_COLUMNS    |
| SYS_INDEXES    |
| SYS_TABLES     |
+----------------+
3 rows in set (0.00 sec)

加载抽取数据字典数据

mysql> source dumps/default/SYS_TABLES.sql
Query OK, 0 rows affected (0.00 sec)
Query OK, 11 rows affected (0.03 sec)
Records: 11  Deleted: 0  Skipped: 0  Warnings: 0
mysql> source  dumps/default/SYS_INDEXES.sql
Query OK, 0 rows affected (0.00 sec)
Query OK, 39 rows affected (0.01 sec)
Records: 39  Deleted: 0  Skipped: 0  Warnings: 0
mysql> source dumps/default/SYS_COLUMNS.sql
Query OK, 0 rows affected (0.00 sec)
Query OK, 115 rows affected (0.00 sec)
Records: 115  Deleted: 0  Skipped: 0  Warnings: 0

验证抽取数据字典数据

mysql> desc SYS_TABLES
    -> ;
+--------------+---------------------+------+-----+---------+-------+
| Field        | Type                | Null | Key | Default | Extra |
+--------------+---------------------+------+-----+---------+-------+
| NAME         | varchar(255)        | NO   | PRI |         |       |
| ID           | bigint(20) unsigned | NO   |     | 0       |       |
| N_COLS       | int(10)             | YES  |     | NULL    |       |
| TYPE         | int(10) unsigned    | YES  |     | NULL    |       |
| MIX_ID       | bigint(20) unsigned | YES  |     | NULL    |       |
| MIX_LEN      | int(10) unsigned    | YES  |     | NULL    |       |
| CLUSTER_NAME | varchar(255)        | YES  |     | NULL    |       |
| SPACE        | int(10) unsigned    | YES  |     | NULL    |       |
+--------------+---------------------+------+-----+---------+-------+
8 rows in set (0.00 sec)
mysql> SELECT NAME,ID from SYS_TABLES WHERE NAME='xifenfei/get_dict';
+-------------------+----+
| NAME              | ID |
+-------------------+----+
| xifenfei/get_dict | 23 |
+-------------------+----+
1 row in set (0.00 sec)
mysql> desc SYS_COLUMNS
    -> ;
+----------+---------------------+------+-----+---------+-------+
| Field    | Type                | Null | Key | Default | Extra |
+----------+---------------------+------+-----+---------+-------+
| TABLE_ID | bigint(20) unsigned | NO   | PRI | NULL    |       |
| POS      | int(10) unsigned    | NO   | PRI | NULL    |       |
| NAME     | varchar(255)        | YES  |     | NULL    |       |
| MTYPE    | int(10) unsigned    | YES  |     | NULL    |       |
| PRTYPE   | int(10) unsigned    | YES  |     | NULL    |       |
| LEN      | int(10) unsigned    | YES  |     | NULL    |       |
| PREC     | int(10) unsigned    | YES  |     | NULL    |       |
+----------+---------------------+------+-----+---------+-------+
7 rows in set (0.00 sec)
mysql> SELECT TABLE_ID,NAME,MTYPE FROM SYS_COLUMNS WHERE TABLE_ID=23;
+----------+------+-------+
| TABLE_ID | NAME | MTYPE |
+----------+------+-------+
|       23 | id   |     6 |
|       23 | name |     1 |
+----------+------+-------+
2 rows in set (0.01 sec)
mysql> SELECT TABLE_ID,ID,NAME,TYPE FROM SYS_INDEXES WHERE TABLE_ID=23;
+----------+----+---------+------+
| TABLE_ID | ID | NAME    | TYPE |
+----------+----+---------+------+
|       23 | 25 | PRIMARY |    3 |
+----------+----+---------+------+
1 row in set (0.00 sec)

这里基本上可以看出来,在mysql数据库未启动情况下,使用工具可以正常抽取mysql数据字典信息

设置pdb随cdb一起启动

在Oracle 12.1.0.1版本中,在cdb数据库启动过程中,业务pdb无法自动open,如果要实现该功能,需要人工写触发器来实现cdb open后,pdb 给open起来.在12.1.0.2及其以后版本,可以通过设置ALTER PLUGGABLE DATABASE PDB SAVE STATE来实现在cdb open之后业务pdb能够自动open.
数据库启动后pdb未自动open

XFF_CDB$ROOT@SYS> startup
ORACLE 例程已经启动。
Total System Global Area  805306368 bytes
Fixed Size                  3050800 bytes
Variable Size             394265296 bytes
Database Buffers          297795584 bytes
Redo Buffers                5337088 bytes
In-Memory Area            104857600 bytes
数据库装载完毕。
数据库已经打开。
XFF_CDB$ROOT@SYS> select * from v$version;
BANNER                                                                               CON_ID
-------------------------------------------------------------------------------- ----------
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production              0
PL/SQL Release 12.1.0.2.0 - Production                                                    0
CORE    12.1.0.2.0      Production                                                        0
TNS for 64-bit Windows: Version 12.1.0.2.0 - Production                                   0
NLSRTL Version 12.1.0.2.0 - Production                                                    0
XFF_CDB$ROOT@SYS> show pdbs;
    CON_ID CON_NAME                       OPEN MODE  RESTRICTED
---------- ------------------------------ ---------- ----------
         2 PDB$SEED                       READ ONLY  NO
         3 PDB                            MOUNTED

查询dba_pdb_saved_states无记录

XFF_CDB$ROOT@SYS>  select con_name, state from dba_pdb_saved_states;
未选定行

在数据库mount状态下save state

XFF_CDB$ROOT@SYS>  ALTER PLUGGABLE DATABASE PDB save state;
插接式数据库已变更。
XFF_CDB$ROOT@SYS>  select con_name, state from dba_pdb_saved_states;
未选定行

pdb为mount状态下,执行save state无记录,证明save state不成功

在数据库open状态下save state—-设置pdb随cdb启动

XFF_CDB$ROOT@SYS> ALTER PLUGGABLE DATABASE PDB open;
插接式数据库已变更。
XFF_CDB$ROOT@SYS>  ALTER PLUGGABLE DATABASE PDB save state;
插接式数据库已变更。
XFF_CDB$ROOT@SYS> col con_name for a20
XFF_CDB$ROOT@SYS>  select con_name, state from dba_pdb_saved_states;
CON_NAME             STATE
-------------------- --------------
PDB                  OPEN

pdb为open状态下,执save state成功.
需要注意save state需要在pdb open情况下执行才能够生效.

重启数据库测试pdb随cdb启动

XFF_CDB$ROOT@SYS> shutdown immediate
数据库已经关闭。
已经卸载数据库。
ORACLE 例程已经关闭。
XFF_CDB$ROOT@SYS> startup
ORACLE 例程已经启动。
Total System Global Area  805306368 bytes
Fixed Size                  3050800 bytes
Variable Size             364905168 bytes
Database Buffers          327155712 bytes
Redo Buffers                5337088 bytes
In-Memory Area            104857600 bytes
数据库装载完毕。
数据库已经打开。
XFF_CDB$ROOT@SYS> show pdbs;
    CON_ID CON_NAME                       OPEN MODE  RESTRICTED
---------- ------------------------------ ---------- ----------
         2 PDB$SEED                       READ ONLY  NO
         3 PDB                            READ WRITE NO

禁用pdb随cdb启动—DISCARD STATE

XFF_CDB$ROOT@SYS>  ALTER PLUGGABLE DATABASE PDB DISCARD  state;
插接式数据库已变更。
XFF_CDB$ROOT@SYS>  select con_name, state from dba_pdb_saved_states;
未选定行
XFF_CDB$ROOT@SYS> shutdown immediate;
数据库已经关闭。
已经卸载数据库。
ORACLE 例程已经关闭。
XFF_CDB$ROOT@SYS> startup
ORACLE 例程已经启动。
Total System Global Area  805306368 bytes
Fixed Size                  3050800 bytes
Variable Size             364905168 bytes
Database Buffers          327155712 bytes
Redo Buffers                5337088 bytes
In-Memory Area            104857600 bytes
数据库装载完毕。
数据库已经打开。
XFF_CDB$ROOT@SYS> show pdbs;
    CON_ID CON_NAME                       OPEN MODE  RESTRICTED
---------- ------------------------------ ---------- ----------
         2 PDB$SEED                       READ ONLY  NO
         3 PDB                            MOUNTED
XFF_CDB$ROOT@SYS>

12.1.0.1中设置pdb随cdb启动

CREATE TRIGGER open_all_pdbs
   AFTER STARTUP
   ON DATABASE
BEGIN
   EXECUTE IMMEDIATE 'alter pluggable database all open';
END open_all_pdbs;
/

SYSTEM表空间坏块恢复—C_TS#对象坏块恢复(file 1 block 60)

一朋友给我电话,说他们客户公司数据库故障,被另外一家公司恢复了一天不能正常恢复,请求我协助解决.接手一看数据库已经被破坏的不像样子了,根据alert日志信息大概分析了故障原因和上家公司处理情况。后面接手后通过bbed修复block数据库恢复过程,在本次恢复中出现大量ORA-600错误,主要包括ORA-00600 400,ORA-00600 2662,ORA-00600 2663,ORA-00600 krhpfh_03-1209,ORA-00600 3600,ORA-00600 ktsitbs_info1,ORA-00600 4137,ORA-00600 4511,ORA-00600 4198,ORA-00600 6807等
故障原因redo文件丢失

Thu Nov 20 11:28:39 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_lgwr_1404.trc:
ORA-00313: open failed for members of log group 7 of thread 1
ORA-00312: online log 9 thread 1: '/data2/oradata/redo0902.log'
ORA-27037: unable to obtain file status
SVR4 Error: 2: No such file or directory
Additional information: 3
Thu Nov 20 11:28:39 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_lgwr_1404.trc:
ORA-00313: open failed for members of log group 7 of thread 1
ORA-00312: online log 9 thread 1: '/data2/oradata/redo0902.log'
ORA-27037: unable to obtain file status
SVR4 Error: 2: No such file or directory
Additional information: 3
Thu Nov 20 11:28:39 2014
LGWR: terminating instance due to error 313
Thu Nov 20 11:28:39 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_pmon_1394.trc:
ORA-00313: open failed for members of log group  of thread
Thu Nov 20 11:28:39 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_ckpt_1406.trc:
ORA-00313: open failed for members of log group  of thread
Instance terminated by LGWR, pid = 1404

尝试clear redo文件方式恢复

Thu Nov 20 13:04:16 2014
alter database clear logfile group 9
Thu Nov 20 13:04:16 2014
ORA-1624 signalled during: alter database clear logfile group 9...
Thu Nov 20 13:04:45 2014
alter database clear logfile group 9
Thu Nov 20 13:04:46 2014
ORA-1624 signalled during: alter database clear logfile group 9...
Thu Nov 20 13:04:59 2014
alter database clear unarchived logfile group 9
Thu Nov 20 13:04:59 2014
ORA-1624 signalled during: alter database clear unarchived logfile group 9...
Thu Nov 20 13:05:00 2014
alter database clear unarchived logfile group 9
Thu Nov 20 13:05:00 2014
ORA-1624 signalled during: alter database clear unarchived logfile group 9...

不完全恢复resetlogs尝试打开数据库

ORA-279 signalled during: ALTER DATABASE RECOVER  database using backup controlfile  ...
Thu Nov 20 13:49:01 2014
ALTER DATABASE RECOVER    CONTINUE DEFAULT
Thu Nov 20 13:49:02 2014
Media Recovery Log /opt/oracle/flash_recovery_area/xifenfei/archivelog/2014_11_20/o1_mf_1_285999_%u_.arc
Errors with log /opt/oracle/flash_recovery_area/xifenfei/archivelog/2014_11_20/o1_mf_1_285999_%u_.arc
ORA-308 signalled during: ALTER DATABASE RECOVER    CONTINUE DEFAULT  ...
Thu Nov 20 13:49:02 2014
ALTER DATABASE RECOVER    CONTINUE DEFAULT
Thu Nov 20 13:49:02 2014
Media Recovery Log /opt/oracle/flash_recovery_area/xifenfei/archivelog/2014_11_20/o1_mf_1_285999_%u_.arc
Errors with log /opt/oracle/flash_recovery_area/xifenfei/archivelog/2014_11_20/o1_mf_1_285999_%u_.arc
ORA-308 signalled during: ALTER DATABASE RECOVER    CONTINUE DEFAULT  ...
Thu Nov 20 13:49:02 2014
ALTER DATABASE RECOVER CANCEL
Thu Nov 20 13:49:03 2014
Media Recovery Canceled
Completed: ALTER DATABASE RECOVER CANCEL
Thu Nov 20 13:49:33 2014
alter database open resetlogs
Thu Nov 20 13:49:34 2014
ORA-1113 signalled during: alter database open resetlogs...

使用隐含参数

_allow_resetlogs_corruption= TRUE

进行不完全恢复,尝试open数据库报ORA-600 4000错误

Thu Nov 20 14:35:02 2014
ALTER DATABASE   MOUNT
Thu Nov 20 14:35:07 2014
Setting recovery target incarnation to 2
Thu Nov 20 14:35:07 2014
Successful mount of redo thread 1, with mount id 4039504598
Thu Nov 20 14:35:07 2014
Database mounted in Exclusive Mode
Completed: ALTER DATABASE   MOUNT
Thu Nov 20 14:40:33 2014
ALTER DATABASE RECOVER  database until cancel
Thu Nov 20 14:40:33 2014
Media Recovery Start
Thu Nov 20 14:40:33 2014
Media Recovery failed with error 1610
ORA-283 signalled during: ALTER DATABASE RECOVER  database until cancel  ...
Thu Nov 20 14:41:23 2014
ALTER DATABASE RECOVER  database using backup controlfile until cancel
Thu Nov 20 14:43:08 2014
alter database open resetlogs
Thu Nov 20 14:43:08 2014
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
RESETLOGS after incomplete recovery UNTIL CHANGE 31293973571
Resetting resetlogs activation ID 3855216310 (0xe5c9eeb6)
Online log /data2/oradata/redo0802.log: Thread 1 Group 8 was previously cleared
Online log /data2/oradata/redo0902.log: Thread 1 Group 9 was previously cleared
Thu Nov 20 14:43:14 2014
Setting recovery target incarnation to 3
Thu Nov 20 14:43:14 2014
Assigning activation ID 4039504598 (0xf0c5f2d6)
Thread 1 opened at log sequence 1
  Current log# 9 seq# 1 mem# 0: /data2/oradata/redo0902.log
Successful open of redo thread 1
Thu Nov 20 14:43:14 2014
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Thu Nov 20 14:43:14 2014
SMON: enabling cache recovery
Thu Nov 20 14:43:14 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_1844.trc:
ORA-00600: internal error code, arguments: [4000], [17], [], [], [], [], [], []
Thu Nov 20 14:43:16 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_1844.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00600: internal error code, arguments: [4000], [17], [], [], [], [], [], []
Thu Nov 20 14:43:16 2014
Error 704 happened during db open, shutting down database
USER: terminating instance due to error 704
Instance terminated by USER, pid = 1844
ORA-1092 signalled during: alter database open resetlogs...

尝试隐含屏蔽回滚段

_corrupted_rollback_segments= _SYSSMU1$, _SYSSMU2$,…………

错误依旧ORA-600 4000

Thu Nov 20 15:09:21 2014
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Thu Nov 20 15:09:21 2014
SMON: enabling cache recovery
Thu Nov 20 15:09:21 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_624.trc:
ORA-00600: internal error code, arguments: [4000], [17], [], [], [], [], [], []
Thu Nov 20 15:09:23 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_624.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00600: internal error code, arguments: [4000], [17], [], [], [], [], [], []
Thu Nov 20 15:09:23 2014
Error 704 happened during db open, shutting down database
USER: terminating instance due to error 704
Instance terminated by USER, pid = 624
ORA-1092 signalled during: alter database open

多次重启,resetlogs后,数据库出现ORA-600 2662错误

Successful open of redo thread 1
Thu Nov 20 17:13:24 2014
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Thu Nov 20 17:13:24 2014
SMON: enabling cache recovery
Thu Nov 20 17:13:24 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_7967.trc:
ORA-00600: internal error code, arguments: [2662], [7], [1229382552], [7], [1229560642], [8388633], [], []
Thu Nov 20 17:13:25 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_7967.trc:
ORA-00600: internal error code, arguments: [2662], [7], [1229382552], [7], [1229560642], [8388633], [], []
Thu Nov 20 17:13:25 2014
Error 600 happened during db open, shutting down database
USER: terminating instance due to error 600
Instance terminated by USER, pid = 7967
ORA-1092 signalled during: ALTER DATABASE OPEN...
Thu Nov 20 17:18:23 2014
USER: terminating instance due to error 1092
Instance terminated by USER, pid = 7967

offline undo相关文件,尝试打开数据库

Database mounted in Exclusive Mode
Completed: ALTER DATABASE   MOUNT
Thu Nov 20 17:52:31 2014
ALTER DATABASE RECOVER  database until cancel
Thu Nov 20 17:52:31 2014
Media Recovery Start
 parallel recovery started with 15 processes
ORA-279 signalled during: ALTER DATABASE RECOVER  database until cancel  ...
Thu Nov 20 17:53:42 2014
ALTER DATABASE RECOVER CANCEL
Thu Nov 20 17:53:44 2014
ORA-1547 signalled during: ALTER DATABASE RECOVER CANCEL ...
Thu Nov 20 17:56:34 2014
alter database datafile '/opt/oracle/oradata/xifenfei/undotbs01.dbf' offline
Thu Nov 20 17:56:35 2014
Completed: alter database datafile '/opt/oracle/oradata/xifenfei/undotbs01.dbf' offline
Thu Nov 20 17:57:01 2014
alter database datafile '/data2/oradata/undotbs02.dbf' offline
Thu Nov 20 17:57:02 2014
Completed: alter database datafile '/data2/oradata/undotbs02.dbf' offline
Thu Nov 20 17:57:26 2014
alter database datafile '/data2/oradata/undotbs03.dbf' offline
Thu Nov 20 17:57:27 2014
Completed: alter database datafile '/data2/oradata/undotbs03.dbf' offline
Thu Nov 20 17:57:43 2014
alter database open resetlogs
Thu Nov 20 17:57:43 2014
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
ORA-1245 signalled during: alter database open resetlogs...
Thu Nov 20 17:58:58 2014
alter database datafile '/opt/oracle/oradata/xifenfei/undotbs01.dbf' offline drop
Thu Nov 20 17:58:58 2014
Completed: alter database datafile '/opt/oracle/oradata/xifenfei/undotbs01.dbf' offline drop
Thu Nov 20 17:59:15 2014
alter database open resetlogs
Thu Nov 20 17:59:15 2014
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
ORA-1245 signalled during: alter database open resetlogs...
Thu Nov 20 17:59:35 2014
alter database datafile '/data2/oradata/undotbs02.dbf' offline drop
Thu Nov 20 17:59:35 2014
Completed: alter database datafile '/data2/oradata/undotbs02.dbf' offline drop
Thu Nov 20 17:59:50 2014
alter database datafile '/data2/oradata/undotbs03.dbf' offline drop
Thu Nov 20 17:59:50 2014
Completed: alter database datafile '/data2/oradata/undotbs03.dbf' offline drop
Thu Nov 20 18:00:07 2014
alter database open resetlogs
Thu Nov 20 18:00:07 2014
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
RESETLOGS after incomplete recovery UNTIL CHANGE 31294173628
Resetting resetlogs activation ID 4039492628 (0xf0c5c414)
Online log /data2/oradata/redo0802.log: Thread 1 Group 8 was previously cleared
Thu Nov 20 18:00:14 2014
Setting recovery target incarnation to 8
Thu Nov 20 18:00:14 2014
Assigning activation ID 4039504142 (0xf0c5f10e)
Thread 1 opened at log sequence 1
  Current log# 9 seq# 1 mem# 0: /data2/oradata/redo0902.log
Successful open of redo thread 1
Thu Nov 20 18:00:15 2014
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Thu Nov 20 18:00:15 2014
SMON: enabling cache recovery
Thu Nov 20 18:00:15 2014
Successfully onlined Undo Tablespace 1.
Dictionary check beginning
File #2 is offline, but is part of an online tablespace.
data file 2: '/opt/oracle/oradata/xifenfei/undotbs01.dbf'
File #100 is offline, but is part of an online tablespace.
data file 100: '/data2/oradata/undotbs02.dbf'
Thu Nov 20 18:00:28 2014
File #185 is offline, but is part of an online tablespace.
data file 185: '/data2/oradata/undotbs03.dbf'
Dictionary check complete
Thu Nov 20 18:00:35 2014
SMON: enabling tx recovery
Thu Nov 20 18:00:36 2014
Database Characterset is ZHS16CGB231280
Thu Nov 20 18:00:37 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_28472.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-00376: file 185 cannot be read at this time
ORA-01110: data file 185: '/data2/oradata/undotbs03.dbf'
Error 604 happened during db open, shutting down database
USER: terminating instance due to error 604
Thu Nov 20 18:00:37 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_lgwr_28450.trc:
ORA-00604: error occurred at recursive SQL level
Thu Nov 20 18:00:37 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_dbw0_28446.trc:
ORA-00604: error occurred at recursive SQL level
Instance terminated by USER, pid = 28472
ORA-1092 signalled during: alter database open resetlogs...

不知道做了什么操作出现file 1 block 60坏块,很可能bbed修改错误导致

Thu Nov 20 19:18:15 2014
SMON: enabling cache recovery
Thu Nov 20 19:18:16 2014
Hex dump of (file 1, block 60) in trace file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_13232.trc
Corrupt block relative dba: 0x0040003c (file 1, block 60)
Bad header found during buffer read
Data in bad block:
 type: 128 format: 0 rdba: 0x0040003c
 last change scn: 0x0005.ebe04bc9 seq: 0x2 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x4bc90602
 check value in block header: 0x6faa
 computed block checksum: 0x0
Reread of rdba: 0x0040003c (file 1, block 60) found same corrupted data
Successfully onlined Undo Tablespace 1.
Thu Nov 20 19:18:16 2014
SMON: enabling tx recovery
Thu Nov 20 19:18:17 2014
Database Characterset is ZHS16CGB231280
Thu Nov 20 19:18:17 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_13232.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-00376: file 185 cannot be read at this time
ORA-01110: data file 185: '/data2/oradata/undotbs03.dbf'
Error 604 happened during db open, shutting down database
USER: terminating instance due to error 604
Instance terminated by USER, pid = 13232
ORA-1092 signalled during: alter database open...

尝试不完全恢复,并resetlogs操作

ALTER DATABASE RECOVER  database until cancel
Thu Nov 20 19:33:41 2014
Media Recovery Start
Datafile 2 is on orphaned branch
          File status = 4
        Abs fuzzy SCN = 0
 Hot backup fuzzy SCN = 0
Thu Nov 20 19:33:41 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_20878.trc:
ORA-00600: internal error code, arguments: [krhpfh_03-1209], [2], [864151207], [864153315], [1229402557], [7], [0], [0]
ORA-01110: data file 2: '/opt/oracle/oradata/xifenfei/undotbs01.dbf'
Thu Nov 20 19:33:42 2014
Media Recovery failed with error 600
ORA-283 signalled during: ALTER DATABASE RECOVER  database until cancel  ...
Thu Nov 20 19:34:06 2014
alter database open resetlogs
Thu Nov 20 19:34:06 2014
ORA-1139 signalled during: alter database open resetlogs...
Thu Nov 20 19:34:17 2014
alter database open
Thu Nov 20 19:34:17 2014
ORA-1190 signalled during: alter database open...
Thu Nov 20 19:35:57 2014
ALTER DATABASE RECOVER  database until cancel
Thu Nov 20 19:35:57 2014
Media Recovery Start
Datafile 2 is on orphaned branch
          File status = 4
        Abs fuzzy SCN = 0
 Hot backup fuzzy SCN = 0
Thu Nov 20 19:35:58 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_20878.trc:
ORA-00600: internal error code, arguments: [krhpfh_03-1209], [2], [864151207], [864153315], [1229402557], [7], [0], [0]
ORA-01110: data file 2: '/opt/oracle/oradata/xifenfei/undotbs01.dbf'
Thu Nov 20 19:35:59 2014
Media Recovery failed with error 600
ORA-283 signalled during: ALTER DATABASE RECOVER  database until cancel  ...
Thu Nov 20 19:37:19 2014
alter database open resetlogs
Thu Nov 20 19:37:19 2014
ORA-1139 signalled during: alter database open resetlogs...

继续打开报 ORA-600 3600错误

Thu Nov 20 19:43:14 2014
alter database datafile '/opt/oracle/oradata/xifenfei/undotbs01.dbf' offline drop
Thu Nov 20 19:43:14 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_dbw0_20856.trc:
ORA-00600: internal error code, arguments: [3600], [2], [14], [], [], [], [], []
Thu Nov 20 19:43:15 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_dbw0_20856.trc:
ORA-00600: internal error code, arguments: [3600], [2], [14], [], [], [], [], []
Thu Nov 20 19:43:15 2014
DBW0: terminating instance due to error 471
Instance terminated by DBW0, pid = 20856

<strong>中间多次重启和resetlogs,还出现ORA-600 2663错误</strong>

Fri Nov 21 12:35:12 2014
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Fri Nov 21 12:35:12 2014
SMON: enabling cache recovery
Fri Nov 21 12:35:13 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_15596.trc:
ORA-00600: internal error code, arguments: [2663], [7], [1229543007], [7], [1229560642], [], [], []
Fri Nov 21 12:35:14 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_15596.trc:
ORA-00600: internal error code, arguments: [2663], [7], [1229543007], [7], [1229560642], [], [], []
Fri Nov 21 12:35:14 2014
Error 600 happened during db open, shutting down database
USER: terminating instance due to error 600
Fri Nov 21 12:35:14 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_mman_15572.trc:
ORA-00600: internal error code, arguments: [], [], [], [], [], [], [], []
Fri Nov 21 12:35:14 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_dbw1_15576.trc:
ORA-00600: internal error code, arguments: [], [], [], [], [], [], [], []
Instance terminated by USER, pid = 15596
ORA-1092 signalled during: ALTER DATABASE OPEN..

继续尝试打开数据库出现ORA-600 ktsitbs_info1错误

SMON: enabling cache recovery
Fri Nov 21 13:54:25 2014
Hex dump of (file 1, block 60) in trace file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_21111.trc
Corrupt block relative dba: 0x0040003c (file 1, block 60)
Bad header found during buffer read
Data in bad block:
 type: 128 format: 0 rdba: 0x0040003c
 last change scn: 0x0005.ebe04bc9 seq: 0x2 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x4bc90602
 check value in block header: 0x6faa
 computed block checksum: 0x0
Reread of rdba: 0x0040003c (file 1, block 60) found same corrupted data
Fri Nov 21 13:54:25 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_21111.trc:
ORA-00600: internal error code, arguments: [ktsitbs_info1], [2], [], [], [], [], [], []
Fri Nov 21 13:54:27 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_21111.trc:
ORA-00600: internal error code, arguments: [ktsitbs_info1], [2], [], [], [], [], [], []
Error 600 happened during db open, shutting down database
USER: terminating instance due to error 600
Instance terminated by USER, pid = 21111
ORA-1092 signalled during: alter database open...

以上是客户数据库故障原因和问题大概的处理过程,下面是我接手后的处理过程


dbv 检查system01.dbf文件,得到结果

HNDX-DB% dbv file=/opt/oracle/oradata/xifenfei/system01.dbf
DBVERIFY: Release 10.2.0.1.0 - Production on Fri Nov 21 16:22:37 2014
Copyright (c) 1982, 2005, Oracle.  All rights reserved.
DBVERIFY - Verification starting : FILE = /opt/oracle/oradata/xifenfei/system01.dbf
Page 60 is marked corrupt
Corrupt block relative dba: 0x0040003c (file 1, block 60)
Bad header found during dbv:
Data in bad block:
 type: 128 format: 0 rdba: 0x0040003c
 last change scn: 0x0005.ebe04bc9 seq: 0x2 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x4bc90602
 check value in block header: 0x6faa
 computed block checksum: 0x0
Corrupt block relative dba: 0x004001f2 (file 1, block 498)
Bad check value found during buffer read
Data in bad block:
 type: 6 format: 2 rdba: 0x004001f2
 last change scn: 0x0007.49499ca1 seq: 0x1 flg: 0x06
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x9ca10601
 check value in block header: 0xe458
 computed block checksum: 0x9720
DBVERIFY - Verification complete
Total Pages Examined         : 786432
Total Pages Processed (Data) : 201131
Total Pages Failing   (Data) : 2
Total Pages Processed (Index): 221394
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 60265
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 303641
Total Pages Marked Corrupt   : 2
Total Pages Influx           : 0
Highest block SCN            : 1229823477 (7.1229823477)

这里知道数据库有两个坏块,而且根据对于bootstrap$的经验,可以大概确定60坏块很可能是C_TS#,第一反应type异常,498可能是seq$

对数据库启动过程做10046,得到trace文件

PARSING IN CURSOR #1 len=275 dep=2 uid=0 oct=3 lid=0 tim=27978051403575 hv=3408408745 ad='7df93cd0'
select name,online$,contents$,undofile#,undoblock#,blocksize,dflmaxext,dflinit,dflincr,dflextpct,dflminext,
dflminlen, owner#,scnwrp,scnbas, NVL(pitrscnwrp, 0), NVL(pitrscnbas, 0), dflogging, bitmapped, inc#, flags,
plugged, NVL(spare1,0), NVL(spare2,0) from ts$ where ts#=:1
END OF STMT
PARSE #1:c=0,e=92,p=0,cr=0,cu=0,mis=0,r=0,dep=2,og=4,tim=27978051403569
BINDS #1:
kkscoacd
 Bind#0
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=ffffffff7dbac9a8  bln=22  avl=02  flg=05
  value=2
EXEC #1:c=0,e=310,p=0,cr=0,cu=0,mis=0,r=0,dep=2,og=4,tim=27978051404296
WAIT #1: nam='db file sequential read' ela= 42 file#=1 block#=60 blocks=1 obj#=-1 tim=27978051404449
Hex dump of (file 1, block 60)
Corrupt block relative dba: 0x0040003c (file 1, block 60)
Bad header found during buffer read
Data in bad block:
 type: 128 format: 0 rdba: 0x0040003c
 last change scn: 0x0005.ebe04bc9 seq: 0x2 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x4bc90602
 check value in block header: 0x6faa
 computed block checksum: 0x0
Reread of rdba: 0x0040003c (file 1, block 60) found same corrupted data
FETCH #1:c=10000,e=4072,p=1,cr=2,cu=0,mis=0,r=0,dep=2,og=4,tim=27978051408438
STAT #1 id=1 cnt=0 pid=0 pos=1 obj=16 op='TABLE ACCESS CLUSTER TS$ (cr=2 pr=1 pw=0 time=4075 us)'
STAT #1 id=2 cnt=1 pid=1 pos=1 obj=7 op='INDEX UNIQUE SCAN I_TS# (cr=1 pr=0 pw=0 time=13 us)'
*** 2014-11-22 14:44:43.235
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [ktsitbs_info1], [2], [], [], [], [], [], []
Current SQL statement for this session:
select max(maxconcurrency) from sys.wrh$_undostat  where instance_number = :1 and dbid = :2
and snap_id in   (select snap_id from dba_hist_snapshot where end_interval_time >
(select max(end_interval_time)-7 from dba_hist_snapshot))

这里显示了数据库启动报ORA-00600[ktsitbs_info1],[2],明显的表示了b中的2是表示表空间号,由于ts$坏块,无法读取ts$中表空间信息,从而出现数据字典不一致,从而出现该错误。所以恢复该库的关键是修复file 1 block 60.

bbed尝试修复file 1 block 60

HNDX-DB% bbed password=blockedit mode=edit
BBED: Release 2.0.0.0.0 - Limited Production on Sat Nov 22 15:16:26 2014
Copyright (c) 1982, 2005, Oracle.  All rights reserved.
************* !!! For Oracle Internal Use only !!! ***************
BBED> set filename '/opt/oracle/oradata/xifenfei/system01.dbf'
        FILENAME        /opt/oracle/oradata/xifenfei/system01.dbf
BBED> set block 8192
        BLOCK#          8192
BBED> set block 60
        BLOCK#          60
BBED> set count 64
        COUNT           64
BBED> map
 File: /opt/oracle/oradata/xifenfei/system01.dbf (0)
 Block: 60                                    Dba:0x00000000
------------------------------------------------------------
BBED-00400: invalid blocktype (128)
BBED> set block 61
        BLOCK#          61
BBED> map
 File: /opt/oracle/oradata/xifenfei/system01.dbf (0)
 Block: 61                                    Dba:0x00000000
------------------------------------------------------------
 KTB Data Block (Table/Cluster)
 struct kcbh, 20 bytes                      @0
 struct ktbbh, 72 bytes                     @20
 struct kdbh, 14 bytes                      @92
 struct kdbt[3], 12 bytes                   @106
 sb2 kdbr[2]                                @118
 ub1 freespace[7959]                        @122
 ub1 rowdata[107]                           @8081
 ub4 tailchk                                @8188
BBED> p kcbh
struct kcbh, 20 bytes                       @0
   ub1 type_kcbh                            @0        0x06
   ub1 frmt_kcbh                            @1        0xa2
   ub1 spare1_kcbh                          @2        0x00
   ub1 spare2_kcbh                          @3        0x00
   ub4 rdba_kcbh                            @4        0x0040003d
   ub4 bas_kcbh                             @8        0x0000235b
   ub2 wrp_kcbh                             @12       0x0000
   ub1 seq_kcbh                             @14       0x01
   ub1 flg_kcbh                             @15       0x04 (KCBHFCKV)
   ub2 chkval_kcbh                          @16       0x7a85
   ub2 spare3_kcbh                          @18       0x0000
BBED> set block 60
        BLOCK#          60
BBED> d
 File: /opt/oracle/oradata/xifenfei/system01.dbf (0)
 Block: 60               Offsets:    0 to   63           Dba:0x00000000
------------------------------------------------------------------------
 80000000 0040003c ebe04bc9 00050204 6faa0000 01000000 00000006 29b3a204
 00040ca0 00020200 00000000 000a0000 00000002 0080009b 00000100 80000000
 <32 bytes per line>
BBED> d block 61
 File: /opt/oracle/oradata/xifenfei/system01.dbf (0)
 Block: 61               Offsets:    0 to   63           Dba:0x00000000
------------------------------------------------------------------------
 06a20000 0040003d 0000235b 00000104 7a850000 01000000 00000006 00001837
 00001738 00020200 00000000 0007002e 00000002 00800075 00012300 80000000
 <32 bytes per line>
BBED> set block 60
        BLOCK#          60
BBED> m /x 06a2
 File: /opt/oracle/oradata/xifenfei/system01.dbf (0)
 Block: 60               Offsets:    0 to   63           Dba:0x00000000
------------------------------------------------------------------------
 06a20000 0040003c ebe04bc9 00050204 6faa0000 01000000 00000006 29b3a204
 00040ca0 00020200 00000000 000a0000 00000002 0080009b 00000100 80000000
 <32 bytes per line>
BBED> map
 File: /opt/oracle/oradata/xifenfei/system01.dbf (0)
 Block: 60                                    Dba:0x00000000
------------------------------------------------------------
 KTB Data Block (Table/Cluster)
 struct kcbh, 20 bytes                      @0
 struct ktbbh, 72 bytes                     @20
 struct kdbh, 14 bytes                      @92
 struct kdbt[3], 12 bytes                   @106
 sb2 kdbr[2]                                @118
 ub1 freespace[7598]                        @122
 ub1 rowdata[468]                           @7720
 ub4 tailchk                                @8188
BBED> sum apply
Check value for File 0, Block 60:
current = 0xe908, required = 0xe908
BBED> verify
DBVERIFY - Verification starting
FILE = /opt/oracle/oradata/xifenfei/system01.dbf
BLOCK = 60
DBVERIFY - Verification complete
Total Blocks Examined         : 1
Total Blocks Processed (Data) : 1
Total Blocks Failing   (Data) : 0
Total Blocks Processed (Index): 0
Total Blocks Failing   (Index): 0
Total Blocks Empty            : 0
Total Blocks Marked Corrupt   : 0
Total Blocks Influx           : 0
BBED>

尝试启动数据库

Sat Nov 22 15:51:33 2014
alter database open
Sat Nov 22 15:51:34 2014
Thread 1 opened at log sequence 7
  Current log# 8 seq# 7 mem# 0: /data2/oradata/redo0802.log
Successful open of redo thread 1
Sat Nov 22 15:51:34 2014
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Sat Nov 22 15:51:34 2014
SMON: enabling cache recovery
SMON: enabling tx recovery
Sat Nov 22 15:51:34 2014
Database Characterset is ZHS16CGB231280
Hex dump of (file 1, block 498) in trace file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_2818.trc
Corrupt block relative dba: 0x004001f2 (file 1, block 498)
Bad check value found during buffer read
Data in bad block:
 type: 6 format: 2 rdba: 0x004001f2
 last change scn: 0x0007.49499ca1 seq: 0x1 flg: 0x06
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x9ca10601
 check value in block header: 0xe458
 computed block checksum: 0x9720
Reread of rdba: 0x004001f2 (file 1, block 498) found same corrupted data
Sat Nov 22 15:51:35 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_smon_2803.trc:
ORA-00600: internal error code, arguments: [4000], [12], [], [], [], [], [], []
replication_dependency_tracking turned off (no async multimaster replication found)
Starting background process QMNC
QMNC started with pid=18, OS id=3000
Sat Nov 22 15:51:36 2014
Completed: alter database open
Sat Nov 22 15:51:36 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_3010.trc:
ORA-00600: internal error code, arguments: [6807], [AUDSES$], [144], [], [], [], [], []
Sat Nov 22 15:51:37 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_mmon_2809.trc:
ORA-00600: internal error code, arguments: [6807], [WRI$_ALERT_SEQUENCE], [8783], [], [], [], [], []
Sat Nov 22 15:51:37 2014
Non-fatal internal error happenned while SMON was doing non-existent object cleanup.
SMON encountered 1 out of maximum 100 non-fatal internal errors.
Sat Nov 22 15:51:38 2014
ORA-600 encountered when generating server alert SMG-3000
Sat Nov 22 15:51:38 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_mmon_2809.trc:
ORA-00600: internal error code, arguments: [ktcpoptx_0], [0x772705E60], [], [], [], [], [], []

只要出现ORA-600 4000和ORA-600 6807错误,其中ORA-600 6807错误比较明显是由于seq$坏块,导致AUDSES$ seq异常导致。ORA-600 4000应该是回滚段异常,继续分析回滚段

SQL> select name,ts#,status$ from undo$;
NAME                                  TS#    STATUS$
------------------------------ ---------- ----------
SYSTEM                                  0          2
_SYSSMU1$                               1          2
_SYSSMU2$                               1          2
_SYSSMU3$                               1          2
…………
_SYSSMU168$                             1          2
_SYSSMU169$                             1          2

这里很异常,system回滚段在数据库open之后,按照常理不可能处于STATUS$=2(OFFLINE)状态。而且其他回滚段全部为OFFLINE状态也属于异常情况.而且尝试drop undo报ORA-01561,另外在dba_rollback_segs中无SYSTEM(查询结果忘记保存)

SQL> drop tablespace undotbs1 including contents;
drop tablespace undotbs1 including contents
*
ERROR at line 1:
ORA-01561: failed to remove all objects in the tablespace specified

通过这一系列很怀疑是由于bbed 修改了undo$等相关基表信息导致现在system中的undo信息混乱.信息反馈给客户后,客户想起来昨天给他们恢复的公司在bbed操作前备份了system01.dbf.突然感觉救星来了.实在怕不懂bbed的人折腾bbed

dbv检测备份文件

DBVERIFY - Verification starting : FILE = /data3/backup/system01.dbf_bak
Page 60 is marked corrupt
Corrupt block relative dba: 0x0040003c (file 1, block 60)
Bad header found during dbv:
Data in bad block:
 type: 128 format: 0 rdba: 0x0040003c
 last change scn: 0x0005.ebe04bc9 seq: 0x2 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x4bc90602
 check value in block header: 0x6faa
 computed block checksum: 0x0
Block Checking: DBA = 4194802, Block Type = KTB-managed data block
data header at 0x1002ef05c
kdbchk: row locked by non-existent transaction
        table=0   slot=4
        lockid=1   ktbbhitc=2
Page 498 failed with check code 6101
DBVERIFY - Verification complete
Total Pages Examined         : 786432
Total Pages Processed (Data) : 201131
Total Pages Failing   (Data) : 1
Total Pages Processed (Index): 221394
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 60265
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 303641
Total Pages Marked Corrupt   : 1
Total Pages Influx           : 0
Highest block SCN            : 1229823477 (7.1229823477)

好家伙只有一个物理坏块和一个逻辑坏块,而对于物理坏块block 60已经知道如何修复,逻辑坏块可以尝试设置隐含参数跳过去,bbed修改相关block(同上步骤)

再次启动数据库

dd if=/opt/oracle/oradata/xifenfei/system01.dbf bs=8192 count=2 of=/tmp/system01.2
dd if=/tmp/system01.2 of=/data3/backup/system01.dbf_bak bs=8192 count=2 conv=notrunc
Sat Nov 22 17:52:50 2014
Database mounted in Exclusive Mode
Completed: ALTER DATABASE   MOUNT
Sat Nov 22 17:53:38 2014
alter database rename file '/opt/oracle/oradata/xifenfei/system01.dbf' to '/data3/backup/system01.dbf_bak'
Sat Nov 22 17:53:39 2014
Completed: alter database rename file '/opt/oracle/oradata/xifenfei/system01.dbf' to '/data3/backup/system01.dbf_bak'
Sat Nov 22 17:55:43 2014
alter database open
Sat Nov 22 17:55:48 2014
LGWR: STARTING ARCH PROCESSES
ARC0 started with pid=18, OS id=15858
Sat Nov 22 17:56:10 2014
ARC0: Archival started
ARC1: Archival started
LGWR: STARTING ARCH PROCESSES COMPLETE
ARC1 started with pid=17, OS id=15879
Sat Nov 22 17:56:19 2014
Thread 1 opened at log sequence 7
  Current log# 8 seq# 7 mem# 0: /data2/oradata/redo0802.log
Successful open of redo thread 1
Sat Nov 22 17:56:19 2014
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Sat Nov 22 17:56:19 2014
SMON: enabling cache recovery
SMON: enabling tx recovery
Sat Nov 22 17:56:20 2014
ARC1: STARTING ARCH PROCESSES
Sat Nov 22 17:56:20 2014
ARC0: Becoming the 'no FAL' ARCH
ARC0: Becoming the 'no SRL' ARCH
Sat Nov 22 17:56:22 2014
Database Characterset is ZHS16CGB231280
replication_dependency_tracking turned off (no async multimaster replication found)
Starting background process QMNC
Sat Nov 22 17:56:33 2014
ARC2: Archival started
ARC1: STARTING ARCH PROCESSES COMPLETE
ARC1: Becoming the heartbeat ARCH
ARC2 started with pid=23, OS id=15928
QMNC started with pid=25, OS id=15996
Sat Nov 22 17:57:11 2014
Completed: alter database open
Sat Nov 22 17:57:18 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_16010.trc:
ORA-00600: internal error code, arguments: [4511], [], [], [], [], [], [], []
Sat Nov 22 17:57:26 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_16012.trc:
ORA-00600: internal error code, arguments: [4511], [], [], [], [], [], [], []
Sat Nov 22 17:58:17 2014
Starting background process EMN0
Sat Nov 22 18:00:03 2014
Shutting down instance: further logons disabled
EMN0 started with pid=71, OS id=16421
Sat Nov 22 18:00:12 2014
SMON: Restarting fast_start parallel rollback
Sat Nov 22 18:00:23 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_p000_15951.trc:
ORA-00600: internal error code, arguments: [4198], [9], [], [], [], [], [], []
Sat Nov 22 18:00:24 2014
Stopping background process CJQ0
Sat Nov 22 18:00:24 2014
Stopping background process QMNC
Sat Nov 22 18:00:27 2014
Doing block recovery for file 2 block 41
Block recovery from logseq 7, block 180883 to scn 214748389244
Sat Nov 22 18:00:27 2014
Recovery of Online Redo Log: Thread 1 Group 8 Seq 7 Reading mem 0
  Mem# 0 errs 0: /data2/oradata/redo0802.log
Block recovery stopped at EOT rba 7.180988.16
Block recovery completed at rba 7.180988.16, scn 50.24441
Sat Nov 22 18:00:32 2014
Stopping background process MMNL
Sat Nov 22 18:00:38 2014
Stopping background process MMON
Sat Nov 22 18:00:41 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_smon_15395.trc:
ORA-00600: internal error code, arguments: [4137], [], [], [], [], [], [], []
Sat Nov 22 18:00:42 2014
ORACLE Instance xifenfei (pid = 9) - Error 600 encountered while recovering transaction (3, 4).
Sat Nov 22 18:00:42 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_smon_15395.trc:
ORA-00600: internal error code, arguments: [4137], [], [], [], [], [], [], []

这里都是很常规的错误,查询undo$也已经正常,重建新undo表空间删除老undo,然后alert日志中无其他报错,数据库恢复至此完成,建议客户导出导入重建数据库

对于IN Memory Option 部分细节测试—主要当inmemory_size不足之时

本文对于IMDB的几个特性进行了具体测试:
1. 压缩级别和压缩率(具体也需要具体测试),本实验仅提供参考
2. 对于IM空间不足已经存在的对象和加入新对象的现象
3. 对于PRIORITY级别进行了简单测试
数据库基本配置信息

SQL> select * from v$version;
BANNER                                                                               CON_ID
-------------------------------------------------------------------------------- ----------
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production              0
PL/SQL Release 12.1.0.2.0 - Production                                                    0
CORE    12.1.0.2.0      Production                                                        0
TNS for 64-bit Windows: Version 12.1.0.2.0 - Production                                   0
NLSRTL Version 12.1.0.2.0 - Production                                                    0
SQL> show parameter inmemory;
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
inmemory_clause_default              string
inmemory_force                       string      DEFAULT
inmemory_max_populate_servers        integer     2
inmemory_query                       string      ENABLE
inmemory_size                        big integer 100M
inmemory_trickle_repopulate_servers_ integer     1
percent
optimizer_inmemory_aware             boolean     TRUE

数据库版本12.1.0.2,inmemory_size配置为100M

准备测试环境

SQL> create tablespace inmemory datafile 'D:\APP\FFCHENG\ORADATA\XFF\PDB\in_memory01.dbf'
   2  size 100m autoextend on next 4m maxsize 10g;
表空间已创建。
SQL> create user chf identified by xifenfei;
用户已创建。
SQL> grant dba to chf;
授权成功。
SQL> alter user chf default tablespace inmemory;
用户已更改。
SQL> create table chf.t_inmemory1 as select * from dba_objects;
表已创建。

创建测试表空间,用户,测试表

测试压缩级别

SQL> alter table chf.t_inmemory1 inmemory NO MEMCOMPRESS;
表已更改。
SQL>  select segment_name,bytes,inmemory_size,INMEMORY_PRIORITY,INMEMORY_COMPRESSION
  2  from v$im_segments where segment_name = 'T_INMEMORY1';
未选定行
SQL> SELECT COUNT(*) FROM chf.t_inmemory1;
  COUNT(*)
----------
     91040
SQL>  select segment_name,bytes,inmemory_size,INMEMORY_PRIORITY,INMEMORY_COMPRESSION
  2  from v$im_segments where segment_name = 'T_INMEMORY1';
SEGMENT_NAME
--------------------------------------------------------------------------------
     BYTES INMEMORY_SIZE INMEMORY INMEMORY_COMPRESS
---------- ------------- -------- -----------------
T_INMEMORY1
  13631488      10616832 NONE     NO MEMCOMPRESS
--NO MEMCOMPRESS 压缩比例非常小,基本上不压缩
SQL> alter table chf.t_inmemory1 no inmemory ;
表已更改。
SQL>  select segment_name,bytes,inmemory_size,INMEMORY_PRIORITY,INMEMORY_COMPRESSION
  2  from v$im_segments where segment_name = 'T_INMEMORY1';
未选定行
SQL>  alter table chf.t_inmemory1 inmemory MEMCOMPRESS FOR DML
  2  ;
表已更改。
SQL> SELECT COUNT(*) FROM chf.t_inmemory1;
  COUNT(*)
----------
     91040
SQL>  select segment_name,bytes,inmemory_size,INMEMORY_PRIORITY,INMEMORY_COMPRESSION
  2  from v$im_segments where segment_name = 'T_INMEMORY1';
SEGMENT_NAME
--------------------------------------------------------------------------------
     BYTES INMEMORY_SIZE INMEMORY INMEMORY_COMPRESS
---------- ------------- -------- -----------------
T_INMEMORY1
  13631488      10616832 NONE     FOR DML
--FOR DML 压缩比例非常小,基本上不压缩和NO MEMCOMPRESS在压缩效果上类似
SQL> alter table chf.t_inmemory1 no inmemory ;
表已更改。
SQL> alter table chf.t_inmemory1 inmemory MEMCOMPRESS FOR QUERY LOW;
表已更改。
SQL> SELECT COUNT(*) FROM chf.t_inmemory1;
  COUNT(*)
----------
     91040
SQL>  select segment_name,bytes,inmemory_size,INMEMORY_PRIORITY,INMEMORY_COMPRESSION
  2  from v$im_segments where segment_name = 'T_INMEMORY1';
SEGMENT_NAME
--------------------------------------------------------------------------------
     BYTES INMEMORY_SIZE INMEMORY INMEMORY_COMPRESS
---------- ------------- -------- -----------------
T_INMEMORY1
  13631488       4325376 NONE     FOR QUERY LOW
-- FOR QUERY LOW为默认压缩级别,这里看压缩比例在3:1左右,具体取决于数据
SQL> alter table chf.t_inmemory1 no inmemory ;
表已更改。
SQL> alter table chf.t_inmemory1 inmemory MEMCOMPRESS FOR QUERY HIGH;
表已更改。
SQL>  select segment_name,bytes,inmemory_size,INMEMORY_PRIORITY,INMEMORY_COMPRESSION
  2  from v$im_segments where segment_name = 'T_INMEMORY1';
未选定行
SQL> SELECT COUNT(*) FROM chf.t_inmemory1;
  COUNT(*)
----------
     91040
SQL>  select segment_name,bytes,inmemory_size,INMEMORY_PRIORITY,INMEMORY_COMPRESSION
  2  from v$im_segments where segment_name = 'T_INMEMORY1';
SEGMENT_NAME
--------------------------------------------------------------------------------
     BYTES INMEMORY_SIZE INMEMORY INMEMORY_COMPRESS
---------- ------------- -------- -----------------
T_INMEMORY1
  13631488       3276800 NONE     FOR QUERY HIGH
-- FOR QUERY HIGH,这里看压缩比例在4:1左右,具体取决于数据
SQL> alter table chf.t_inmemory1 no inmemory ;
表已更改。
SQL> alter table chf.t_inmemory1 inmemory MEMCOMPRESS FOR CAPACITY LOW;
表已更改。
SQL> SELECT COUNT(*) FROM chf.t_inmemory1;
  COUNT(*)
----------
     91040
SQL>  select segment_name,bytes,inmemory_size,INMEMORY_PRIORITY,INMEMORY_COMPRESSION
  2  from v$im_segments where segment_name = 'T_INMEMORY1';
SEGMENT_NAME
--------------------------------------------------------------------------------
     BYTES INMEMORY_SIZE INMEMORY INMEMORY_COMPRESS
---------- ------------- -------- -----------------
T_INMEMORY1
  13631488       2228224 NONE     FOR CAPACITY LOW
-- FOR CAPACITY LOW,这里看压缩比例在6:1左右,具体取决于数据
SQL> alter table chf.t_inmemory1 no inmemory ;
表已更改。
SQL> alter table chf.t_inmemory1 inmemory MEMCOMPRESS FOR CAPACITY HIGH;
表已更改。
SQL> SELECT COUNT(*) FROM chf.t_inmemory1;
  COUNT(*)
----------
     91040
SQL>  select segment_name,bytes,inmemory_size,INMEMORY_PRIORITY,INMEMORY_COMPRESSION
  2  from v$im_segments where segment_name = 'T_INMEMORY1';
SEGMENT_NAME
--------------------------------------------------------------------------------
     BYTES INMEMORY_SIZE INMEMORY INMEMORY_COMPRESS
---------- ------------- -------- -----------------
T_INMEMORY1
  13631488        131072 NONE     FOR CAPACITY HIGH
-- FOR CAPACITY HIGH,这里看压缩比例在10:1左右,具体取决于数据

这里可以看出来,压缩效果确实如Oracle所描述,级别越高压缩效果越好.

测试inmemory_size大小不足之时

SQL> alter table chf.t_inmemory1 inmemory no MEMCOMPRESS;
表已更改。
SQL> SELECT COUNT(*) FROM chf.t_inmemory1;
  COUNT(*)
----------
     91040
SQL>  select segment_name,bytes,inmemory_size,INMEMORY_PRIORITY,INMEMORY_COMPRESSION
  2  from v$im_segments where segment_name = 'T_INMEMORY1';
SEGMENT_NAME
--------------------------------------------------------------------------------
     BYTES INMEMORY_SIZE INMEMORY INMEMORY_COMPRESS
---------- ------------- -------- -----------------
T_INMEMORY1
  13631488      10616832 NONE     NO MEMCOMPRESS
--dml插入数据,不再次查询数据,v$im_segments.inmemory_size不发生改变(这个是bug还是设计考虑??)
SQL> insert into chf.t_inmemory1 select * from chf.t_inmemory1;
已创建 91040 行。
SQL>  select segment_name,bytes,inmemory_size,INMEMORY_PRIORITY,INMEMORY_COMPRESSION
  2  from v$im_segments where segment_name = 'T_INMEMORY1';
SEGMENT_NAME
--------------------------------------------------------------------------------
     BYTES INMEMORY_SIZE INMEMORY INMEMORY_COMPRESS
---------- ------------- -------- -----------------
T_INMEMORY1
  26214400      10616832 NONE     NO MEMCOMPRESS
SQL> commit;
提交完成。
SQL>  select segment_name,bytes,inmemory_size,INMEMORY_PRIORITY,INMEMORY_COMPRESSION
  2  from v$im_segments where segment_name = 'T_INMEMORY1';
SEGMENT_NAME
--------------------------------------------------------------------------------
     BYTES INMEMORY_SIZE INMEMORY INMEMORY_COMPRESS
---------- ------------- -------- -----------------
T_INMEMORY1
  26214400      10616832 NONE     NO MEMCOMPRESS
SQL> SELECT COUNT(*) FROM chf.t_inmemory1;
  COUNT(*)
----------
    182080
SQL>  select segment_name,bytes,inmemory_size,INMEMORY_PRIORITY,INMEMORY_COMPRESSION
  2  from v$im_segments where segment_name = 'T_INMEMORY1';
SEGMENT_NAME
--------------------------------------------------------------------------------
     BYTES INMEMORY_SIZE INMEMORY INMEMORY_COMPRESS
---------- ------------- -------- -----------------
T_INMEMORY1
  26214400      22282240 NONE     NO MEMCOMPRESS
SQL> insert into chf.t_inmemory1 select * from chf.t_inmemory1;
已创建 182080 行。
SQL> commit;
提交完成。
SQL>  select segment_name,bytes,inmemory_size,INMEMORY_PRIORITY,INMEMORY_COMPRESSION
  2  from v$im_segments where segment_name = 'T_INMEMORY1';
SEGMENT_NAME
--------------------------------------------------------------------------------
     BYTES INMEMORY_SIZE INMEMORY INMEMORY_COMPRESS
---------- ------------- -------- -----------------
T_INMEMORY1
  51380224      22282240 NONE     NO MEMCOMPRESS
--通过10046证明,虽然v$im_segments.inmemory_size值未及时更新,但是IMDB是生效的
SQL> oradebug setmypid
已处理的语句
SQL> alter session set db_file_multiblocK_read_count=1;
会话已更改。
SQL> oradebug EVENT 10046 TRACE NAME CONTEXT FOREVER, LEVEL 12
已处理的语句
SQL> oradebug TRACEFILE_NAME
D:\APP\FFCHENG\diag\rdbms\xff\xff\trace\xff_ora_7604.trc
SQL> SELECT COUNT(object_id) FROM chf.t_inmemory1;
COUNT(OBJECT_ID)
----------------
          364156
SQL> oradebug EVENT 10046 trace name context off
已处理的语句
PARSING IN CURSOR #455134016 len=44 dep=0 uid=0 oct=3 lid=0 tim=126773093621 hv=1133975269
 ad='7ff07339500' sqlid='5909ukj1tf5r5'
SELECT COUNT(object_id) FROM chf.t_inmemory1
END OF STMT
PARSE #455134016:c=15600,e=3912,p=0,cr=2,cu=0,mis=1,r=0,dep=0,og=1,plh=3154396630,tim=126773093620
WAIT #455134016: nam='Disk file operations I/O' ela= 154 FileOperation=8 fileno=0 filetype=8 obj#=-1 tim=126773093926
EXEC #455134016:c=0,e=25,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=3154396630,tim=126773094005
WAIT #455134016: nam='SQL*Net message to client' ela= 2 driver id=1111838976 #bytes=1 p3=0 obj#=-1 tim=126773094044
FETCH #455134016:c=0,e=13751,p=0,cr=3110,cu=1,mis=0,r=1,dep=0,og=1,plh=3154396630,tim=126773107829
STAT #455134016 id=1 cnt=1 pid=0 pos=1 obj=0 op='SORT AGGREGATE (cr=3110 pr=0 pw=0 time=13751 us)'
STAT #455134016 id=2 cnt=364160 pid=1 pos=1 obj=91914 op='TABLE ACCESS INMEMORY FULL T_INMEMORY1 (cr=3110 pr=0
  pw=0 time=5386 us cost=17 size=455200 card=91040)'
WAIT #455134016: nam='SQL*Net message from client' ela= 116 driver id=1111838976 #bytes=1 p3=0 obj#=-1 tim=126773108164
FETCH #455134016:c=0,e=1,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=0,plh=3154396630,tim=126773108215
WAIT #455134016: nam='SQL*Net message to client' ela= 0 driver id=1111838976 #bytes=1 p3=0 obj#=-1 tim=126773108246
SQL>  select segment_name,bytes,inmemory_size,INMEMORY_PRIORITY,INMEMORY_COMPRESSION
  2  from v$im_segments where segment_name = 'T_INMEMORY1';
SEGMENT_NAME
--------------------------------------------------------------------------------
     BYTES INMEMORY_SIZE INMEMORY INMEMORY_COMPRESS
---------- ------------- -------- -----------------
T_INMEMORY1
  51380224      43384832 NONE     NO MEMCOMPRESS
SQL> SELECT COUNT(*) FROM chf.t_inmemory1;
  COUNT(*)
----------
    364160
SQL>  select segment_name,bytes,inmemory_size,INMEMORY_PRIORITY,INMEMORY_COMPRESSION
  2  from v$im_segments where segment_name = 'T_INMEMORY1';
SEGMENT_NAME
--------------------------------------------------------------------------------
     BYTES INMEMORY_SIZE INMEMORY INMEMORY_COMPRESS
---------- ------------- -------- -----------------
T_INMEMORY1
  51380224      43384832 NONE     NO MEMCOMPRESS
SQL> insert into chf.t_inmemory1 select * from chf.t_inmemory1;
已创建 364160 行。
SQL>  select segment_name,bytes,inmemory_size,INMEMORY_PRIORITY,INMEMORY_COMPRESSION
  2  from v$im_segments where segment_name = 'T_INMEMORY1';
SEGMENT_NAME
--------------------------------------------------------------------------------
     BYTES INMEMORY_SIZE INMEMORY INMEMORY_COMPRESS
---------- ------------- -------- -----------------
T_INMEMORY1
 109051904      43384832 NONE     NO MEMCOMPRESS
SQL> commit;
提交完成。
SQL> SELECT COUNT(*) FROM chf.t_inmemory1;
  COUNT(*)
----------
    728320
SQL>  select segment_name,bytes,inmemory_size,INMEMORY_PRIORITY,INMEMORY_COMPRESSION
  2  from v$im_segments where segment_name = 'T_INMEMORY1';
SEGMENT_NAME
--------------------------------------------------------------------------------
     BYTES INMEMORY_SIZE INMEMORY INMEMORY_COMPRESS
---------- ------------- -------- -----------------
T_INMEMORY1
 109051904      43384832 NONE     NO MEMCOMPRESS
SQL>  select segment_name,bytes,inmemory_size,INMEMORY_PRIORITY,INMEMORY_COMPRESSION
  2  from v$im_segments where segment_name = 'T_INMEMORY1';
SEGMENT_NAME
--------------------------------------------------------------------------------
     BYTES INMEMORY_SIZE INMEMORY INMEMORY_COMPRESS
---------- ------------- -------- -----------------
T_INMEMORY1
 109051904      71892992 NONE     NO MEMCOMPRESS
--这里可以看出来INMEMORY_SIZE已经使用了71892992,再插入一次数据,一共100M的IM肯定不够使用
SQL> insert into chf.t_inmemory1 select * from chf.t_inmemory1;
已创建 728320 行。
SQL> commit;
提交完成。
SQL> select count(object_id) from chf.t_inmemory1;
COUNT(OBJECT_ID)
----------------
         1456624
SQL>  select segment_name,bytes,inmemory_size,INMEMORY_PRIORITY,INMEMORY_COMPRESSION
  2  from v$im_segments where segment_name = 'T_INMEMORY1';
SEGMENT_NAME
--------------------------------------------------------------------------------
     BYTES INMEMORY_SIZE INMEMORY INMEMORY_COMPRESS
---------- ------------- -------- -----------------
T_INMEMORY1
 201326592      63438848 NONE     NO MEMCOMPRESS
--这里现在的INMEMORY_SIZE变为了63438848小于在插入数据之前的71892992,证明IM肯定出现问题,比如已经满了,
  v$im_segments显示值不准确
--测试刷新buffer_cache对IM的影响
SQL> alter system flush buffer_cache;
系统已更改。
SQL>  select segment_name,bytes,inmemory_size,INMEMORY_PRIORITY,INMEMORY_COMPRESSION
  2  from v$im_segments where segment_name = 'T_INMEMORY1';
SEGMENT_NAME
--------------------------------------------------------------------------------
     BYTES INMEMORY_SIZE INMEMORY INMEMORY_COMPRESS
---------- ------------- -------- -----------------
T_INMEMORY1
 201326592      63438848 NONE     NO MEMCOMPRESS
--结果证明无影响
autotrace结果
SQL> set autot trace exp stat
SQL> set lines 120
SQL> pages 1000
SQL> set pages 1000
SQL> select count(*) from chf.t_inmemory1;
执行计划
----------------------------------------------------------
Plan hash value: 3154396630
-----------------------------------------------------------------------------------
| Id  | Operation                   | Name        | Rows  | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |             |     1 |    16   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE             |             |     1 |            |          |
|   2 |   TABLE ACCESS INMEMORY FULL| T_INMEMORY1 | 91040 |    16   (0)| 00:00:01 |
-----------------------------------------------------------------------------------
统计信息
----------------------------------------------------------
          5  recursive calls
          0  db block gets
      16693  consistent gets
      16690  physical reads
          0  redo size
        546  bytes sent via SQL*Net to client
        552  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed
10046结果
SQL ID: 1b61dgunxftdx Plan Hash: 3154396630
select count(object_id)
from
 chf.t_inmemory1
call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          2          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      1.26       4.14      16689      22446          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      1.26       4.14      16689      22448          0           1
Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: SYS
Number of plan statistics captured: 1
Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         1          1          1  SORT AGGREGATE (cr=22446 pr=16689 pw=0 time=4144536 us)
   1456640    1456640    1456640   TABLE ACCESS INMEMORY FULL T_INMEMORY1 (cr=22446 pr=16689 pw=0
                                         time=2560999 us cost=17 size=455200 card=91040)
Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  Disk file operations I/O                        1        0.00          0.00
  SQL*Net message to client                       2        0.00          0.00
  db file sequential read                     16689        0.03          3.05
  SQL*Net message from client                     2        5.40          5.40
--autotrace和10046都证明,当IM size不足之时,数据库未能够使用IM的特性,哪怕是部分也不能使用
--创建新对象存放IM中
SQL> create table chf.t_inmemory2 as select * from dba_objects;
表已创建。
SQL> alter table chf.t_inmemory2 inmemory;
表已更改。
SQL> select count(*) from chf.t_inmemory2;
  COUNT(*)
----------
     91041
SQL>  select segment_name,bytes,inmemory_size,INMEMORY_PRIORITY,INMEMORY_COMPRESSION
  2  from v$im_segments where segment_name = 'T_INMEMORY1';
SEGMENT_NAME
--------------------------------------------------------------------------------
     BYTES INMEMORY_SIZE INMEMORY INMEMORY_COMPRESS
---------- ------------- -------- -----------------
T_INMEMORY1
 201326592      63438848 NONE     NO MEMCOMPRESS
SQL> select count(*) from chf.t_inmemory2;
执行计划
----------------------------------------------------------
Plan hash value: 2042227318
-----------------------------------------------------------------------------------
| Id  | Operation                   | Name        | Rows  | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |             |     1 |    16   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE             |             |     1 |            |          |
|   2 |   TABLE ACCESS INMEMORY FULL| T_INMEMORY2 | 91041 |    16   (0)| 00:00:01 |
-----------------------------------------------------------------------------------
统计信息
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       1532  consistent gets
       1530  physical reads
          0  redo size
        545  bytes sent via SQL*Net to client
        552  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed
SQL> alter table chf.t_inmemory1 no inmemory;
表已更改。
SQL>  select segment_name,bytes,inmemory_size,INMEMORY_PRIORITY,INMEMORY_COMPRESSION
   2  from v$im_segments where segment_name = 'T_INMEMORY2';
未选定行
SQL> set autot traceonly exp stat
SQL> select count(*) from chf.t_inmemory2;
执行计划
----------------------------------------------------------
Plan hash value: 2042227318
-----------------------------------------------------------------------------------
| Id  | Operation                   | Name        | Rows  | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |             |     1 |    16   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE             |             |     1 |            |          |
|   2 |   TABLE ACCESS INMEMORY FULL| T_INMEMORY2 | 91041 |    16   (0)| 00:00:01 |
-----------------------------------------------------------------------------------
统计信息
----------------------------------------------------------
         57  recursive calls
          0  db block gets
       1565  consistent gets
       1532  physical reads
          0  redo size
        545  bytes sent via SQL*Net to client
        552  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          5  sorts (memory)
          0  sorts (disk)
          1  rows processed
SQL> set autot off
SQL>  select segment_name,bytes,inmemory_size,INMEMORY_PRIORITY,INMEMORY_COMPRESSION
   2  from v$im_segments where segment_name = 'T_INMEMORY2';
SEGMENT_NAME
-----------------------------------------------------------------------------------------
     BYTES INMEMORY_SIZE INMEMORY INMEMORY_COMPRESS
---------- ------------- -------- -----------------
T_INMEMORY2
  13631488       4325376 NONE     FOR QUERY LOW
SQL> set autot traceonly exp stat
SQL> select count(*) from chf.t_inmemory2;
执行计划
----------------------------------------------------------
Plan hash value: 2042227318
-----------------------------------------------------------------------------------
| Id  | Operation                   | Name        | Rows  | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |             |     1 |    53   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE             |             |     1 |            |          |
|   2 |   TABLE ACCESS INMEMORY FULL| T_INMEMORY2 | 91041 |    53   (0)| 00:00:01 |
-----------------------------------------------------------------------------------
统计信息
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          3  consistent gets
          0  physical reads
          0  redo size
        545  bytes sent via SQL*Net to client
        552  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed
--当IM已经无空闲空间之时,创建新对象在PRIORITY未提升之前,即便是设置了IM和对对象进行了访问,也无法存入IM

上述测试几个结论:
1. 随着IM中对象的增加,当INMEMORY_SIZE不足之时,v$im_segments.INMEMORY_SIZE显示不准确
2. 随着IM中对象的增加,当INMEMORY_SIZE不足之时,当IM中的对象不能全部在IM中之时,对其对象操作,会转换成传统数据库操作,
不会使用部分的IM特性,但是执行计划依然提示使用INMEMORY
3. flush buffer_cache 不影响对象的IM
4. 当IM已经无空闲空间之时,创建新对象在PRIORITY未提升之前,即便是设置了IM和对对象进行了访问,也无法存入IM,
访问依然是传统方式,但是执行计划是INMEMORY

测试PRIORITY

SQL>  select segment_name,bytes,inmemory_size,INMEMORY_PRIORITY,INMEMORY_COMPRESSION from v$im_segments;
SEGMENT_NAME
------------------------------------------------------------------------------------------------------------------------
     BYTES INMEMORY_SIZE INMEMORY INMEMORY_COMPRESS
---------- ------------- -------- -----------------
T_INMEMORY1
 201326592      57999360 NONE     NO MEMCOMPRESS
SQL> alter table chf.t_inmemory2  inmemory;
表已更改。
SQL> select count(*) from chf.t_inmemory2;
  COUNT(*)
----------
     91041
SQL>  select segment_name,bytes,inmemory_size,INMEMORY_PRIORITY,INMEMORY_COMPRESSION from v$im_segments;
SEGMENT_NAME
------------------------------------------------------------------------------------------------------------------------
     BYTES INMEMORY_SIZE INMEMORY INMEMORY_COMPRESS
---------- ------------- -------- -----------------
T_INMEMORY1
 201326592      57999360 NONE     NO MEMCOMPRESS
SQL> alter table chf.t_inmemory1  inmemory no memcompress PRIORITY LOW;
表已更改。
SQL>  select segment_name,bytes,inmemory_size,INMEMORY_PRIORITY,INMEMORY_COMPRESSION from v$im_segments;
未选定行
SQL> select count(*) from chf.t_inmemory2;
  COUNT(*)
----------
     91041
SQL>  select segment_name,bytes,inmemory_size,INMEMORY_PRIORITY,INMEMORY_COMPRESSION from v$im_segments;
SEGMENT_NAME
------------------------------------------------------------------------------------------------------------------------
     BYTES INMEMORY_SIZE INMEMORY INMEMORY_COMPRESS
---------- ------------- -------- -----------------
T_INMEMORY2
  13631488       4325376 NONE     FOR QUERY LOW

这里测试证明:
1. 指定PRIORITY不为none也需要访问对象后才能够放入IM中
2. 当IM不足时,PRIORITY级别高的会把级别低的对象刷出IM

特此声明:本文仅出自个人测试,得出结论,不可作为任何官方依据使用,具体环境需要具体测试

记录一次rm -rf 删除数据文件异常恢复

因为人员离职闹得不愉快,系统工程师离职后,由于公司未及时关闭其vpn,数据库服务器(Linux 6.5 Oracle 11.2.0.1)帐号未及时被修改,最后直接上去rm ORACLE_BASE给干掉,悲剧的是ORADATA目录也在里面,更加悲剧的是所有数据文件都在里面.也就是说数据库彻底被删除,而且没有任何备份.朋友咨询了我,让我给予支持.最后比较幸运,文件没有被覆盖,inode都还在,通过extundelete顺利恢复所有数据文件,控制文件,redo文件(extundelete恢复Linux被删除文件),数据库顺利打开,实现0丢失,算是一次完美的恢复

[root@DB1 tmp]# tar xvf extundelete-0.2.4.tar
extundelete-0.2.4/
extundelete-0.2.4/acinclude.m4
extundelete-0.2.4/missing
extundelete-0.2.4/autogen.sh
extundelete-0.2.4/aclocal.m4
extundelete-0.2.4/configure
extundelete-0.2.4/LICENSE
extundelete-0.2.4/README
extundelete-0.2.4/install-sh
extundelete-0.2.4/config.h.in
extundelete-0.2.4/src/
extundelete-0.2.4/src/extundelete.cc
extundelete-0.2.4/src/block.h
extundelete-0.2.4/src/kernel-jbd.h
extundelete-0.2.4/src/insertionops.cc
extundelete-0.2.4/src/block.c
extundelete-0.2.4/src/cli.cc
extundelete-0.2.4/src/extundelete-priv.h
extundelete-0.2.4/src/extundelete.h
extundelete-0.2.4/src/jfs_compat.h
extundelete-0.2.4/src/Makefile.in
extundelete-0.2.4/src/Makefile.am
extundelete-0.2.4/configure.ac
extundelete-0.2.4/depcomp
extundelete-0.2.4/Makefile.in
extundelete-0.2.4/Makefile.am
[root@DB1 tmp]# cd extundelete-0.2.4
[root@DB1 extundelete-0.2.4]# ./configure
Configuring extundelete 0.2.4
Writing generated files to disk
[root@DB1 extundelete-0.2.4]# make && make install
make -s all-recursive
Making all in src
Making install in src
  /usr/bin/install -c extundelete '/usr/local/bin'
[root@DB1 extundelete-0.2.4]# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda3       244G   11G  221G   5% /
tmpfs            16G   72K   16G   1% /dev/shm
/dev/sda1       190M   62M  119M  35% /boot
/dev/sdb1       2.0T   71M  1.9T   1% /home
[root@DB1 extundelete-0.2.4]# umount /dev/sdb1
umount: /home: device is busy.
        (In some cases useful info about processes that use
         the device is found by lsof(8) or fuser(1))
[root@DB1 extundelete-0.2.4]# fuser -m -u /home
/home:                3914c(oracle)  8372c(oracle)
[root@DB1 extundelete-0.2.4]# kill -9 3914
[root@DB1 extundelete-0.2.4]# fuser -m -u /home
/home:                8372c(oracle)
[root@DB1 extundelete-0.2.4]# kill -9 8372
[root@DB1 extundelete-0.2.4]# fuser -m -u /home
[root@DB1 extundelete-0.2.4]# umount /dev/sdb1
[root@DB1 extundelete-0.2.4]# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda3       244G   11G  221G   5% /
tmpfs            16G   72K   16G   1% /dev/shm
/dev/sda1       190M   62M  119M  35% /boot
[root@DB1 extundelete-0.2.4]# extundelete /dev/sdb1 --restore-all
NOTICE: Extended attributes are not restored.
Loading filesystem metadata ... 16384 groups loaded.
Loading journal descriptors ... 26542 descriptors loaded.
Searching for recoverable inodes in directory / ...
18896 recoverable inodes found.
Looking through the directory structure for deleted files ...
2 recoverable inodes still lost.
Unable to restore inode 43778050 (file.43778050): Space has been reallocated.
[root@DB1 extundelete-0.2.4]# ls
acinclude.m4  autogen.sh  config.h.in  config.status  configure.ac  install-sh  Makefile     Makefile.in
aclocal.m4    config.h    config.log   configure      depcomp       LICENSE     Makefile.am  missing
[root@DB1 extundelete-0.2.4]# cd RECOVERED_FILES/
[root@DB1 RECOVERED_FILES]# ls
app  file.43778051  oracle  oraInventory
[root@DB1 RECOVERED_FILES]# cd app
[root@DB1 app]# ls
admin  cfgtoollogs  diag  oracle  oradata  orcl  ORCL
[root@DB1 app]# cd oradata
[root@DB1 oradata]# ls
orcl
[root@DB1 oradata]# cd orcl
[root@DB1 orcl]# ls
control01.ctl  redo01.log  redo02.log  redo03.log  sysaux01.dbf  system01.dbf  undotbs01.dbf  users01.dbf
[root@DB1 orcl]# ls -ltr
total 2908776
-rw-r--r--. 1 root root  734011392 Nov 18 02:06 system01.dbf
-rw-r--r--. 1 root root 1069555712 Nov 18 02:06 sysaux01.dbf
-rw-r--r--. 1 root root  120594432 Nov 18 02:06 undotbs01.dbf
-rw-r--r--. 1 root root  887365632 Nov 18 02:06 users01.dbf
-rw-r--r--. 1 root root    9748480 Nov 18 02:06 control01.ctl
-rw-r--r--. 1 root root   52429312 Nov 18 02:06 redo01.log
-rw-r--r--. 1 root root   52429312 Nov 18 02:06 redo02.log
-rw-r--r--. 1 root root   52429312 Nov 18 02:06 redo03.log
[root@DB1 orcl]#

再次提醒各位:数据库备份重于一切,防天灾的同时还要防人灾,也希望圈子里面以后不要听到类似故障.

In-Memory整体汇总

本问是对于Oracle 12C中的In-Memory Column Store一个整体的汇总,具体细节知识在以后章节中展开
IM可以针对如下级别进行操作
Column
Table
Materialized view
Tablespace
Partition

可以指定In-Memory操作语句
CREATE TABLE
ALTER TABLE
CREATE TABLESPACE
ALTER TABLESPACE
CREATE MATERIALIZED VIEW
ALTER MATERIALIZED VIEW

压缩级别
IM-Compression-Methods
优先级
IM-Priority-Levels
对象级别操作IM

CREATE TABLE t_xifenfei (
     id        NUMBER(5) PRIMARY KEY,
     test_col  VARCHAR2(15))
  INMEMORY;
ALTER TABLE t_xifenfei INMEMORY;
ALTER TABLE t_xifenfei INMEMORY MEMCOMPRESS FOR CAPACITY LOW;
ALTER TABLE t_xifenfei INMEMORY PRIORITY HIGH;
ALTER TABLE t_xifenfei INMEMORY
  MEMCOMPRESS FOR CAPACITY HIGH
  PRIORITY LOW;
ALTER TABLE t_xifenfei
   INMEMORY MEMCOMPRESS FOR QUERY (
      product_id, product_name, category_id, supplier_id, min_price)
   INMEMORY MEMCOMPRESS FOR CAPACITY HIGH (
      product_description, warranty_period, product_status, list_price)
   NO INMEMORY (
      weight_class, catalog_url);
ALTER TABLE t_xifenfei NO INMEMORY;

补充说明:列级别设置的优先级无效,优先级是表(物化视图)或者分区表级别

表空间级别操作IM

CREATE TABLESPACE xifenfie_im
   DATAFILE '/u02/xifenfei.dbf' SIZE 40M
   ONLINE
   DEFAULT INMEMORY;
ALTER TABLESPACE xifenfie_im DEFAULT INMEMORY
   MEMCOMPRESS FOR CAPACITY HIGH
   PRIORITY LOW;

物化视图级别

CREATE MATERIALIZED VIEW oe.prod_info_mv INMEMORY
  AS SELECT * FROM t_xifenfei;
ALTER MATERIALIZED VIEW oe.prod_info_mv INMEMORY PRIORITY HIGH;

适合使用IN-Memory操作
A query that scans a large number of rows and applies filters that use operators such as the following: =, <, >, and IN
A query that selects a small number of columns from a table or materialized view with a large number of columns,
such as a query that selects five columns from a table with 100 columns
A query that joins a small table to a large table
A query that aggregates data

不适合使用IN-Memory操作
Queries with complex predicates
Queries that select a large number of columns
Queries that return a large number of rows
Queries with multiple large table joins

IM控制参数
INMEMORY_SIZE 指定IM分配内存大小,默认值为0,如果启动该值最小为100M;如果在CDB环境中使用,CDB级别设置为整个库级别限制,PDB默认继承CDB设置,但是在实际使用中PDB中总数不能超过CDB限制
INMEMORY_FORCE 指定是否允许数据库中对象使用IM,默认是DEFAULT,即可以实现在对象级别定义INMEMORY or NO INMEMORY,如果设置为OFF 即表示表或者物化视图无法使用IM
INMEMORY_CLAUSE_DEFAULT 默认为空,和NO INMEMORY意义相同,表示创建新对象默认不启用IM,如果配置为INMEMORY,表示新创建对象默认启用IM
INMEMORY_QUERY 默认为TRUE,表示查询是否使用IM特性,设置为FALSE表示查询不使用IM特性
INMEMORY_MAX_POPULATE_SERVERS 默认和系统core一致,用途是把你的表中数据写入到IM中
INMEMORY_TRICKLE_REPOPULATE_SERVERS_PERCENT 控制IM中对象数据的重新载入的进程数,该值为INMEMORY_MAX_POPULATE_SERVERS参数的百分比
OPTIMIZER_INMEMORY_AWARE 该参数是控制优化器成本计算时是否考虑IM,默认为TRUE

impdp 操作IM
TRANSFORM=INMEMORY:y 继承IM导出对象属性
TRANSFORM=INMEMORY:n 不继承IM导出对象属性
TRANSFORM=INMEMORY_CLAUSE:string 修改IM导出对象关于IM的属性

参考文档:https://docs.oracle.com/database/121/ADMIN/memory.htm#ADMIN14257

如何估算表In-Memory需要内存大小

对于12.1.0.2的In-Memory特性很多朋友都已经知晓,现在可能有这个困惑我一张表启用In-Memory大概需要多少内存呢?该如何估算这个值呢?这里我告诉你通过dbms_compression可以完成你想做的事情
启用In-Memory功能

[oracle@www.xifenfei.com u02]$ sqlplus / as sysdba
SQL*Plus: Release 12.1.0.2.0 Production on Thu Aug 7 17:50:47 2014
Copyright (c) 1982, 2014, Oracle.  All rights reserved.
Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options
SQL> show parameter inmemory;
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
inmemory_clause_default              string
inmemory_force                       string      DEFAULT
inmemory_max_populate_servers        integer     0
inmemory_query                       string      ENABLE
inmemory_size                        big integer 0
inmemory_trickle_repopulate_servers_ integer     1
optimizer_inmemory_aware             boolean     TRUE
SQL> alter system set inmemory_size=400M;
alter system set inmemory_size=400M
*
ERROR at line 1:
ORA-02097: parameter cannot be modified because specified value is invalid
ORA-02095: specified initialization parameter cannot be modified
SQL>  alter system set inmemory_size=400M scope=spfile;
System altered.
SQL> show pdbs;
    CON_ID CON_NAME                       OPEN MODE  RESTRICTED
---------- ------------------------------ ---------- ----------
         2 PDB$SEED                       READ ONLY  NO
         3 PDB1                           READ WRITE NO
         4 PDB2                           READ WRITE NO
SQL> alter session set container=pdb1;
Session altered.
SQL> show parameter inmemory;
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
inmemory_clause_default              string
inmemory_force                       string      DEFAULT
inmemory_max_populate_servers        integer     0
inmemory_query                       string      ENABLE
inmemory_size                        big integer 0
inmemory_trickle_repopulate_servers_ integer     1
optimizer_inmemory_aware             boolean     TRUE
SQL> alter system set inmemory_size=200M;
alter system set inmemory_size=200M
*
ERROR at line 1:
ORA-02097: parameter cannot be modified because specified value is invalid
ORA-02095: specified initialization parameter cannot be modified
SQL> alter system set inmemory_size=200M scope=spfile;
alter system set inmemory_size=200M scope=spfile
                                               *
ERROR at line 1:
ORA-02096: specified initialization parameter is not modifiable with this
option
SQL> !oerr ora 2096
02096, 00000, "specified initialization parameter is not modifiable with this option"
// *Cause: Though the initialization parameter is modifiable, it cannot be
//         modified using the specified command.
// *Action: Check the DBA guide for information about under what scope
//          the parameter may be modified
SQL> select 200*1024*1024 from dual;
200*1024*1024
-------------
    209715200
SQL>  alter system set inmemory_size=209715200;
 alter system set inmemory_size=209715200
*
ERROR at line 1:
ORA-02097: parameter cannot be modified because specified value is invalid
ORA-02095: specified initialization parameter cannot be modified
SQL> shutdown immediate;
Pluggable Database closed.
SQL> conn / as sysdba
Connected.
SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup
ORACLE instance started.
Total System Global Area  838860800 bytes
Fixed Size                  2929936 bytes
Variable Size             360712944 bytes
Database Buffers           50331648 bytes
Redo Buffers                5455872 bytes
In-Memory Area            419430400 bytes
Database mounted.
Database opened.
SQL> show parameter inmemory;
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
inmemory_clause_default              string
inmemory_force                       string      DEFAULT
inmemory_max_populate_servers        integer     1
inmemory_query                       string      ENABLE
inmemory_size                        big integer 400M
inmemory_trickle_repopulate_servers_ integer     1
optimizer_inmemory_aware             boolean     TRUE
SQL> alter session set container=pdb1;
Session altered.
SQL> alter database open;
Database altered.
SQL> show parameter inmemory;
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
inmemory_clause_default              string
inmemory_force                       string      DEFAULT
inmemory_max_populate_servers        integer     1
inmemory_query                       string      ENABLE
inmemory_size                        big integer 400M
inmemory_trickle_repopulate_servers_ integer     1
optimizer_inmemory_aware             boolean     TRUE
SQL> alter system set inmemory_size=100M;
System altered.
SQL> show parameter inmemory;
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
inmemory_clause_default              string
inmemory_force                       string      DEFAULT
inmemory_max_populate_servers        integer     1
inmemory_query                       string      ENABLE
inmemory_size                        big integer 100M
inmemory_trickle_repopulate_servers_ integer     1
optimizer_inmemory_aware             boolean     TRUE

这里可以发现inmemory_query默认为true,但是inmemory_size默认为0,也就是说In-Memory Option默认是关闭的,如果启用直接设置inmemory_size值即可(该值最小值为100M),但是需要注意在cdb中需要重启后生效,pdb需要待cdb生效后才能够设置

估算表设置In-Memory默认需要内存空间

SQL> create user chf identified by xifenfei;
User created.
SQL> grant dba to chf;
Grant succeeded.
SQL> create table chf.t_xifenfei as select * from dba_objects;
Table created.
SQL> select sum(bytes), sum(blocks) from DBA_segments where segment_name = 'T_XIFENFEI';
SUM(BYTES) SUM(BLOCKS)
---------- -----------
  13631488        1664
SQL> set serveroutput on
DECLARE
l_blkcnt_cmp binary_integer;
l_blkcnt_uncmp binary_integer;
l_row_cmp binary_integer;
l_row_uncmp binary_integer;
l_cmp_ratio number;
SQL>   2    3    4    5    6    7  l_comptype_str varchar2(100);
  8  BEGIN
  9  dbms_compression.get_compression_ratio(
 10  scratchtbsname => upper('&ScratchTBS'),
 11  ownname => upper('&ownername'),
 12  objname => upper('&TableName'),
 13  subobjname => NULL,
 14  comptype => DBMS_COMPRESSION.COMP_INMEMORY_QUERY_LOW,
 15  blkcnt_cmp => l_blkcnt_cmp,
 16  blkcnt_uncmp => l_blkcnt_uncmp,
 17  row_cmp => l_row_cmp,
 18  row_uncmp => l_row_uncmp,
 19  cmp_ratio => l_cmp_ratio,
 20  comptype_str => l_comptype_str
 21  );
 22  dbms_output.put_line('.');
 23  dbms_output.put_line('OUTPUT: ');
 24  dbms_output.put_line('LINEORDER '||l_comptype_str||' ratio: '||to_char(l_cmp_ratio,'99.999'));
 25  end;
 26  /
Enter value for scratchtbs: USERS
old  10: scratchtbsname => upper('&ScratchTBS'),
new  10: scratchtbsname => upper('USERS'),
Enter value for ownername: CHF
old  11: ownname => upper('&ownername'),
new  11: ownname => upper('CHF'),
Enter value for tablename: T_XIFENFEI
old  12: objname => upper('&TableName'),
new  12: objname => upper('T_XIFENFEI'),
.
OUTPUT:
LINEORDER "In-memory Memcompress Query Low" ratio:   2.800
PL/SQL procedure successfully completed.
SQL> SELECT 13631488/2.800 FROM DUAL;
13631488/2.800
--------------
    4868388.57
SQL> alter table CHF.T_XIFENFEI inmemory;
Table altered.
SQL> select COUNT(*) FROM CHF.T_XIFENFEI;
  COUNT(*)
----------
     90923
SQL> select inmemory_size from v$im_segments where segment_name = 'T_XIFENFEI';
INMEMORY_SIZE
-------------
      4325376
SQL> select (4868388.57-4325376)/4325376 from dual;
(4868388.57-4325376)/4325376
----------------------------
                  .125541125

这里使用大家在ehcc中熟悉的dbms_compression.get_compression_ratio来估算In-memory需要的大概空间,例如本测试中,创建T_XIFENFEI表占用磁盘空间为13631488byte,使用dbms_compression估算在In-memory默认的压缩比例(Query Low)情况下,大概压缩比例为2.8,也就是通过合理估算,表启用In-memory cache之后,大概需要空间为13631488/2.800=4868388.57byte,最终通过实际测试需要空间为4325376byte,整体误差为(4868388.57-4325376)/4325376=12.5%左右
因此我们在使用In-memory cache一个表之时,如果不确定其需要内存大小,可以通过dbms_compression包来估算.另外In-memory还可以配置不同的压缩级别实现不同的压缩比例,其他压缩比例请见下图
compress-in_memory-1
compress-in_memory-2
详细link请见:https://docs.oracle.com/database/121/ARPLS/d_compress.htm#ARPLS65599

Oracle 异常恢复案例汇总

在2014年11月11日来临之际,我整理Blog中和异常恢复案例相关的部分文章,供大家参考:
dul处理分区表
误删除dual表恢复
dul恢复drop表测试
跳过obj$坏块方法
bbed解决ORA-01190
exp dmp文件损坏恢复
当前联机日志损坏恢复
ORA-01578坏块解决(1)
ORA-01578坏块解决(2)
undo异常处理步骤(9i)
DUL挖ORACLE 8.0数据库
undo异常处理步骤(10g)
ORA-600 2663 故障恢复
dul恢复truncate表测试
bbed处理ORA-01200故障
dul 10支持oracle 11g r2
使用 dul 挖数据文件初试
sysaux数据文件异常恢复
ORA-01244/ORA-01110解决
恢复被rm意外删除数据文件
ORA-00600[4194]故障解决
ORA-01207/ORA-00338恢复
DUL10直接支持ORACLE 8.0
bbed修改undo$(回滚段)状态
记录8.0.5数据库恢复过程
ORA-600[4194]/[4193]解决
使用rman找回被误删除表空间
记录一次系统回滚段坏块恢复
使用bbed解决ORA-00600[2662]
ORA-00600[kcfrbd_3]故障解决
数据库启动ORA-08103故障恢复
asm disk header 彻底损坏恢复
数据库恢复遭遇ORA-00600[3705]
Oracle 11g丢失access$恢复方法
undo segment header坏块异常恢复
ORA-600 kghstack_free2异常恢复
ORA-00600[kccpb_sanity_check_2]
重建控制文件引发ORA-00218故障
dul支持ORACLE 12C CDB数据库恢复
ORACLE 8.0.5 ORA-01207故障恢复
dul 10 export_mode=true功能增强
完美解决dul处理clob字段乱码问题
手工修复ASM DISK HEADER 异常
undo坏块导致数据库异常终止案
bbed 恢复 GLOBAL_NAME 为空故障
通过bbed解决ORA-00600[4000]案例
重建控制文件丢失数据文件导致悲剧
异常断电导致current redo损坏处理
创建控制文件遭遇ORA-600 kccscf_1
使用bbed修复损坏datafile header
通过修改控制文件scn推进数据库scn
bbed打开丢失部分system数据文件库
某集团ebs数据库redo undo丢失导致悲剧
obj$坏块exp/expdp导出不能正常执行
处理fast_recovery_area无剩余空间案例
obj$坏块情况下exp导出单个表解决方案
ORA-00600 [ktbdchk1: bad dscn] 解决
记录因磁盘头被重写,抢救redo恢复经历
rac redo log file被意外覆盖数据库恢复
使用bbed让rac中的sysaux数据文件online
通过bbed修改回滚段状态解决ORA-00704故障
处理smon清理临时段导致数据库异常案例
使用DUL挖数据文件恢复非数据外对象方法
一次侥幸的OSD-04016 O/S-Error异常恢复
记录一次ORA-600 4000数据库故障恢复
一起ORA-600 3020故障恢复的大体思路
redo异常 ORA-600 kclchkblk_4 故障恢复
Oracle 12C的第一次异常恢复—文件头坏块
数据库启动报ORA-00704 ORA-39714错误解决
ORACLE 8.1.7 数据库ORA-600 4000故障恢复
数据库报ORA-00607/ORA-00600[4194]错误
创建控制文件遭遇ORA-00600[3753]故障解决
ORA-00600[kcbshlc_1]导致数据库 down 案例
ORACLE 8.1.7 数据库ORA-600 4194故障恢复
ORA-00600[kcrf_resilver_log_1]异常恢复
控制文件异常导致ORA-00600[kccsbck_first]
分享一次ORA-01113 ORA-01110故障处理过程
记录一次ORA-600 3004 恢复过程和处理思路
Oracle安全警示录:加错裸设备导致redo异常
ORA-600 kcratr_nab_less_than_odr故障解决
双机mount数据库出现ORA-00600[kccsbck_first]
又一起存储故障导致ORA-00333 ORA-00312恢复
通过bbed模拟ORA-00607/ORA-00600 4194 故障
记录一次ORA-00316 ORA-00312 redo异常恢复
asmlib异常报ORA-00600[kfklLibFetchNext00]
某个pdb可以在root pdb open状态下进行恢复
使用bbed解决ORA-00607/ORA-00600[4194]故障
乱用_allow_resetlogs_corruption参数导致悲剧
遭遇ORA-07445[kkdliac()+346]使用odu抢救数据
ORA-27069: skgfdisp: 尝试在文件范围外执行 I/O
创建控制文件出现ORA-01565 ORA-27041 OSD-04002
记录一次system表空间坏块(ORA-01578)数据库恢复
模拟基表事务未提交数据库crash,undo丢失恢复异常恢复
重建控制文件丢失undo异常恢复—ORA-01173模拟与恢复
修改props$.NLS_CHARACTERSET导致ORA-00900异常恢复
ORA-600[2037]与ORA-07445[kcbs_dump_adv_state]错误
误drop tablespace后使用flashback database闪回异常处理
数据库恢复历史再次刷新到Oracle 7.3.2版本—redo异常恢复
ORA-27086: skgfglk: unable to lock file – already in use
ORACLE 12C ORA-07445[ktuHistRecUsegCrtMain()+1173]恢复
ORA-00600[kcratr1_lostwrt]/ORA-00600[3020]错误恢复
表空间online出现ORA-00600[kcbz_check_objd_typ]处理过程
_allow_resetlogs_corruption和adjust_scn解决ORA-01190
spfile被覆盖导致ORA-600[kmgs_parameter_update_timeout_1]
重建控制文件丢失undo异常恢复—ORA-600 25025模拟与恢复
使用_allow_resetlogs_corruption打开无归档日志rman备份库
使用_allow_resetlogs_corruption导致ORA-00704/ORA-01555故障
记录一次ORA-600 kccpb_sanity_check_2和ORA-600 kcbgtcr_13 错误恢复
ORA-00600[17182],ORA-00600[25027],ORA-00600[kghfrempty:ds]故障处理
因RAC的undo_management参数不一致导致数据库mount报ORA-01105 ORA-01606
使用bbed解决ORA-01178 file N created before last CREATE CONTROLFILE, cannot recreate
通过多次resetlogs规避类似ORA-01248: file N was created in the future of incomplete recovery错误
bootstrap$核心index(I_OBJ1,I_USER1,I_FILE#_BLOCK#,I_IND1,I_TS#,I_CDEF1等)异常恢复—ORA-00701错误解决
记录一次ORA-00600[kdxlin:psno out of range]/ORA-00600[3020]/ORA-00600[4000]/ORA-00600[4193]的数据库恢复

当你的数据库因为异常断电,强制关机,硬盘故障,drop表,truncate表,delete表,dmp文件异常,asm无法正常mount等故障无法解决导致数据丢失,且无法自行解决,请联系我们,提供专业ORACLE数据库恢复技术支持
Phone:17813235971    Q Q:107644445    E-Mail:dba@xifenfei.com