又一例存储cache丢失oracle数据库恢复

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:又一例存储cache丢失oracle数据库恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

10.2.0.5 hp unix rac,由于存储掉电导致cache丢失,数据库无法正常启动,客户要求我们介入处理
数据库mount报ORA-00600 kccpb_sanity_check_2错误

Thu Jul 22 14:52:06 EAT 2021
alter database mount
Thu Jul 22 14:52:10 EAT 2021
Errors in file /oracle/admin/xff/udump/xff1_ora_4611.trc:
ORA-00600: internal error code, arguments: [kccpb_sanity_check_2], [4697564], [4697561], [0x000000000], [], [], [], []

该错误是由于控制文件损坏,尝试重建控制文件报ORA-01163,ORA-01517

'/dev/oradata/rxff_ls94'
CHARACTER SET ZHS16GBK
WARNING: Default Temporary Tablespace not specified in CREATE DATABASE command
Default Temporary Tablespace will be necessary for a locally managed database in future release
Thu Jul 22 14:54:02 EAT 2021
Errors in file /oracle/admin/xff/udump/xff1_ora_7283.trc:
ORA-01163: SIZE clause indicates 262144 (blocks), but should match header 204800
ORA-01517: log member: '/dev/oradata/rxff_redo1_1'
ORA-1503 signalled during: CREATE CONTROLFILE REUSE DATABASE "xff" NORESETLOGS  NOARCHIVELOG

由于redo大小错误导致该问题,设置正确的redo大小继续重建

'/dev/oradata/rxff_ls94'
CHARACTER SET ZHS16GBK
WARNING: Default Temporary Tablespace not specified in CREATE DATABASE command
Default Temporary Tablespace will be necessary for a locally managed database in future release
Thu Jul 22 15:01:00 EAT 2021
Errors in file /oracle/admin/xff/udump/xff1_ora_14737.trc:
ORA-00600: internal error code, arguments: [kccsga_update_ckpt_4], [32], [8], [], [], [], [], []
Thu Jul 22 15:01:01 EAT 2021
Errors in file /oracle/admin/xff/udump/xff1_ora_14737.trc:
ORA-00600: internal error code, arguments: [kccsga_update_ckpt_4], [32], [8], [], [], [], [], []
ORA-1503 signalled during: CREATE CONTROLFILE REUSE DATABASE "xff" NORESETLOGS  NOARCHIVELOG

报ORA-00600 kccsga_update_ckpt_4错误,导致控制文件失败,处理该错误之后,重建控制文件成功,分析文件头信息和redo信息,确认只能强制库,尝试强制open库

Thu Jul 22 16:02:05 EAT 2021
SMON: enabling cache recovery
Thu Jul 22 16:02:05 EAT 2021
ORA-01555 caused by SQL statement below (SQL ID: 4krwuz0ctqxdt, SCN: 0x0002.cdad19ed):
Thu Jul 22 16:02:05 EAT 2021
select ctime, mtime, stime from obj$ where obj# = :1
Thu Jul 22 16:02:05 EAT 2021
Errors in file /oracle/admin/xff/udump/xff1_ora_23219.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 1
ORA-01555: snapshot too old: rollback segment number 19 with name "_SYSSMU19$" too small
Error 704 happened during db open, shutting down database
USER: terminating instance due to error 704
Instance terminated by USER, pid = 23219
ORA-1092 signalled during: alter database open resetlogs...

这个问题比较常见:ORA-00704 ORA-00604 ORA-01555,参考类似文章:
在数据库open过程中常遇到ORA-01555汇总
数据库open过程遭遇ORA-1555对应sql语句补充
数据库open成功但是报ORA-00600 4137

Database Characterset is ZHS16GBK
Opening with internal Resource Manager plan 
Thu Jul 22 16:08:48 EAT 2021
Errors in file /oracle/admin/xff/bdump/xff1_smon_27436.trc:
ORA-00600: internal error code, arguments: [4137], [], [], [], [], [], [], []
replication_dependency_tracking turned off (no async multimaster replication found)
Starting background process QMNC
QMNC started with pid=30, OS id=997
Thu Jul 22 16:08:49 EAT 2021
LOGSTDBY: Validating controlfile with logical metadata
Thu Jul 22 16:08:49 EAT 2021
ORACLE Instance xff1 (pid = 11) - Error 600 encountered while recovering transaction (1, 43).
Thu Jul 22 16:08:49 EAT 2021
Errors in file /oracle/admin/xff/bdump/xff1_smon_27436.trc:
ORA-00600: internal error code, arguments: [4137], [], [], [], [], [], [], []
Thu Jul 22 16:08:49 EAT 2021
Trace dumping is performing id=[cdmp_20210722160849]
Thu Jul 22 16:08:49 EAT 2021
LOGSTDBY: Validation complete
Completed: alter database open

该问题是由于undo异常,对undo进行处理,数据库无明显报错,安排导出数据

ORA-01092: ORACLE 例程终止 故障恢复

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:ORA-01092: ORACLE 例程终止 故障恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

数据库启动报ORA-01092: ORACLE 例程终止。强行断开连接 错误

SQL> RECOVER DATABASE;
完成介质恢复。
SQL> ALTER DATABASE OPEN;
ALTER DATABASE OPEN
*
ERROR 位于第 1 行:
ORA-01092: ORACLE 例程终止。强行断开连接

查看alert日志

Wed Jul 21 12:32:04 2021
SMON: enabling cache recovery
Wed Jul 21 12:32:04 2021
Errors in file c:\oracle\admin\dcpdm\udump\dcpdm_ora_3004.trc:
ORA-00600: ?????????: [4194], [34], [8], [], [], [], [], []

Wed Jul 21 12:32:05 2021
Recovery of Online Redo Log: Thread 1 Group 2 Seq 495 Reading mem 0
  Mem# 0 errs 0: C:\ORACLE\ORADATA\DCPDM\REDO02.LOG
Recovery of Online Redo Log: Thread 1 Group 2 Seq 495 Reading mem 0
  Mem# 0 errs 0: C:\ORACLE\ORADATA\DCPDM\REDO02.LOG
Wed Jul 21 12:32:05 2021
Errors in file c:\oracle\admin\dcpdm\udump\dcpdm_ora_3004.trc:
ORA-00604: ?? SQL ? 1 ????
ORA-00607: ?????????????
ORA-00600: ?????????: [4194], [34], [8], [], [], [], [], []

Error 604 happened during db open, shutting down database
USER: terminating instance due to error 604
Wed Jul 21 12:32:05 2021
Errors in file c:\oracle\admin\dcpdm\bdump\dcpdm_pmon_13020.trc:
ORA-00604: error occurred at recursive SQL level 

Instance terminated by USER, pid = 3004
ORA-1092 signalled during: ALTER DATABASE OPEN...

trace文件信息

*** 2021-07-21 12:32:04.000
ksedmp: internal or fatal error
ORA-00600: ?????????: [4194], [34], [8], [], [], [], [], []
Current SQL statement for this session:
update undo$ set name=:2,file#=:3,block#=:4,status$=:5,user#=:6,undosqn=:7,xactsqn=:8,
scnbas=:9,scnwrp=:10,inst#=:11,ts#=:12,spare1=:13 where us#=:1
----- Call Stack Trace -----
calling              call     entry                argument values in hex      
location             type     point                (? means dubious value)     
-------------------- -------- -------------------- ----------------------------
_ksedmp+147          CALLrel  _ksedst+0            
_ksfdmp.108+e        CALLrel  _ksedmp+0            3
_kgeriv+89           CALLreg  00000000             4E59D98 3
_kseipre.107+3f      CALLrel  _kgeriv+0            
_ksesic2+24          CALLrel  _kseipre.107+0       
__VInfreq__kturdb+8  CALLrel  _ksesic2+0           1062 0 22 0 8
b                                                  
_kcoapl+1df          CALLreg  00000000             2BB0F94 2BB100A 11 6C37C014
_kcbapl+71           CALLrel  _kcoapl+0            2BB0F90 6C37C000 1 0 2000
_kcrfwr+734          CALLrel  _kcbapl+0            2BB0F90 6C3FC788 50D4FA0
_kcbchg1+7ec         CALLrel  _kcrfwr+0            
_ktuchg+630          CALLrel  _kcbchg1+0           0 4 50D5228 50D5240 0 0
_ktbchg2+75          CALLrel  _ktuchg+0            2 66F589A4 1 2C8CD14 2C8CD1C
                                                   2BB0F90 2C8C32C 2BB0ED0 0 0
_kddchg+18f          CALLrel  _ktbchg2+0           0 66F589A4 2C8CD14 2C8CD1C
                                                   2BB0F90 2C8C324 2BB0ED0 0 0
_kduovw.53+6e3       CALLrel  _kddchg+0            2C8C2E8 2C8CD14 2C8CD1C
                                                   2BB0F90 2BB0ED0 0 0
_kduurp.53+61a       CALLrel  _kduovw.53+0         2C8C2E8
_kdusru+aa5          CALLrel  _kduurp.53+0         2C8C2E8 66F589FC
_kauupd+12e          CALLrel  _kdusru+0            2C8C71C 66F589FC 2C8C2E8 0
_updrow+729          CALLrel  _kauupd+0            2C8C718 66F589FC 2C8C2E8 0
                                                   66F58448 E F 66F60EE0 12
                                                   50DBBA4 50DBBA8
_qerupFetch+107      CALLrel  _updrow+0            
_updaul+202          CALL???  00000000             66F58660 0 66F6BC3C 7FFF
_updThreePhaseExe+b  CALLrel  _updaul+0            66F6B9D0 50DBD34 0
6                                                  
_updexe+105          CALLrel  _updThreePhaseExe+0  66F6B9D0 0 2C8C2E8 50DBE10
                                                   66F6B9D0 1 50DBE10 0
_opiexe+f97          CALLrel  _updexe+0            66F6B9D0 50DBF4C
_opiodr+4cd          CALLreg  00000000             4 3 50DC898
_rpidrus.43+99       CALLrel  _opiodr+0            4 3 50DC898 A
_skgmstack+71        CALLreg  00000000             50DC488
_rpidru+6d           CALLrel  _skgmstack+0         50DC4A0 4E59C20 F618 778198
                                                   50DC488
_rpiswu2+17e         CALLreg  00000000             50DC7C0
_rpidrv+109          CALLrel  _rpiswu2+0           
_rpiexe+33           CALLrel  _rpidrv+0            A 4 50DC898 8
_ktuscu+2a8          CALLrel  _rpiexe+0            A
_kqrcmt+2c2          CALL???  00000000             66F6D654 3
..1.18_2.filter.95+  CALLrel  _kqrcmt+0            67B88CD4 1 0 4E59D98 4E59D98
159                                                FF 0 0 0
..1.23_5.filter.99+  CALLrel  _ktcrcm+0            67B88CD4 0 0 0 0 1 0 0
14d                                                
_ktuini+64           CALLrel  _ktuiup.99+0         50DD994
_adbdrv+2665         CALLrel  _ktuini+0            50DD994
..1.5_1.filter.29+2  CALLrel  _adbdrv+0            
9d                                                 
_opiosq0+9a4         CALLrel  _opiexe+0            4 0 50DDDDC
_kpooprx+c6          CALLrel  _opiosq0+0           3 E 50DDE74 24
_kpoal8+225          CALLrel  _kpooprx+0           50DE73C 50DE684 13 1 0 24
_opiodr+4cd          CALLreg  00000000             5E 14 50DE738
_ttcpip+a86          CALLreg  00000000             5E 14 50DE738 0
_opitsk+2f4          CALLrel  _ttcpip+0            
_opiino+5fc          CALLrel  _opitsk+0            0 0 4E5FEE8 2BDF044 F3 0
_opiodr+4cd          CALLreg  00000000             3C 4 50DFBD8
_opidrv+233          CALLrel  _opiodr+0            3C 4 50DFBD8 0
_sou2o+19            CALLrel  _opidrv+0            
_opimai+10a          CALLrel  _sou2o+0             
_OracleThreadStart@  CALLrel  _opimai+0            
4+35c                                              
7C824826             CALLreg  00000000             
 
--------------------- Binary Stack Dump ---------------------

比较明显时候由于在更新undo$的时候需要找前镜像信息

Block image after block recovery:
buffer tsn: 0 rdba: 0x0040018b (1/395)
scn: 0x0000.07d52871 seq: 0x01 flg: 0x04 tail: 0x28710201
frmt: 0x02 chkval: 0xc85e type: 0x02=KTU UNDO BLOCK
 
********************************************************************************
UNDO BLK:  
xid: 0x0000.05a.0000002d  seq: 0x33  cnt: 0x22  irb: 0x22  icl: 0x0   flg: 0x0000
 
 Rec Offset      Rec Offset      Rec Offset      Rec Offset      Rec Offset
---------------------------------------------------------------------------
0x01 0x1f04     0x02 0x1e20     0x03 0x1d3c     0x04 0x1c58     0x05 0x1b74     
0x06 0x1a90     0x07 0x19ac     0x08 0x18c8     0x09 0x17e4     0x0a 0x1700     
0x0b 0x161c     0x0c 0x1538     0x0d 0x1454     0x0e 0x1370     0x0f 0x128c     
0x10 0x11a8     0x11 0x10c4     0x12 0x0fe0     0x13 0x0efc     0x14 0x0e18     
0x15 0x0d34     0x16 0x0c50     0x17 0x0b6c     0x18 0x0a88     0x19 0x09a4     
0x1a 0x08c0     0x1b 0x07dc     0x1c 0x06f8     0x1d 0x0614     0x1e 0x0530     
0x1f 0x044c     0x20 0x0368     0x21 0x0284     0x22 0x01a0     
 
*-----------------------------
* Rec #0x1  slt: 0x0b  objn: 15(0x0000000f)  objd: 15  tblspc: 0(0x00000000)
*       Layer:  11 (Row)   opc: 1   rci 0x00   
Undo type:  Regular undo    Begin trans    Last buffer split:  No 
Temp Object:  No 
Tablespace Undo:  No 
rdba: 0x00000000
*-----------------------------
uba: 0x0040018a.0033.22 ctl max scn: 0x0000.07853941 prv tx scn: 0x0000.07853943
KDO undo record:
KTB Redo 
op: 0x04  ver: 0x01  
op: L  itl: xid:  0x0000.042.0000002d uba: 0x0040018a.0033.22
                      flg: C---    lkc:  0     scn: 0x0000.07d23460
KDO Op code: URP row dependencies Disabled
  xtype: XA  bdba: 0x0040006a  hdba: 0x00400069
itli: 1  ispac: 0  maxfr: 4863
tabn: 0 slot: 7(0x7) flag: 0x2c lock: 0 ckix: 0
ncol: 17 nnew: 12 size: 0
col  1: [ 9]  5f 53 59 53 53 4d 55 37 24
col  2: [ 2]  c1 02
col  3: [ 2]  c1 03
col  4: [ 3]  c2 02 06
col  5: [ 6]  c5 02 20 14 40 24
col  6: [ 1]  80
col  7: [ 4]  c3 0e 21 2d
col  8: [ 3]  c2 1b 34
col  9: [ 1]  80
col 10: [ 2]  c1 03
col 11: [ 2]  c1 02
col 16: [ 2]  c1 02

这部分信息异常,导致数据库update undo$的时候报ORA-00600: ?????????: [4194], [34], [8], [], [], [], [], []错误,通过修改对应的block信息,数据库正常open成功

SQL> alter database open;

数据库已更改。

但是关闭数据库又报ORA-600 4194错误

SQL> shutdown immediate;
ORA-00607: 当更改数据块时出现内部错误
ORA-00600: 内部错误代码,参数: [4194], [94], [61], [], [], [], [], []

alert日志信息

Wed Jul 21 12:58:42 2021
Shutting down instance: further logons disabled
Shutting down instance (immediate)
License high water mark = 3
Waiting for dispatcher 'D000' to shutdown
All dispatchers and shared servers shutdown
Wed Jul 21 12:58:45 2021
ALTER DATABASE CLOSE NORMAL
Wed Jul 21 12:58:45 2021
Errors in file c:\oracle\admin\dcpdm\udump\dcpdm_ora_13628.trc:
ORA-00600: 内部错误代码,参数: [4194], [94], [61], [], [], [], [], []

Recovery of Online Redo Log: Thread 1 Group 3 Seq 496 Reading mem 0
  Mem# 0 errs 0: C:\ORACLE\ORADATA\DCPDM\REDO03.LOG
Recovery of Online Redo Log: Thread 1 Group 3 Seq 496 Reading mem 0
  Mem# 0 errs 0: C:\ORACLE\ORADATA\DCPDM\REDO03.LOG
ORA-607 signalled during: ALTER DATABASE CLOSE NORMAL...

通过重建undo,数据库启动关闭正常,也没有再报其他错误,建议逻辑方式重建库
参考以前的类似文章:
数据库报ORA-00607/ORA-00600[4194]错误
使用bbed解决ORA-00607/ORA-00600[4194]故障
使用bbed解决ORA-00607/ORA-00600[4194]故障

ora-600 kfdpMetaBlk_pickle 故障处理

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:ora-600 kfdpMetaBlk_pickle 故障处理

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

客户反馈集群的crs无法正常启动观察发现是由于gmon进程crash asm实例导致,经过测试确认是在mount data磁盘组的时候会触发给问题

SQL> alter diskgroup data mount;
alter diskgroup data mount
*
ERROR at line 1:
ORA-03113: end-of-file on communication channel
Process ID: 7517
Session ID: 918 Serial number: 5

对应的alert日志报ORA-600 [kfdpMetaBlk_pickle:01], [4294967295]错误

SQL> alter diskgroup data mount
NOTE: cache registered group DATA number=2 incarn=0x3078f05f
NOTE: cache began mount (first) of group DATA number=2 incarn=0x3078f05f
NOTE: Assigning number (2,1) to disk (/dev/rdisk/disk93)
NOTE: Assigning number (2,3) to disk (/dev/rdisk/disk96)
NOTE: Assigning number (2,2) to disk (/dev/rdisk/disk94)
NOTE: Assigning number (2,0) to disk (/dev/rdisk/disk92)
Sat Jul 17 05:21:01 2021
Errors in file /u01/app/crs_base/diag/asm/+asm/+ASM2/trace/+ASM2_gmon_7457.trc  (incident=255833):
ORA-00600: internal error code, arguments: [kfdpMetaBlk_pickle:01], [4294967295], [0], [], [], [], [], [], [], [], [], []
Incident details in: /u01/app/crs_base/diag/asm/+asm/+ASM2/incident/incdir_255833/+ASM2_gmon_7457_i255833.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file /u01/app/crs_base/diag/asm/+asm/+ASM2/trace/+ASM2_gmon_7457.trc:
ORA-00600: internal error code, arguments: [kfdpMetaBlk_pickle:01], [4294967295], [0], [], [], [], [], [], [], [], [], []
GMON (ospid: 7457): terminating the instance due to error 493
Sat Jul 17 05:21:03 2021
System state dump requested by (instance=2, osid=7457 (GMON)), summary=[abnormal instance termination].
System State dumped to trace file /u01/app/crs_base/diag/asm/+asm/+ASM2/trace/+ASM2_diag_7429.trc
Instance terminated by GMON, pid = 7457

对于ORA-600 [kfdpMetaBlk_pickle:01], [4294967295]错误,查询了mos没有任何有效信息
kfdpMetaBlk_pickle


对应的trace文件发现如下信息

2021-07-17 03:51:16.277603*:800002A2:KGF:kgfdputl.c@1411:kgfdpMetaSet_getMaxClique():   inc=2 ver=4294967295
2021-07-17 03:51:16.277619 :800002A3:KFDP:kfdp.c@9314:kfdpMetaSet_filterOld(): filtered old meta on disk 2
2021-07-17 03:51:16.277620 :800002A4:KFDP:kfdp.c@9314:kfdpMetaSet_filterOld(): filtered old meta on disk 2
2021-07-17 03:51:16.277992 :800002A5:KFDP:kfdp.c@9417:kfdpMetaSet_readDta():kfdpMetaSet_readDta unpickle upto 6 metablks
2021-07-17 03:51:16.277993 :800002A6:KFDP:kfdp.c@9425:kfdpMetaSet_readDta():kfdpMetaSet_readDta unpickle metablk for disk 3
2021-07-17 03:51:16.278154 :800002A7:KFDP:kfdp.c@9425:kfdpMetaSet_readDta():kfdpMetaSet_readDta unpickle metablk for disk 1
2021-07-17 03:51:16.278268 :800002A8:KFDP:kfdp.c@5851:kfdp_read(): kfdp_read end ok=1
2021-07-17 03:51:16.278277 :800002A9:KFDP:kfdp.c@7073:kfdp_doQuery(): kfdp_doQuery   rewrite_kfdp=1
2021-07-17 03:51:16.278282 :800002AA:KFDP:kfdp.c@12511:kfdpLckValue_pickle(): kfdpLckValue_pickle size=0 
                            endian=0xff ndisks=0 lckvalid=0
2021-07-17 03:51:16.278293 :800002AB:db_trace:kfdp.c@12803:kfdpLck_convPriv(): [10499:19:396] 
                            kfdpLck_conv: grp=1, type=0, mode=5, line=7155
2021-07-17 03:51:16.278294 :800002AC:KFDP:kfdp.c@12663:kfdpLckValue_unpickle(): kfdpLckValue_unpickle
                            size=28 res=0 ok=0 ver=-1 dcnt=0 lckvalid=0 flags=0x2 inst=0 (I am 2) version=0
2021-07-17 03:51:16.278499*:800002AD:KGF:kgfdputl.c@485:kgfdpDta_getAllDsks(): kgfdpDta_getAllDsks using 
                            saved iterator 0x9ffffffffd571220 with 4 disks
2021-07-17 03:51:16.278688 :800002AE:KFDP:kfdp.c@5566:kfdp_write(): kfdp_write: pstDskCnt=3 grow=0 degenerate=0
2021-07-17 03:51:16.278688*:800002AF:KGF:kgfdputl.c@2619:kgfdpTraceSet(): writing pst to disks (n=3): 0 1 3

通过删除信息,基本上可以确认由于pst信息异常(pst中记录的只有0 1 3三个磁盘,认为2是老磁盘),但是实际磁盘为4个,导致gmon进程异常.通过底层解决该问题,数据库恢复成功

SQL> recover database using backup controlfile;
ORA-00279: change 30075814973 generated at 07/17/2021 01:12:08 needed for
thread 2
ORA-00289: suggestion : +FRA
ORA-00280: change 30075814973 for thread 2 is in sequence #120561


Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
/tmp/asm/group_16
ORA-00279: change 30075814973 generated at 07/17/2021 01:11:54 needed for
thread 1
ORA-00289: suggestion :
+FRA/xff/archivelog/2021_07_17/thread_1_seq_79949.1543.1078103529
ORA-00280: change 30075814973 for thread 1 is in sequence #79949


Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
/tmp/asm/group_13
ORA-00279: change 30075815013 generated at 07/17/2021 01:12:09 needed for
thread 1
ORA-00289: suggestion : +FRA
ORA-00280: change 30075815013 for thread 1 is in sequence #79950
ORA-00278: log file '/tmp/asm/group_13' no longer needed for this recovery


Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
/tmp/asm/group_11
Log applied.
Media recovery complete.

SQL> alter database open resetlogs;

Database altered.

运气不错,对于该故障的恢复,实现数据0丢失.

删除分区 oracle asm disk 恢复

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:删除分区 oracle asm disk 恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

接到一个朋友数据库故障请求case.大概操作是这样的:有一个39T的lun,通过parted分了15个分区,给oracle asm使用创建磁盘组data4,然后分了4个分区做成data5(由于ausize写错误了),删除掉磁盘组和这四个分区.然后重新分配了6个分区,并且使用最后5个分区创建了data5磁盘组.使用了一段时间之后,由于oracle空间不足,检查的时候误以为这个lun就前面15个分区使用,人工把后面的6个分区给删除了,并且创建了4个新分区,然后发现数据库crash了,发现误删除了在使用的分区.然后又把新创建的4个分区给删除了.接手该故障的时候,这个39T lun的分区信息如下

[root@node1 linux64]# parted /dev/mapper/36000d31003d39e000000000000000004
GNU Parted 2.1
Using /dev/mapper/36000d31003d39e000000000000000004
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) print                                                            
Model: Linux device-mapper (multipath) (dm)
Disk /dev/mapper/36000d31003d39e000000000000000004: 39.6TB
Sector size (logical/physical): 512B/4096B
Partition Table: gpt

Number  Start   End     Size    File system  Name         Flags
 1      2097kB  2000GB  2000GB               asmdata4-1
 2      2000GB  4000GB  2000GB               asmdata4-2
 3      4000GB  6000GB  2000GB               asmdata4-3
 4      6000GB  8000GB  2000GB               asmdata4-4
 5      8000GB  10.0TB  2000GB               asmdata4-5
 6      10.0TB  12.0TB  2000GB               asmdata4-6
 7      12.0TB  14.0TB  2000GB               asmdata4-7
 8      14.0TB  16.0TB  2000GB               asmdata4-8
 9      16.0TB  18.0TB  2000GB               asmdata4-9
10      18.0TB  20.0TB  2000GB               asmdata4-10
11      20.0TB  22.0TB  2000GB               asmdata4-11
12      22.0TB  24.0TB  2000GB               asmdata4-12
13      24.0TB  26.0TB  2000GB               asmdata4-13
14      26.0TB  28.0TB  2000GB               asmdata4-14
15      28.0TB  30.0TB  2000GB               asmdata4-15

客户正常使用情况下,这个lun上面相关分区的asm disk信息

 SQL> CREATE DISKGROUP DATA4 EXTERNAL REDUNDANCY  DISK 
'/dev/mapper/36000d31003d39e000000000000000004p1' SIZE 1907346M ,
 '/dev/mapper/36000d31003d39e000000000000000004p2' SIZE 1907350M ,
 '/dev/mapper/36000d31003d39e000000000000000004p3' SIZE 1907348M ,
 '/dev/mapper/36000d31003d39e000000000000000004p4' SIZE 1907348M ,
 '/dev/mapper/36000d31003d39e000000000000000004p5' SIZE 1907350M  
ATTRIBUTE 'compatible.asm'='11.2.0.0.0','au_size'='8M' /* ASMCA */ 

SQL> ALTER DISKGROUP DATA4 ADD  DISK 
'/dev/mapper/36000d31003d39e000000000000000004p10' SIZE 1907348M ,
'/dev/mapper/36000d31003d39e000000000000000004p6' SIZE 1907348M ,
'/dev/mapper/36000d31003d39e000000000000000004p7' SIZE 1907348M ,
'/dev/mapper/36000d31003d39e000000000000000004p8' SIZE 1907350M ,
'/dev/mapper/36000d31003d39e000000000000000004p9' SIZE 1907348M /* ASMCA */ 

SQL> ALTER DISKGROUP DATA4 ADD  DISK 
'/dev/mapper/36000d31003d39e000000000000000004p11' SIZE 1907348M ,
'/dev/mapper/36000d31003d39e000000000000000004p12' SIZE 1907350M ,
'/dev/mapper/36000d31003d39e000000000000000004p13' SIZE 1907348M ,
'/dev/mapper/36000d31003d39e000000000000000004p14' SIZE 1907348M ,
'/dev/mapper/36000d31003d39e000000000000000004p15' SIZE 1907350M /* ASMCA */ 

SQL> CREATE DISKGROUP DATA5 EXTERNAL REDUNDANCY  DISK 
'/dev/mapper/36000d31003d39e000000000000000004p17' SIZE 1716614M ,
'/dev/mapper/36000d31003d39e000000000000000004p18' SIZE 1716614M ,
'/dev/mapper/36000d31003d39e000000000000000004p19' SIZE 1716614M ,
'/dev/mapper/36000d31003d39e000000000000000004p20' SIZE 1716614M ,
'/dev/mapper/36000d31003d39e000000000000000004p21' SIZE 1621246M  
ATTRIBUTE 'compatible.asm'='11.2.0.0.0','au_size'='4M' /* ASMCA */ 

基于客户现在的情况,data4中的所有分区都正常,主要是要找出来data5中的5个分区的数据.因为客户不确定p16分区大小,导致后续的5个分区起始位置不好定位.从而使得恢复无法进行.通过shell脚本结合kfed尝试定位asm disk header信息

#!/bin/bash
j=xxxxxxxxxxx
for ((i=xxxxxx; i<=j; i++))
do
 echo "-----$i--------" >> /home/get_au1.txt
 kfed read /dev/mapper/36000d31003d39e000000000000000004 aun=$i |
 > grep  "kfdhdb.dskname:              DATA" >> /home/get_au.txt
done

结果发现无法获取到结果,通过分析发现这里由于lun过大,导致aun值过大,从而使得kfed溢出无法读取到正常值.根据parted的特性,人工dd部分block进行分析

[root@node1 bak]# kfed read xifenfei.dd aun=134|more
kfbh.endian:                          1 ; 0x000: 0x01
kfbh.hard:                          130 ; 0x001: 0x82
kfbh.type:                            1 ; 0x002: KFBTYP_DISKHEAD
kfbh.datfmt:                          1 ; 0x003: 0x01
kfbh.block.blk:                       0 ; 0x004: blk=0
kfbh.block.obj:              2147483648 ; 0x008: disk=0
kfbh.check:                  3357988283 ; 0x00c: 0xc826d5bb
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
kfdhdb.driver.provstr:         ORCLDISK ; 0x000: length=8
kfdhdb.driver.reserved[0]:            0 ; 0x008: 0x00000000
kfdhdb.driver.reserved[1]:            0 ; 0x00c: 0x00000000
kfdhdb.driver.reserved[2]:            0 ; 0x010: 0x00000000
kfdhdb.driver.reserved[3]:            0 ; 0x014: 0x00000000
kfdhdb.driver.reserved[4]:            0 ; 0x018: 0x00000000
kfdhdb.driver.reserved[5]:            0 ; 0x01c: 0x00000000
kfdhdb.compat:                186646528 ; 0x020: 0x0b200000
kfdhdb.dsknum:                        0 ; 0x024: 0x0000
kfdhdb.grptyp:                        1 ; 0x026: KFDGTP_EXTERNAL
kfdhdb.hdrsts:                        3 ; 0x027: KFDHDR_MEMBER
kfdhdb.dskname:              DATA5_0000 ; 0x028: length=10
kfdhdb.grpname:                   DATA5 ; 0x048: length=5
kfdhdb.fgname:               DATA5_0000 ; 0x068: length=10
kfdhdb.capname:                         ; 0x088: length=0
kfdhdb.crestmp.hi:             33116450 ; 0x0a8: HOUR=0x2 DAYS=0x9 MNTH=0x4 YEAR=0x7e5
kfdhdb.crestmp.lo:            477378560 ; 0x0ac: USEC=0x0 MSEC=0x10e SECS=0x7 MINS=0x7
kfdhdb.mntstmp.hi:             33116450 ; 0x0b0: HOUR=0x2 DAYS=0x9 MNTH=0x4 YEAR=0x7e5
kfdhdb.mntstmp.lo:            486256640 ; 0x0b4: USEC=0x0 MSEC=0x2ec SECS=0xf MINS=0x7
kfdhdb.secsize:                     512 ; 0x0b8: 0x0200
kfdhdb.blksize:                    4096 ; 0x0ba: 0x1000
kfdhdb.ausize:                  4194304 ; 0x0bc: 0x00400000
kfdhdb.mfact:                    454272 ; 0x0c0: 0x0006ee80
kfdhdb.dsksize:                  429153 ; 0x0c4: 0x00068c61
kfdhdb.pmcnt:                         2 ; 0x0c8: 0x00000002

顺利找到了data5中的第一块磁盘,而且确定了起始位置,然后构造相关的dd语句把分区的数据dd到一个新磁盘中

dd if=/dev/mapper/36000d31003d39e000000000000000004 bs=4M skip=xxxxx count=429153 of=/dev/sdu

然后通过kfed查看数据
20210704160347


通过类似方法依次处理,最终把5块asm disk全部找到,并且顺利dd到新的磁盘中.尝试启动crs,并mount data5
20210704153634

20210704153548

data5 磁盘组mount成功之后,数据库顺利启动,实现lun中删除分区之后,asm磁盘组数据完美恢复
20210704153728

这次运气还不错,仅仅是对lun的分区使用了parted进行了删除和创建等操作,没有格式化文件系统和做成新的asm disk,不然数据会有一部分丢失.对于有部分破坏的分区,需要通过底层碎片的方法进行最大限度抢救数据.参考类似文档:
asm disk被加入vg恢复
又一例asm格式化文件系统恢复
文件系统损坏导致数据文件异常恢复
一次完美的asm disk被格式化ntfs恢复
Oracle 数据文件大小为0kb或者文件丢失恢复
再一起asm disk被格式化成ext3文件系统故障恢复
oracle asm disk格式化恢复—格式化为ext4文件系统
oracle asm disk格式化恢复—格式化为ntfs文件系统
分享oracleasm createdisk重新创建asm disk后数据0丢失恢复案例

drop tablesapce 数据恢复

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:drop tablesapce 数据恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

有客户执行了drop tablespace xxx including contents and datafiles,虽然删除命令返回错误,但是该表空间中大量对象被删除,文件没有被从系统层面删除

Tue Jun 29 14:38:26 2021
DROP TABLESPACE "XFF" INCLUDING CONTENTS AND DATAFILES CASCADE CONSTRAINTS
Tue Jun 29 14:38:32 2021
Thread 1 advanced to log sequence 975 (LGWR switch)
  Current log# 3 seq# 975 mem# 0: D:\APP\ADMINISTRATOR\ORADATA\XFF\REDO03.LOG
ORA-604 signalled during: DROP TABLESPACE "XFF" INCLUDING CONTENTS AND DATAFILES CASCADE CONSTRAINTS...
Tue Jun 29 14:40:44 2021
ALTER DATABASE DATAFILE 'D:\APP\ADMINISTRATOR\ORADATA\XFF\XFF'  AUTOEXTEND ON NEXT 50M
Completed: ALTER DATABASE DATAFILE 'D:\APP\ADMINISTRATOR\ORADATA\XFF\XFF'  AUTOEXTEND ON NEXT 50M
ALTER TABLESPACE "XFF" DROP DATAFILE  'D:\APP\ADMINISTRATOR\ORADATA\XFF\XFF02.DBF'
WARNING: Cannot delete file D:\APP\ADMINISTRATOR\ORADATA\XFF\XFF02.DBF
Errors in file d:\app\administrator\diag\rdbms\XFF\XFF\trace\XFF_ora_2528.trc:
ORA-01265: 无法删除 DATA D:\APP\ADMINISTRATOR\ORADATA\XFF\XFF02.DBF
ORA-27056: 无法删除文件
OSD-04024: 无法删除文件。
O/S-Error: (OS 32) 另一个程序正在使用此文件,进程无法访问。
Completed: ALTER TABLESPACE "XFF" DROP DATAFILE  'D:\APP\ADMINISTRATOR\ORADATA\XFF\XFF02.DBF'
Tue Jun 29 14:58:51 2021
ALTER DATABASE DATAFILE 'D:\APP\ADMINISTRATOR\ORADATA\XFF\XFF'  AUTOEXTEND ON NEXT 50M MAXSIZE UNLIMITED
Completed: ALTER DATABASE DATAFILE 'D:\APP\ADMINISTRATOR\ORADATA\XFF\XFF'  AUTOEXTEND ON NEXT 50M MAXSIZE UNLIMITED

通过工具获取obj$字典信息,结合user$,判断业务用户表数据基本上全部删除
20210701193502


对于这类情况,常规方法无法恢复,只能按照表被drop的思路进行恢复.参考文章:dul恢复drop表测试,通过结合客户提供的表结构和一些特殊方法恢复出来所有对象的obj#,dataobj#信息,快速的恢复了客户数据,得到客户认可
20210701194153

ORA-15335: ASM metadata corruption detected in disk group ‘DATA’

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:ORA-15335: ASM metadata corruption detected in disk group ‘DATA’

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

asm磁盘组增加磁盘进行扩容之后报ORA-15335: ASM metadata corruption detected in disk group ‘DATA’和ORA-15196: invalid ASM block header [kfc.c:26368] [check_kfbh] [2147483648] [7] [2183628676 != 686982479],磁盘组dismount,然后mount之后立马dismount掉.

Tue Jun 29 09:19:09 2021
SQL> ALTER DISKGROUP DATA ADD  DISK '/dev/raw/raw5' SIZE 102400M /* ASMCA */ 
NOTE: GroupBlock outside rolling migration privileged region
NOTE: Assigning number (2,1) to disk (/dev/raw/raw5)
NOTE: requesting all-instance membership refresh for group=2
NOTE: initializing header on grp 2 disk DATA_0001
NOTE: requesting all-instance disk validation for group=2
Tue Jun 29 09:19:11 2021
NOTE: skipping rediscovery for group 2/0xb0c845ce (DATA) on local instance.
NOTE: requesting all-instance disk validation for group=2
NOTE: skipping rediscovery for group 2/0xb0c845ce (DATA) on local instance.
NOTE: initiating PST update: grp = 2
Tue Jun 29 09:19:16 2021
GMON updating group 2 at 7 for pid 27, osid 25020
NOTE: PST update grp = 2 completed successfully 
NOTE: membership refresh pending for group 2/0xb0c845ce (DATA)
GMON querying group 2 at 8 for pid 18, osid 3852
NOTE: cache opening disk 1 of grp 2: DATA_0001 path:/dev/raw/raw5
NOTE: Attempting voting file refresh on diskgroup DATA
NOTE: Refresh completed on diskgroup DATA. No voting file found.
GMON querying group 2 at 9 for pid 18, osid 3852
SUCCESS: refreshed membership for 2/0xb0c845ce (DATA)
Tue Jun 29 09:19:20 2021
SUCCESS: ALTER DISKGROUP DATA ADD  DISK '/dev/raw/raw5' SIZE 102400M /* ASMCA */
NOTE: starting rebalance of group 2/0xb0c845ce (DATA) at power 1
Starting background process ARB0
Tue Jun 29 09:19:21 2021
ARB0 started with pid=33, OS id=25176 
NOTE: assigning ARB0 to group 2/0xb0c845ce (DATA) with 1 parallel I/O
cellip.ora not found.
Tue Jun 29 09:19:24 2021
NOTE: Attempting voting file refresh on diskgroup DATA
NOTE: Refresh completed on diskgroup DATA. No voting file found.
Tue Jun 29 09:19:46 2021
WARNING: cache read  a corrupt block: group=2(DATA) dsk=0 blk=7 disk=0 (DATA_0000) incarn=3915953476 au=0 blk=7 count=1
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_25176.trc:
ORA-15196: invalid ASM block header [kfc.c:26368] [check_kfbh] [2147483648] [7] [2183628676 != 686982479]
NOTE: a corrupted block from group DATA was dumped to /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_25176.trc
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_25176.trc:
ORA-15196: invalid ASM block header [kfc.c:26368] [check_kfbh] [2147483648] [7] [2183628676 != 686982479]
ORA-15196: invalid ASM block header [kfc.c:26368] [check_kfbh] [2147483648] [7] [2183628676 != 686982479]
ERROR: cache failed to read group=2(DATA) dsk=0 blk=7 from disk(s): 0(DATA_0000)
ORA-15196: invalid ASM block header [kfc.c:26368] [check_kfbh] [2147483648] [7] [2183628676 != 686982479]
ORA-15196: invalid ASM block header [kfc.c:26368] [check_kfbh] [2147483648] [7] [2183628676 != 686982479]
NOTE: cache initiating offline of disk 0 group DATA
NOTE: process _arb0_+asm1 (25176) initiating offline of disk 0.3915953476 (DATA_0000) with mask 0x7e in group 2
NOTE: initiating PST update: grp = 2, dsk = 0/0xe968b544, mask = 0x6a, op = clear
Tue Jun 29 09:19:46 2021
GMON updating disk modes for group 2 at 10 for pid 33, osid 25176
ERROR: Disk 0 cannot be offlined, since diskgroup has external redundancy.
ERROR: too many offline disks in PST (grp 2)
Tue Jun 29 09:19:46 2021
NOTE: cache dismounting (not clean) group 2/0xB0C845CE (DATA) 
NOTE: messaging CKPT to quiesce pins Unix process pid: 25395, image: oracle@frsrac1 (B000)
Tue Jun 29 09:19:46 2021
NOTE: halting all I/Os to diskgroup 2 (DATA)
Tue Jun 29 09:19:46 2021
NOTE: LGWR doing non-clean dismount of group 2 (DATA)
NOTE: LGWR sync ABA=11.10715 last written ABA 11.10715
WARNING: Offline for disk DATA_0000 in mode 0x7f failed.
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_25176.trc  (incident=54665):
ORA-15335: ASM metadata corruption detected in disk group 'DATA'
ORA-15130: diskgroup "DATA" is being dismounted
ORA-15066: offlining disk "DATA_0000" in group "DATA" may result in a data loss
ORA-15196: invalid ASM block header [kfc.c:26368] [check_kfbh] [2147483648] [7] [2183628676 != 686982479]
ORA-15196: invalid ASM block header [kfc.c:26368] [check_kfbh] [2147483648] [7] [2183628676 != 686982479]
Incident details in: /u01/app/grid/diag/asm/+asm/+ASM1/incident/incdir_54665/+ASM1_arb0_25176_i54665.trc
Tue Jun 29 09:19:46 2021
kjbdomdet send to inst 2
detach from dom 2, sending detach message to inst 2
Tue Jun 29 09:19:46 2021
List of instances:
 1 2
Dirty detach reconfiguration started (new ddet inc 1, cluster inc 24)
 Global Resource Directory partially frozen for dirty detach
* dirty detach - domain 2 invalid = TRUE 
 796 GCS resources traversed, 0 cancelled
Dirty Detach Reconfiguration complete
Tue Jun 29 09:19:46 2021
WARNING: dirty detached from domain 2
NOTE: cache dismounted group 2/0xB0C845CE (DATA) 
SQL> alter diskgroup DATA dismount force /* ASM SERVER:2965915086 */ 
Tue Jun 29 09:19:47 2021
ERROR: ORA-15130 thrown in ARB0 for group number 2
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_25176.trc:
ORA-15130: diskgroup "DATA" is being dismounted
ORA-15335: ASM metadata corruption detected in disk group 'DATA'
ORA-15130: diskgroup "DATA" is being dismounted
ORA-15066: offlining disk "DATA_0000" in group "DATA" may result in a data loss
ORA-15196: invalid ASM block header [kfc.c:26368] [check_kfbh] [2147483648] [7] [2183628676 != 686982479]
ORA-15196: invalid ASM block header [kfc.c:26368] [check_kfbh] [2147483648] [7] [2183628676 != 686982479]
Tue Jun 29 09:19:47 2021
NOTE: stopping process ARB0
Tue Jun 29 09:19:47 2021
Sweep [inc][54665]: completed
Tue Jun 29 09:19:47 2021
Sweep [inc2][54665]: completed
NOTE: cache deleting context for group DATA 2/0xb0c845ce
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_rbal_3852.trc:
ORA-15130: diskgroup "DATA" is being dismounted
GMON dismounting group 2 at 11 for pid 27, osid 25395
NOTE: Disk DATA_0000 in mode 0x7f marked for de-assignment
NOTE: Disk DATA_0001 in mode 0x7f marked for de-assignment
SUCCESS: diskgroup DATA was dismounted
SUCCESS: alter diskgroup DATA dismount force /* ASM SERVER:2965915086 */

通过kfed分析报错block,确认错误

kfbh.endian:                          1 ; 0x000: 0x01
kfbh.hard:                          130 ; 0x001: 0x82
kfbh.type:                            3 ; 0x002: KFBTYP_ALLOCTBL
kfbh.datfmt:                          2 ; 0x003: 0x02
kfbh.block.blk:                       7 ; 0x004: blk=7
kfbh.block.obj:              2147483648 ; 0x008: disk=0
kfbh.check:                  2183628676 ; 0x00c: 0x82278784 <<======该值错误,应该为:686982479
kfbh.fcn.base:                     3430 ; 0x010: 0x00000d66
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
kfdatb.aunum:                      2240 ; 0x000: 0x000008c0
kfdatb.shrink:                      448 ; 0x004: 0x01c0
kfdatb.ub2pad:                        0 ; 0x006: 0x0000

通过修复该错误,并且禁止reblance操作[增加磁盘数据需要重新分布],mount磁盘组,然后open库,发现redo已经被覆盖(非归档),强制打开库报错

SQL> alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-00603: ORACLE server session terminated by fatal error
ORA-00600: internal error code, arguments: [2662], [0], [2691201882], [0],
[2691227745], [12583040], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [2662], [0], [2691201881], [0],
[2691227745], [12583040], [], [], [], [], [], []
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [2662], [0], [2691201879], [0],
[2691227745], [12583040], [], [], [], [], [], []
Process ID: 25110
Session ID: 287 Serial number: 3

通过对scn进行处理,数据库顺利open

SQL> startup mount pfile='/tmp/pfile';
ORACLE instance started.

Total System Global Area 5044088832 bytes
Fixed Size		    2261928 bytes
Variable Size		 1442843736 bytes
Database Buffers	 3590324224 bytes
Redo Buffers		    8658944 bytes
Database mounted.
SQL> alter database open;

Database altered.

/u01空间100%导致数据库报ORA-01114 ORA-29701

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:/u01空间100%导致数据库报ORA-01114 ORA-29701

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

数据库操作报ORA-01114 ORA-29701错误
ORA-01114 ORA-29704


通过分析发现磁盘/u01分区使用100%
20210629191917

sqlplus / as sysdba 无法登陆
20210629191935

清理trace,释放/u01空间之后,数据库恢复正常,mos中关于ORA-01114错误描述

APPLIES TO:
BI Publisher (formerly XML Publisher) - Version 11.1.1.7.x and later
Information in this document applies to any platform.

SYMPTOMS
 While generating reports the following error occurred.

oracle.xdo.XDOException: java.sql.SQLException: ORA-01114: IO error writing block to file (block # ) 
ORA-01114: IO error writing block to file 201 (block # 755561) 
ORA-27072: File I/O error Additional information: 4 
Additional information: 755561 Additional information: 36864

CHANGES
 

CAUSE
 The issue is caused due to DB not having enough space.

SOLUTION
Cleared space in DB and the report worked fine.

lvm缩小xfs文件系统空间和对swap进行扩容操作

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:lvm缩小xfs文件系统空间和对swap进行扩容操作

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

xfs文件系统lvm缩小空间操作(/home从100G减小到80G)

[root@xifenfei ~]# df -h
Filesystem             Size  Used Avail Use% Mounted on
/dev/mapper/rhel-root  449G  6.0G  443G   2% /
devtmpfs                63G     0   63G   0% /dev
tmpfs                   63G     0   63G   0% /dev/shm
tmpfs                   63G   20M   63G   1% /run
tmpfs                   63G     0   63G   0% /sys/fs/cgroup
/dev/mapper/rhel-home  100G   38M  100G   1% /home
/dev/sda2             1014M  165M  850M  17% /boot
/dev/sda1              200M  9.8M  191M   5% /boot/efi
tmpfs                   13G  4.0K   13G   1% /run/user/42
tmpfs                   13G   32K   13G   1% /run/user/0
/dev/sr0               4.2G  4.2G     0 100% /media

[root@xifenfei u01]# xfsdump -f /home.xfsdump /home
xfsdump: using file dump (drive_simple) strategy
xfsdump: version 3.1.7 (dump format 3.0) - type ^C for status and control

 ============================= dump label dialog ==============================

please enter label for this dump session (timeout in 300 sec)
 -> home
session label entered: "tar czvf /home.tar.gz /home
home"

 --------------------------------- end dialog ---------------------------------

xfsdump: level 0 dump of xifenfei:/home
xfsdump: dump date: Fri Jun 25 11:37:13 2021
xfsdump: session id: 4d75008e-9927-417d-9722-52d13bb89eb0
xfsdump: session label: 
xfsdump: ino map phase 1: constructing initial dump list
xfsdump: ino map phase 2: skipping (no pruning necessary)
xfsdump: ino map phase 3: skipping (only one dump stream)
xfsdump: ino map construction complete
xfsdump: estimated dump size: 4828224 bytes
xfsdump: /var/lib/xfsdump/inventory created

 ============================= media label dialog =============================

please enter label for media in drive 0 (timeout in 300 sec)
 -> home
media label entered: "home"

 --------------------------------- end dialog ---------------------------------

xfsdump: creating dump session media file 0 (media 0, file 0)
xfsdump: dumping ino map
xfsdump: dumping directories
xfsdump: dumping non-directory files
xfsdump: ending media file
xfsdump: media file size 4732672 bytes
xfsdump: dump size (non-dir files) : 4588480 bytes
xfsdump: dump complete: 4 seconds elapsed
xfsdump: Dump Summary:
xfsdump:   stream 0 /home.xfsdump OK (success)
xfsdump: Dump Status: SUCCESS

[root@xifenfei u01]# umount /home
[root@xifenfei u01]# lvreduce -L 80G /dev/mapper/rhel-home
  WARNING: Reducing active logical volume to 80.00 GiB.
  THIS MAY DESTROY YOUR DATA (filesystem etc.)
Do you really want to reduce rhel/home? [y/n]: y
  Size of logical volume rhel/home changed from 100.00 GiB (25600 extents) to 80.00 GiB (20480 extents).
  Logical volume rhel/home successfully resized.

[root@xifenfei u01]# mkfs.xfs -f /dev/mapper/rhel-home
meta-data=/dev/mapper/rhel-home  isize=512    agcount=16, agsize=1310720 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=0, sparse=0
data     =                       bsize=4096   blocks=20971520, imaxpct=25
         =                       sunit=64     swidth=64 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
log      =internal log           bsize=4096   blocks=10240, version=2
         =                       sectsz=512   sunit=64 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
[root@xifenfei u01]# mount /home
xfsrestore -f /home.xfsdump /home
[root@xifenfei u01]# xfsrestore -f /home.xfsdump /home
xfsrestore: using file dump (drive_simple) strategy
xfsrestore: version 3.1.7 (dump format 3.0) - type ^C for status and control
xfsrestore: searching media for dump
xfsrestore: examining media file 0
xfsrestore: dump description: 
xfsrestore: hostname: xifenfei
xfsrestore: mount point: /home
xfsrestore: volume: /dev/mapper/rhel-home
xfsrestore: session time: Fri Jun 25 11:37:13 2021
xfsrestore: level: 0
xfsrestore: session label: "tar czvf /home.tar.gz /home
home"
xfsrestore: media label: "home"
xfsrestore: file system id: b996cff9-332b-4c07-96e1-8335a1f23627
xfsrestore: session id: 4d75008e-9927-417d-9722-52d13bb89eb0
xfsrestore: media id: 6094b9b5-a45f-4638-a0e2-c1b982ead67b
xfsrestore: using online session inventory
xfsrestore: searching media for directory dump
xfsrestore: reading directories
xfsrestore: 119 directories and 188 entries processed
xfsrestore: directory post-processing
xfsrestore: restoring non-directory files
xfsrestore: restore complete: 0 seconds elapsed
xfsrestore: Restore Summary:
xfsrestore:   stream 0 /home.xfsdump OK (success)
xfsrestore: Restore Status: SUCCESS
[root@xifenfei u01]# df -h
Filesystem             Size  Used Avail Use% Mounted on
/dev/mapper/rhel-root  449G   14G  435G   4% /
devtmpfs                63G     0   63G   0% /dev
tmpfs                   63G   20M   63G   1% /run
tmpfs                   63G     0   63G   0% /sys/fs/cgroup
/dev/sda2             1014M  165M  850M  17% /boot
/dev/sda1              200M  9.8M  191M   5% /boot/efi
tmpfs                   13G  4.0K   13G   1% /run/user/42
tmpfs                   13G   28K   13G   1% /run/user/0
/dev/sr0               4.2G  4.2G     0 100% /media
tmpfs                   63G     0   63G   0% /dev/shm
/dev/mapper/rhel-home   80G   38M   80G   1% /home

xfs系统的lvm无法直接缩小空间,只能是通过xfsdump /home内容,然后lvm缩小空间重做xfs文件系统,再使用xfsdump还原

lvm扩容swap空间(swap从8G扩大到16G)

[root@xifenfei home]# free -m
              total        used        free      shared  buff/cache   available
Mem:         128355       86907       26110         274       15338       37632
Swap:         8192           0        8192
[root@xifenfei home]# lvextend -L 16GB /dev/rhel/swap
  Size of logical volume rhel/swap changed from 8.00 GiB (2048 extents) to 16.00 GiB (4096 extents).
  Logical volume rhel/swap successfully resized.
[root@xifenfei home]# sync;sync
[root@xifenfei home]# swapoff /dev/rhel/swap
mkswap /dev/rhel/swap 
[root@xifenfei home]# mkswap /dev/rhel/swap 
mkswap: /dev/rhel/swap: warning: wiping old swap signature.
swapon /dev/rhel/swap Setting up swapspace version 1, size = 16777212 KiB
no label, UUID=8d79ccf4-1796-49c9-968d-23abb67bc6eb
[root@xifenfei home]# swapon /dev/rhel/swap 
[root@xifenfei home]# free -m
              total        used        free      shared  buff/cache   available
Mem:         128355       86907       26110         274       15338       37632
Swap:         16383           0       16383

expdp 并行导出单表数据

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:expdp 并行导出单表数据

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

在某些情况下,需要使用并行的方法使用 datapump 对单个对象并行导出,导入加快数据迁移的数据
expdp导出操作

#!/bin/bash
chunk=10
for ((i=0;i<chunk;i++));
do
  expdp USERNAME/Password@DB_NAME TABLES=LOB_TEST QUERY=LOB_TEST:\"where mod\(dbms_rowid.rowid_block_number\(rowid\)\, 
>${chunk}\) = ${i}\" directory=DMP dumpfile=lob_test_${i}.dmp logfile= log_test_${i}.log &
  echo $i
done 

impdp导入操作

#!/bin/bash
chunk=10
for ((i=0;i<chunk;i++));
do
 impdp USERNAME/Password@DB_NAME  directory=DMP REMAP_TABLE=LOB_TEST:LOB_TEST  remap_schema=source:target 
>dumpfile= lob_test_${i}.dmp logfile=TABLE_imp_log_test_${i}.log  DATA_OPTIONS=DISABLE_APPEND_HINT  CONTENT=DATA_ONLY &
 echo $i
done

在12c版本开始impdp可能会启用ENABLE_PARALLEL_DML特性,需要注意
参考:Optimising LOB Export and Import Performance via Oracle DataPump

datapump network_link遭遇ORA-12899错误

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:datapump network_link遭遇ORA-12899错误

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

在给一个客户使用expdp+network_link导出数据,然后通过impdp导入数据的过程中遇到ORA-12899问题.
20210608215836


对原库和现在库进行分析
20210608215825
20210608215817

原库和目标库表结构一致,原库该表存储数据实际长度确实为1,但是在impdp导入的时候提示需要长度为3.通过分析,确认原库的nls_length_semantics参数设置为char了,直接使用impdp+network_link不落地方式导入该表数据成功
20210608215845

根据上述情况,查询相关文档,确认类似记录为:
ORA-12899 When Using IMPDP Over Network Link (Doc ID 414901.1)
ORA-26059 During Impdp Using Export Dump Taken With NETWORK_LINK Option (Doc ID 2266956.1)
虽然都不是完全匹配该问题,但是基本上可以确认expdp的network_link和nls_length_semantics参数是引起该问题的根本原因,在后续的迁移中,尽量保持nls_length_semantics参数一致.