诡异dblink问题解决

一、诡异dblink起源
今天开发拿过来一条sql,说有诡异现象,sql如下

INSERT INTO TAB_CS_CALLLOG
select c.user_logon,/*工号*/c.user_name,/*姓名*/a.call_id,/*通话id*/
      a.caller_nbr,/*主叫号码*/a.called_nbr,/*被叫号码*/d.start_time,/*呼入时间*/
      b.call_time,/*接听时间*/b.end_time,/*结束时间*/b.call_dura,/*时长*/
      to_number(b.call_time-d.start_time)*24*3600,/*等待时长*/
      decode(c.user_logon,null,0,1),/*类型*/
case
        when substr (a.CALLED_NBR,7, 2) = '00'
          then '广东移动'
        when substr (a.CALLED_NBR,7, 2) = '01'
          then '浙江移动'
        when substr (a.CALLED_NBR,7, 2) = '02'
          then '福建'
        when substr (a.CALLED_NBR,7, 2) = '03'
          then '四川'
        when substr (a.CALLED_NBR,7, 2) = '04'
          then '河南'
        when substr (a.CALLED_NBR,7, 2) = '05'
          then '湖北'
        when substr (a.CALLED_NBR,7, 2) = '06'
          then '北京CSIP'
        when substr (a.CALLED_NBR,7, 2) = '07'
          then '陕西'
        when substr (a.CALLED_NBR,7, 2) = '08'
          then '吉林'
        when substr (a.CALLED_NBR,7, 2) = '09'
          then '江西'
        when substr (a.CALLED_NBR,7, 2) = '10'
          then '宁夏'
        when substr (a.CALLED_NBR,7, 2) = '11'
          then '太原'
        when substr (a.CALLED_NBR,7, 2) = '12'
          then '江苏移动'
        when substr (a.CALLED_NBR,7, 2) = '13'
          then 'e掌管'
        when substr (a.CALLED_NBR,7, 2) = '15'
          then 'e多商'
        when substr (a.CALLED_NBR,7, 2) = '18'
          then '江苏联通'
      end 区域
from cscnew.a@cs a,cscnew.b@cs b,cscnew.c@cs c,cscnew.d@cs d
where a.call_serial=b.call_serial(+)
and b.call_serial=d.call_serial(+)
and b.user_id=c.user_id(+)
and substr(a.CALLED_NBR,1, 6) = '951654'
and  length (a.CALLED_NBR) = 15
and b.fail_reason is null
and a.end_time>=to_date('20110822000000','yyyymmddhh24miss')
and a.end_time<to_date('20110823000000','yyyymmddhh24miss');

然后我进行了测试,证实了她所说的诡异:
1、直接执行select语句需要1.7S左右,但是加上insert inot后,执行时间需要6分钟
2、直接select结果集为602条,加上insert into后,结果集为598条(少4条),如果直接执行select,除掉and b.fail_reason is null限制条件也刚好602条
二、查询相关资料,得到dblink的一些解释
1、dblink执行有两种方式,一种是在远处数据库执行完,然后结果返回,另一种是把远程的表下载到本来,然后执行
2、如果把远程的表下载到本地,空值或者null可能会发生变化(怀疑是空值转化为null,未证实)
三、根据这些解释,进行猜想
1、只执行select的时候,应该是在远程执行完,传输结果回来;而执行insert into的时候,是把远程的表全部下载到本地,然后执行出结果,而数据量本身比较大,所以比较慢
2、在把表从远程下载到本地的过程中,fail_reason 列的null值可以发生了变化,或者空值变为了null,所以数据多了4条
四、事实证明猜想
1、查询远程表大小,发现a表50m,b表400m,c表10m,d表100m左右,传输过来需要一定的时间
2、既然猜测是由于要把表传输到本地而导致这样的结果产生,那么处理方法就是让程序在远程计算出结果,然后传输到本地,查询了一些资料,上面说insert into会导致driving_site提示无效,那么我想到一个用视图的办法解决这个问题:在目标端建立一个关于本查询中无参数的视图,然后在本地通过dblink调用视图,这样总该先在远程执行出结果传输到本地了吧。
2.1)建立目标端视图

create or replace view v_tab
select c.user_logon,/*工号*/c.user_name,/*姓名*/a.call_id,/*通话id*/
      a.caller_nbr,/*主叫号码*/a.called_nbr,/*被叫号码*/d.start_time,/*呼入时间*/
      b.call_time,/*接听时间*/b.end_time,/*结束时间*/b.call_dura,/*时长*/
      to_number(b.call_time-d.start_time)*24*3600,/*等待时长*/
      decode(c.user_logon,null,0,1),/*类型*/
case
        when substr (a.CALLED_NBR,7, 2) = '00'
          then '广东移动'
        when substr (a.CALLED_NBR,7, 2) = '01'
          then '浙江移动'
        when substr (a.CALLED_NBR,7, 2) = '02'
          then '福建'
        when substr (a.CALLED_NBR,7, 2) = '03'
          then '四川'
        when substr (a.CALLED_NBR,7, 2) = '04'
          then '河南'
        when substr (a.CALLED_NBR,7, 2) = '05'
          then '湖北'
        when substr (a.CALLED_NBR,7, 2) = '06'
          then '北京CSIP'
        when substr (a.CALLED_NBR,7, 2) = '07'
          then '陕西'
        when substr (a.CALLED_NBR,7, 2) = '08'
          then '吉林'
        when substr (a.CALLED_NBR,7, 2) = '09'
          then '江西'
        when substr (a.CALLED_NBR,7, 2) = '10'
          then '宁夏'
        when substr (a.CALLED_NBR,7, 2) = '11'
          then '太原'
        when substr (a.CALLED_NBR,7, 2) = '12'
          then '江苏移动'
        when substr (a.CALLED_NBR,7, 2) = '13'
          then 'e掌管'
        when substr (a.CALLED_NBR,7, 2) = '15'
          then 'e多商'
        when substr (a.CALLED_NBR,7, 2) = '18'
          then '江苏联通'
      end 区域
from cscnew.a a,cscnew.b b,cscnew.c c,cscnew.d d
where a.call_serial=b.call_serial(+)
and b.call_serial=d.call_serial(+)
and b.user_id=c.user_id(+)
and substr(a.CALLED_NBR,1, 6) = '951654'
and  length (a.CALLED_NBR) = 15
and b.fail_reason is null

2.2)本地调用远程视图

INSERT INTO TAB_CS_CALLLOG
select * from v_tab@cs a where
a.end_time>=to_date('20110822000000','yyyymmddhh24miss')
and a.end_time<to_date('20110823000000','yyyymmddhh24miss');

2.3)执行结果2.3S完成数据插入,而且条数也是598条,证明我的猜想是正确的,更重要的是解决了今天这个让人疑惑的问题

undo异常处理步骤(10g)

1、启动数据库,发现错误
startup
2、查看是否是undo文件损坏引起
select a.ts#,a.name,b.name from v$datafile a,v$tablespace b where a.ts#=b.ts#;
3、损坏undo离线
alter database datafile n offline drop;
4、打开数据库
alter database open;
5、查看当前undo表空间,并确认是否是当前undo损坏,如果不是当前undo,直接执行8,否则继续
show parameter undo_tablespace;
6、创建新undo表空间
CREATE UNDO TABLESPACE UNDOTBSNEW DATAFILE
‘/opt/oracle/oradata/xifenfei/UNDOTBS01.dbf
SIZE 50M autoextend on next 10m maxsize 30G;
7、修改默认undo为新创建表空间
alter system set undo_tablespace=’UNDOTBSNEW’;
8、查询回滚段信息,为下面删除损坏undo的回滚段做到心中有底
select segment_name,status from dba_rollback_segs;
9、删除损坏undo,如果报ORA-01548错误,继续下面操作,否则跳到14
drop tablespace UNDOTBSOLD including contents and datafiles;
10、创建pfile文件
create pfile=’/tmp/pfile’ from spfile;
11、修改pfile文件,添加隐含参数
*._allow_resetlogs_corruption=true
*._corrupted_rollback_segments=(_SYSSMU1$,_SYSSMU2$,_SYSSMU3$,_SYSSMU4$,_SYSSMU5$,
_SYSSMU6$,_SYSSMU7$,_SYSSMU8$,_SYSSMU9$,_SYSSMU10$)
12、使用pfile启动数据库
startup pfile=’/tmp/pfile’
13、删除损坏undo
drop tablespace UNDOTBSOLD including contents and datafiles;
14、查看回滚段状态
select segment_name,status from dba_rollback_segs;
15、如果有损坏表空间回滚段还存在,手工删除
drop rollback segment “_SYSSMUx$”;
16、重启数据库
shutdown immediate
startup

Linux修改系统时间

我们一般使用’date -s’命令来修改系统时间。比如将系统时间设定成2011年8月24日的命令如下。
  #date -s 08/24/2011
将系统时间设定成下午23点0分20秒的命令如下。
  #date -s 23:00:20  
注意,这里说的是系统时间,是linux由操作系统维护的。  
在系统启动时,Linux操作系统将时间从CMOS中读到系统时间变量中,以后修改时间通过修改系统时间实现。为了保持系统时间与CMOS时间的一致性,Linux每隔一段时间会将系统时间写入CMOS。由于该同步是每隔一段时间(大约是11分钟)进行的,在我们执行date -s后,如果马上重起机器,修改时间就有可能没有被写入CMOS,这就是问题的原因。如果要确保修改生效可以执行如下命令。  
  #clock -w  
这个命令强制把系统时间写入CMOS。

当前联机日志损坏恢复

环境模拟
删除状态为active的联机日志,然后强行关闭数据库
处理过程
SQL> startup
ORACLE instance started.
Total System Global Area 167772160 bytes
Fixed Size 1260720 bytes
Variable Size 142607184 bytes
Database Buffers 16777216 bytes
Redo Buffers 7127040 bytes
Database mounted.
ORA-00313: open failed for members of log group 1 of thread 1
ORA-00312: online log 1 thread 1: ‘/u01/oradata/xienfei/redo01.log’
ORA-27037: unable to obtain file status
Linux Error: 2: No such file or directory
Additional information: 3
Alert.log 日志中错误
Wed Aug 24 00:26:33 2011
Errors in file /u01/admin/xienfei/udump/xff_ora_9186.trc:
ORA-00313: open failed for members of log group 1 of thread 1
ORA-00312: online log 1 thread 1: ‘/u01/oradata/xienfei/redo01.log’
ORA-27037: unable to obtain file status
Linux Error: 2: No such file or directory
Additional information: 3
xff_ora_9186.trc文件中错误
ORA-00313: open failed for members of log group 1 of thread 1
ORA-00312: online log 1 thread 1: ‘/u01/oradata/xienfei/redo01.log’
ORA-27037: unable to obtain file status
Linux Error: 2: No such file or directory
Additional information: 3
查询当前日志组状态
SQL> select a.group#,a.status,b.member from v$log a,v$logfile b where a.group#=b.group#;
GROUP# STATUS MEMBER
———- —————- ———————————————
1 CURRENT /u01/oradata/xienfei/redo01.log
3 INACTIVE /u01/oradata/xienfei/redo03.log
2 ACTIVE /u01/oradata/xienfei/redo02.log
尝试删除redo日志
SQL> alter database drop logfile group 1;
alter database drop logfile group 1
*
ERROR at line 1:
ORA-01623: log 1 is current log for instance xff (thread 1) – cannot drop
ORA-00312: online log 1 thread 1: ‘/u01/oradata/xienfei/redo01.log’
发现是当前日志不能被删除,尝试切换日志
SQL> alter system switch logfile;
alter system switch logfile
*
ERROR at line 1:
ORA-01109: database not open
在数据库未打开状态,不能切换日志,只能尝试清空日志
SQL> alter database clear unarchived logfile group 1;
alter database clear unarchived logfile group 1
*
ERROR at line 1:
ORA-01624: log 1 needed for crash recovery of instance xff (thread 1)
ORA-00312: online log 1 thread 1: ‘/u01/oradata/xienfei/redo01.log’
因为数据库恢复需要使用,不能被清空,尝试不完成恢复
SQL> recover database until cancel;
ORA-00279: change 1272687 generated at 08/24/2011 00:20:05 needed for thread 1
ORA-00289: suggestion : /u01/archive/1_27_756841839.arc
ORA-00280: change 1272687 for thread 1 is in sequence #27
Specify log: {=suggested | filename | AUTO | CANCEL}
auto
ORA-00279: change 1272903 generated at 08/24/2011 00:25:17 needed for thread 1
ORA-00289: suggestion : /u01/archive/1_28_756841839.arc
ORA-00280: change 1272903 for thread 1 is in sequence #28
ORA-00278: log file ‘/u01/archive/1_27_756841839.arc’ no longer needed for this
recovery
ORA-00308: cannot open archived log ‘/u01/archive/1_28_756841839.arc’
ORA-27037: unable to obtain file status
Linux Error: 2: No such file or directory
Additional information: 3
ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1: ‘/u01/oradata/xienfei/system01.dbf’
已经提示数据不一致,尝试着打开数据库
SQL> alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1: ‘/u01/oradata/xienfei/system01.dbf’
创建pfile文件,添加隐含参数,使之不进行检查点一致性校验
SQL> create pfile=’/tmp/pfile’ from spfile;
File created.
修改pfile ,添加以下参数
*._allow_resetlogs_corruption=TRUE
*._allow_error_simulation=TRUE
SQL> shutdown abort
ORACLE instance shut down.
使用pfile打开数据库
SQL> startup pfile=’/tmp/pfile’
ORACLE instance started.
Total System Global Area 167772160 bytes
Fixed Size 1260720 bytes
Variable Size 150995792 bytes
Database Buffers 8388608 bytes
Redo Buffers 7127040 bytes
Database mounted.
ORA-38760: This database instance failed to turn on flashback database
发现flashback导致数据库不能被正常打开,尝试关闭它
SQL> alter database flashback off;
Database altered.
尝试直接open数据库
SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-01589: must use RESETLOGS or NORESETLOGS option for database open
根据提示,使用resetlogs打开数据库
SQL> alter database open resetlogs;
Database altered.
查询日志状态
SQL> select a.group#,a.status,b.member from v$log a,v$logfile b where a.group#=b.group#;
GROUP# STATUS MEMBER
———- —————- ———————————————
3 UNUSED /u01/oradata/xienfei/redo03.log
2 UNUSED /u01/oradata/xienfei/redo02.log
1 CURRENT /u01/oradata/xienfei/redo01.log
因为group1错误,而当前日志组在group 1上,所以切换日志组
SQL> alter system switch logfile;
System altered.
SQL> select a.group#,a.status,b.member from v$log a,v$logfile b where a.group#=b.group#;
GROUP# STATUS MEMBER
———- —————- ———————————————
3 UNUSED /u01/oradata/xienfei/redo03.log
2 CURRENT /u01/oradata/xienfei/redo02.log
1 ACTIVE /u01/oradata/xienfei/redo01.log
SQL> alter system checkpoint;
System altered.
SQL> select a.group#,a.status,b.member from v$log a,v$logfile b where a.group#=b.group#;
GROUP# STATUS MEMBER
———- —————- ———————————————
3 UNUSED /u01/oradata/xienfei/redo03.log
2 CURRENT /u01/oradata/xienfei/redo02.log
1 INACTIVE /u01/oradata/xienfei/redo01.log
删除有问题的group 1日志组
SQL> alter database drop logfile group 1;
Database altered.
SQL> alter system switch logfile;
System altered.
添加日志组并检查是否正确
SQL> select a.group#,a.status,b.member from v$log a,v$logfile b where a.group#=b.group#;
GROUP# STATUS MEMBER
———- —————- ———————————————
3 CURRENT /u01/oradata/xienfei/redo03.log
2 ACTIVE /u01/oradata/xienfei/redo02.log
SQL> alter database add logfile group 1 ‘/u01/oradata/xienfei/redo01.log’ size 50m reuse;
Database altered.
SQL> alter system switch logfile;
System altered.
SQL> select a.group#,a.status,b.member from v$log a,v$logfile b where a.group#=b.group#;
GROUP# STATUS MEMBER
———- —————- ———————————————
3 ACTIVE /u01/oradata/xienfei/redo03.log
2 ACTIVE /u01/oradata/xienfei/redo02.log
1 CURRENT /u01/oradata/xienfei/redo01.log
注意:根据oracle官方建议,使用oracle隐含参数运行数据库可能存在很多不稳定因素,建议立即导出数据库数据,然后新建库,重新导入数据

数据库open状态下日志异常处理

一、环境模拟
删除INACTIVE状态redo日志的物理文件,然后进行大批量事务操作

二、出现现象
1、alert.log记录
Tue Aug 23 23:32:02 2011
Errors in file /u01/admin/xienfei/bdump/xff_arc1_8773.trc:
ORA-00313: Message 313 not found; No message file for product=RDBMS, facility=ORA; arguments: [3] [1]
ORA-00312: Message 312 not found; No message file for product=RDBMS, facility=ORA; arguments: [3] [1] [/u01/oradata/xienfei/redo03.log]
ORA-27037: Message 27037 not found; No message file for product=RDBMS, facility=ORA
Linux Error: 2: No such file or directory
Additional information: 3

2、xff_arc1_8773.trc文件中内容
*** 2011-07-18 18:35:32.071 59526 kcrr.c
kcrrfail: dest:2 err:12541 force:0 blast:1
ORA-00313: Message 313 not found; No message file for product=RDBMS, facility=ORA; arguments: [3] [1]
ORA-00312: Message 312 not found; No message file for product=RDBMS, facility=ORA; arguments: [3] [1] [/u01/oradata/xienfei/redo03.log]
ORA-27037: Message 27037 not found; No message file for product=RDBMS, facility=ORA
Linux Error: 2: No such file or directory

三、处理过程
1、由alert中的redo文件路径,查询group#
SQL> select group#,member from v$logfile;
GROUP# MEMBER
———- ———————————————
3 /u01/oradata/xienfei/redo03.log
2 /u01/oradata/xienfei/redo02.log
1 /u01/oradata/xienfei/redo01.log
SQL> select group#,sequence#,status from v$log;
GROUP#  SEQUENCE# STATUS
———- ———- —————-
1         16 ACTIVE
2         17 CURRENT
3         15 INACTIVE
2、确定是inactive状态的redo日志,直接删除该日志组
SQL> alter database drop logfile group 3;
alter database drop logfile group 3
*
ERROR at line 1:
ORA-00350: log 3 of instance xff (thread 1) needs to be archived
ORA-00312: online log 3 thread 1: ‘/u01/oradata/xienfei/redo03.log’
3、删除失败,提示该日志需要归档,然后直接强行执行清空redo操作
SQL> alter   database   clear   logfile   group   3;
alter   database   clear   logfile   group   3
*
ERROR at line 1:
ORA-00350: log 3 of instance xff (thread 1) needs to be archived
ORA-00312: online log 3 thread 1: ‘/u01/oradata/xienfei/redo03.log’
4、还不行,加大力度,直接清空并不归档
SQL> ALTER   DATABASE   CLEAR   UNARCHIVED   LOGFILE   GROUP   3;
Database altered.
5、再删除group 3
SQL> alter database drop logfile group 3;
alter database drop logfile group 3
*
ERROR at line 1:
ORA-01623: log 3 is current log for instance xff (thread 1) – cannot drop
ORA-00312: online log 3 thread 1: ‘/u01/oradata/xienfei/redo03.log’
6、还是失败,根据提示查看该日志组当前状态
SQL> select group#,sequence#,status from v$log;
GROUP#  SEQUENCE# STATUS
———- ———- —————-
1         16 INACTIVE
2         17 ACTIVE
3         18 CURRENT
7、切换日志组
SQL> alter system switch logfile;
System altered.
8、查看状态,并处理置于INACTIVE状态
SQL> select group#,sequence#,status from v$log;
GROUP#  SEQUENCE# STATUS
———- ———- —————-
1         19 CURRENT
2         17 ACTIVE
3         18 ACTIVE
SQL> alter system checkpoint;
System altered.
SQL> select group#,sequence#,status from v$log;
GROUP#  SEQUENCE# STATUS
———- ———- —————-
1         19 CURRENT
2         17 INACTIVE
3         18 INACTIVE
9、再次删除归档group 3
SQL> alter database drop logfile group 3;
alter database drop logfile group 3
*
ERROR at line 1:
ORA-00350: log 3 of instance xff (thread 1) needs to be archived
ORA-00312: online log 3 thread 1: ‘/u01/oradata/xienfei/redo03.log’
10、再次清空日志并不归档
SQL> ALTER   DATABASE   CLEAR   UNARCHIVED   LOGFILE   GROUP   3;
Database altered.
11、查询状态
SQL> select group#,sequence#,status from v$log;
GROUP#  SEQUENCE# STATUS
———- ———- —————-
1         19 CURRENT
2         17 INACTIVE
3          0 UNUSED
12、再次删除group 3
SQL> alter database drop logfile group 3;
Database altered.
SQL> select group#,sequence#,status from v$log;
GROUP#  SEQUENCE# STATUS
———- ———- —————-
1         19 CURRENT
2         17 INACTIVE
13、删除丢失的group 3成功,需要添加一组日志
SQL> alter database add logfile group 3 ‘/u01/oradata/xienfei/redo03.log’ size 50m reuse;
Database altered.
14、核实是否正常
SQL> alter system switch logfile;
System altered.
SQL> /
System altered.
SQL> /
System altered.
SQL> select group#,sequence#,status from v$log;
GROUP#  SEQUENCE# STATUS
———- ———- —————-
1         22 CURRENT
2         21 ACTIVE
3         20 INACTIVE

ORA-00205问题处理

1、启动数据库异常
SQL> startup
ORACLE instance started.
Total System Global Area 167772160 bytes
Fixed Size 1260720 bytes
Variable Size 142607184 bytes
Database Buffers 16777216 bytes
Redo Buffers 7127040 bytes
ORA-00205: error in identifying control file, check alert log for more info
alert.log日志中记录
ALTER DATABASE MOUNT
Mon Jul 18 17:32:58 2011
ORA-00202: Message 202 not found; No message file for product=RDBMS, facility=ORA; arguments: [/u01/oradata/xienfei/control01.ctl]
ORA-27037: Message 27037 not found; No message file for product=RDBMS, facility=ORA
Linux Error: 2: No such file or directory
2、根据提示缺少控制文件,第一步是看看有没有冗余的控制文件,然后修改pfile或者复制控制文件处理,如果没有利用备份控制文件恢复
RMAN> restore controlfile from ‘/tmp/rman_1kmhorc2_1_1’;
Starting restore at 18-JUL-11
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=156 devtype=DISK
channel ORA_DISK_1: restoring control file
channel ORA_DISK_1: restore complete, elapsed time: 00:00:04
output filename=/u01/oradata/xienfei/control01.ctl
Finished restore at 18-JUL-11
3、恢复控制文件后,数据库至于mount状态
RMAN> alter database mount;
database mounted
released channel: ORA_DISK_1
4、添加归档日志到控制文件
RMAN> catalog start with ‘/u01/archive’;
Starting implicit crosscheck backup at 18-JUL-11
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=156 devtype=DISK
allocated channel: ORA_DISK_2
channel ORA_DISK_2: sid=155 devtype=DISK
Crosschecked 29 objects
Crosschecked 12 objects
Finished implicit crosscheck backup at 18-JUL-11
Starting implicit crosscheck copy at 18-JUL-11
using channel ORA_DISK_1
using channel ORA_DISK_2
Crosschecked 1 objects
Finished implicit crosscheck copy at 18-JUL-11
searching for all files in the recovery area
cataloging files…
no files cataloged
searching for all files that match the pattern /u01/archive
List of Files Unknown to the Database
=====================================
…………
Do you really want to catalog the above files (enter YES or NO)? yes
cataloging files…
cataloging done
List of Cataloged Files
=======================
…………
RMAN> exit
[oracle@node2 tmp]$ sqlplus / as sysdba
SQL*Plus: Release 10.2.0.3.0 – Production on Mon Jul 18 17:43:47 2011
Copyright (c) 1982, 2006, Oracle. All Rights Reserved.
Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 – Production
With the Partitioning, OLAP and Data Mining options
5、使用备份控制文件恢复
SQL> recover database using backup controlfile;
ORA-00283: recovery session canceled due to errors
ORA-01111: name for data file 8 is unknown – rename to correct file
ORA-01110: data file 8: ‘/u01/oracle/dbs/UNNAMED00008’
ORA-01157: cannot identify/lock data file 8 – see DBWR trace file
ORA-01111: name for data file 8 is unknown – rename to correct file
ORA-01110: data file 8: ‘/u01/oracle/dbs/UNNAMED00008’
6、发现控制文件后又新增数据文件,需要重命名控制文件中的新增数据文件
SQL> select file#,name from v$datafile where file#=8;
FILE# NAME
———- ———————————————-
8 /u01/oracle/dbs/UNNAMED00008
SQL> alter database rename file ‘/u01/oracle/dbs/UNNAMED00008’ to ‘/u01/oradata/xienfei/cfxff01.dbf’;
Database altered.
7、继续进行不完成恢复(利用归档日志)
SQL> recover database using backup controlfile;
ORA-00279: change 1158476 generated at 07/18/2011 16:43:27 needed for thread 1
ORA-00289: suggestion : /u01/archive/1_1_756837539.arc
ORA-00280: change 1158476 for thread 1 is in sequence #1
Specify log: {=suggested | filename | AUTO | CANCEL}
auto
ORA-00279: change 1158594 generated at 07/18/2011 16:44:01 needed for thread 1
ORA-00289: suggestion : /u01/archive/1_2_756837539.arc
ORA-00280: change 1158594 for thread 1 is in sequence #2
ORA-00278: log file ‘/u01/archive/1_1_756837539.arc’ no longer needed for this
recovery
…………
ORA-00308: cannot open archived log ‘/u01/archive/1_7_756837539.arc’
ORA-27037: unable to obtain file status
Linux Error: 2: No such file or directory
Additional information: 3
8、根据提示,应该是归档日志恢复完成,需要利用联机日志继续恢复
SQL> recover database using backup controlfile;
ORA-00279: change 1201601 generated at 07/18/2011 17:29:19 needed for thread 1
ORA-00289: suggestion : /u01/archive/1_7_756837539.arc
ORA-00280: change 1201601 for thread 1 is in sequence #7
Specify log: {=suggested | filename | AUTO | CANCEL}
/u01/oradata/xienfei/redo03.log
ORA-00310: archived log contains sequence 6; sequence 7 required
ORA-00334: archived log: ‘/u01/oradata/xienfei/redo03.log’
SQL> recover database using backup controlfile;
ORA-00279: change 1201601 generated at 07/18/2011 17:29:19 needed for thread 1
ORA-00289: suggestion : /u01/archive/1_7_756837539.arc
ORA-00280: change 1201601 for thread 1 is in sequence #7
Specify log: {=suggested | filename | AUTO | CANCEL}
/u01/oradata/xienfei/redo01.log
Log applied.
Media recovery complete.
9、resetlogs打开数据库
SQL> alter database open resetlogs;
Database altered.

重建控制文件

#  使用alter database backup controlfile to trace;产生控制文件的trace file
-- Below are two sets of SQL statements, each of which creates a new
-- control file and uses it to open the database. The first set opens
-- the database with the NORESETLOGS option and should be used only if
-- the current versions of all online logs are available. The second
-- set opens the database with the RESETLOGS option and should be used
-- if online logs are unavailable.
-- The appropriate set of statements can be copied from the trace into
-- a script file, edited as necessary, and executed when there is a
-- need to re-create the control file.
#  需要重建控制文件的话, 如果online redo 是可用的, 使用NORESETLOGS 创建新控制文件打开数据库
#  如果online redo是不可用的, 那么需要使用RESETLOGS 创建新控制文件打开数据库,
#  联机日志被重置,数据有可能丢失
--
--     Set #1. NORESETLOGS case
--
-- The following commands will create a new control file and use it
-- to open the database.
-- Data used by Recovery Manager will be lost.
-- Additional logs may be required for media recovery of offline
-- Use this only if the current versions of all online logs are
-- available.
-- After mounting the created controlfile, the following SQL
-- statement will place the database in the appropriate
-- protection mode:
--  ALTER DATABASE SET STANDBY DATABASE TO MAXIMIZE PERFORMANCE
#   下面的命令用来创建一个新的控制文件来开启数据库。 用于RMAN的数据将会丢失。
#   对于offline datafiles 可能需要额外的归档日志用来做介质恢复,
#   当当前版本的所有online redo log都没有问题的情况下使用noresetlogs。
STARTUP NOMOUNT
CREATE CONTROLFILE REUSE DATABASE "XIENFEI" NORESETLOGS FORCE LOGGING ARCHIVELOG
    MAXLOGFILES 16
    MAXLOGMEMBERS 3
    MAXDATAFILES 100
    MAXINSTANCES 8
    MAXLOGHISTORY 2921
LOGFILE
  GROUP 1 '/u01/oradata/xienfei/redo01.log'  SIZE 5M,
  GROUP 2 '/u01/oradata/xienfei/redo02.log'  SIZE 5M,
  GROUP 3 '/u01/oradata/xienfei/redo03.log'  SIZE 5M
-- STANDBY LOGFILE
DATAFILE
  '/u01/oradata/xienfei/system01.dbf',
  '/u01/oradata/xienfei/undotbs01.dbf',
  '/u01/oradata/xienfei/sysaux01.dbf',
  '/u01/oradata/xienfei/users01.dbf',
  '/u01/oradata/xienfei/example01.dbf',
  '/u01/oradata/xienfei/xff01.dbf',
  '/u01/oradata/xienfei/cf01.dbf',
  '/u01/oradata/xienfei/cfxff01.dbf'
CHARACTER SET ZHS16GBK
;
-- Commands to re-create incarnation table
-- Below log names MUST be changed to existing filenames on
-- disk. Any one log file from each branch can be used to
-- re-create incarnation records.
-- ALTER DATABASE REGISTER LOGFILE '/u01/archive/1_1_756837539.arc';
#  有些时候可能需要先注册归档日志,再才能够恢复
-- Recovery is required if any of the datafiles are restored backups,
-- or if the last shutdown was not normal or immediate.
#  恢复数据库
RECOVER DATABASE
-- All logs need archiving and a log switch is needed.
#  归档所有日志
ALTER SYSTEM ARCHIVE LOG ALL;
-- Database can now be opened normally.
#  打开数据库
ALTER DATABASE OPEN;
-- Commands to add tempfiles to temporary tablespaces.
-- Online tempfiles have complete space information.
-- Other tempfiles may require adjustment.
#  添加临时表空间
ALTER TABLESPACE TEMP ADD TEMPFILE '/u01/oradata/xienfei/temp01.dbf'
     SIZE 32505856  REUSE AUTOEXTEND ON NEXT 655360  MAXSIZE 32767M;
-- End of tempfile additions.
##    下面的RESETLOGS 重建语句, online redo log 中的内容都会丢失,所有的备份将无效
##    仅仅在online log 损坏的情况下使用 resetlogs 创建控制文件开启数据库。
##    resetlog创建控制文件后的开启步骤与noresetlog有一些不一样,请注意下面步骤
--     Set #2. RESETLOGS case
--
-- The following commands will create a new control file and use it
-- to open the database.
-- Data used by Recovery Manager will be lost.
-- The contents of online logs will be lost and all backups will
-- be invalidated. Use this only if online logs are damaged.
-- After mounting the created controlfile, the following SQL
-- statement will place the database in the appropriate
-- protection mode:
--  ALTER DATABASE SET STANDBY DATABASE TO MAXIMIZE PERFORMANCE
STARTUP NOMOUNT
CREATE CONTROLFILE REUSE DATABASE "XIENFEI" RESETLOGS FORCE LOGGING ARCHIVELOG
    MAXLOGFILES 16
    MAXLOGMEMBERS 3
    MAXDATAFILES 100
    MAXINSTANCES 8
    MAXLOGHISTORY 2921
LOGFILE
  GROUP 1 '/u01/oradata/xienfei/redo01.log'  SIZE 5M,
  GROUP 2 '/u01/oradata/xienfei/redo02.log'  SIZE 5M,
  GROUP 3 '/u01/oradata/xienfei/redo03.log'  SIZE 5M
-- STANDBY LOGFILE
DATAFILE
  '/u01/oradata/xienfei/system01.dbf',
  '/u01/oradata/xienfei/undotbs01.dbf',
  '/u01/oradata/xienfei/sysaux01.dbf',
  '/u01/oradata/xienfei/users01.dbf',
  '/u01/oradata/xienfei/example01.dbf',
  '/u01/oradata/xienfei/xff01.dbf',
  '/u01/oradata/xienfei/cf01.dbf',
  '/u01/oradata/xienfei/cfxff01.dbf'
CHARACTER SET ZHS16GBK
;
-- Commands to re-create incarnation table
-- Below log names MUST be changed to existing filenames on
-- disk. Any one log file from each branch can be used to
-- re-create incarnation records.
#  注册归档日志
-- ALTER DATABASE REGISTER LOGFILE '/u01/archive/1_1_756837539.arc';
-- Recovery is required if any of the datafiles are restored backups,
-- or if the last shutdown was not normal or immediate.
#  使用历史控制文件恢复数据库(因为redo log是新建,所以对于redo 而言,控制文件是新建的)
RECOVER DATABASE USING BACKUP CONTROLFILE
-- Database can now be opened zeroing the online logs.
#  使用RESETLOGS方式打开数据库
ALTER DATABASE OPEN RESETLOGS;
-- Commands to add tempfiles to temporary tablespaces.
-- Online tempfiles have complete space information.
-- Other tempfiles may require adjustment.
#  添加临时表空间
ALTER TABLESPACE TEMP ADD TEMPFILE '/u01/oradata/xienfei/temp01.dbf'
     SIZE 32505856  REUSE AUTOEXTEND ON NEXT 655360  MAXSIZE 32767M;
-- End of tempfile additions.

EMD_MAINTENANCE.EXECUTE_EM_DBMS_JOB_PROCS维护

1、建立
1.1)emd_maintenance包
exec sysman.emd_maintenance.submit_em_dbms_jobs;
commit;
1.2)创建job

DECLARE
  X NUMBER;
BEGIN
  SYS.DBMS_JOB.SUBMIT
  ( job       => X
   ,what      => 'EMD_MAINTENANCE.EXECUTE_EM_DBMS_JOB_PROCS();'
   ,next_date => to_date('18/08/2011 21:59:17','dd/mm/yyyy hh24:mi:ss')
   ,interval  => 'SYSDATE + 1 / (24 * 60)'
   ,no_parse  => FALSE
  );
  SYS.DBMS_OUTPUT.PUT_LINE('Job Number is: ' || to_char(x));
COMMIT;
END;

2、停止和启动
2.1)停止
EXEC DBMS_JOB.BROKEN(job#,TRUE);
commit;
2.2)启用
EXEC DBMS_JOB.BROKEN(job#,FALSE);
commit;
3、删除
2.1)emd_maintenance包
exec sysman.emd_maintenance.remove_em_dbms_jobs;
commit;
3.2)删除job
dbms_job.remove(job#);
commit;
注:因为EMD_MAINTENANCE.EXECUTE_EM_DBMS_JOB_PROCS是sysman用户下面用于收集em相关信息,如果要进行操作,最好使用sysman用户进行

Linux逻辑卷管理(LVM)

前言:LVM基本术语
物理存储介质(The physical media)
这里指系统的存储设备:硬盘,如:/dev/hda、/dev/sda等等,是存储系统最低层的存储单元。
物理卷(physical volume)
物理卷就是指硬盘分区或从逻辑上与磁盘分区具有同样功能的设备(如RAID),是LVM的基本存储逻辑块,但和基本的物理存储介质(如分区、磁盘等)比较,却包含有与LVM相关的管理参数。
卷组(Volume Group)
LVM卷组类似于非LVM系统中的物理硬盘,其由物理卷组成。可以在卷组上创建一个或多个“LVM分区”(逻辑卷),LVM卷组由一个或多个物理卷组成。
逻辑卷(logical volume)
LVM的逻辑卷类似于非LVM系统中的硬盘分区,在逻辑卷之上可以建立文件系统(比如/home或者/usr等)。
PE(physical extent)
每一个物理卷被划分为称为PE(Physical Extents)的基本单元,具有唯一编号的PE是可以被LVM寻址的最小单元。PE的大小是可配置的,默认为4MB。
LE(logical extent)
逻辑卷也被划分为被称为LE(Logical Extents) 的可被寻址的基本单位。在同一个卷组中,LE的大小和PE是相同的,并且一一对应。
一、创建逻辑卷的步骤
1)通过pvcreate命令将linux分区处理成物理卷(PV);
2)通过vgcreate命令将创建好的物理卷处理成卷组(Vg);
3)通过lvcreate命令将卷组分成若干个逻辑卷(Lv);
之后我们可以对逻辑卷进行格式化,挂载,删除等操作,我们可以动态的调整逻辑卷的大小,并且该操作不会影响我们在逻辑卷(Lv)上的数据。
二、物理卷创建管理操作
1、物理分区信息
[root@node1 ~]# fdisk /dev/sdd
The number of cylinders for this disk is set to 2610.
There is nothing wrong with that, but this is larger than 1024,
and could in certain setups cause problems with:
1) software that runs at boot time (e.g., old versions of LILO)
2) booting and partitioning software from other OSs
(e.g., DOS FDISK, OS/2 FDISK)
Command (m for help): p
Disk /dev/sdd: 21.4 GB, 21474836480 bytes
255 heads, 63 sectors/track, 2610 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
/dev/sdd1 1 200 1606468+ 83 Linux
/dev/sdd2 201 400 1606500 83 Linux
/dev/sdd3 401 600 1606500 83 Linux
/dev/sdd4 601 2610 16145325 5 Extended
/dev/sdd5 601 800 1606468+ 83 Linux
/dev/sdd6 801 1000 1606468+ 83 Linux
/dev/sdd7 1001 1200 1606468+ 83 Linux
/dev/sdd8 1201 1400 1606468+ 83 Linux
/dev/sdd9 1600 1800 1614532+ 83 Linux
/dev/sdd10 1401 1599 1598404+ 83 Linux
/dev/sdd11 1801 2000 1606468+ 83 Linux
/dev/sdd12 2001 2200 1606468+ 83 Linux
/dev/sdd13 2201 2400 1606468+ 83 Linux
/dev/sdd14 2401 2610 1686793+ 83 Linux
Partition table entries are not in disk order
2、创建物理卷
[root@node1 ~]# pvcreate /dev/sdd*
Device /dev/sdd not found (or ignored by filtering).
Physical volume “/dev/sdd1” successfully created
Physical volume “/dev/sdd10” successfully created
Physical volume “/dev/sdd11” successfully created
Physical volume “/dev/sdd12” successfully created
Physical volume “/dev/sdd13” successfully created
Physical volume “/dev/sdd14” successfully created
Physical volume “/dev/sdd2” successfully created
Physical volume “/dev/sdd3” successfully created
Device /dev/sdd4 not found (or ignored by filtering).
Physical volume “/dev/sdd5” successfully created
Physical volume “/dev/sdd6” successfully created
Physical volume “/dev/sdd7” successfully created
Physical volume “/dev/sdd8” successfully created
Physical volume “/dev/sdd9” successfully created
3、删除物理卷
[root@node1 ~]# pvremove /dev/sdd1
Labels on physical volume “/dev/sdd1” successfully wiped
4、添加物理卷
[root@node1 ~]# pvcreate /dev/sdd1
Physical volume “/dev/sdd1” successfully created
5、查看物理卷信息
[root@node1 ~]# pvscan
PV /dev/sdd1 lvm2 [1.53 GB]
PV /dev/sdd2 lvm2 [1.53 GB]
PV /dev/sdd3 lvm2 [1.53 GB]
PV /dev/sdd5 lvm2 [1.53 GB]
PV /dev/sdd6 lvm2 [1.53 GB]
PV /dev/sdd7 lvm2 [1.53 GB]
PV /dev/sdd8 lvm2 [1.53 GB]
PV /dev/sdd9 lvm2 [1.54 GB]
PV /dev/sdd10 lvm2 [1.52 GB]
PV /dev/sdd11 lvm2 [1.53 GB]
PV /dev/sdd12 lvm2 [1.53 GB]
PV /dev/sdd13 lvm2 [1.53 GB]
PV /dev/sdd14 lvm2 [1.61 GB]
Total: 13 [19.99 GB] / in use: 0 [0 ] / in no VG: 13 [19.99 GB]
6、查看物理卷详细参数
root@node1 ~]# pvdisplay /dev/sdd5
“/dev/sdd5” is a new physical volume of “1.53 GB”
— NEW Physical volume —
PV Name /dev/sdd5
VG Name
PV Size 1.53 GB
Allocatable NO
PE Size (KByte) 0
Total PE 0
Free PE 0
Allocated PE 0
PV UUID 7Ms1jT-wxUI-2d7l-h24c-t17F-J9vL-fz2zHM
二、卷组的创建管理
1、创建卷组
[root@node1 ~]# vgcreate -s 8M xifenfei /dev/sdd{1,2,3,5,6,7,8,9,10,11,12,13,14}
Volume group “xifenfei” successfully created
格式:vgcreate [-s <8M|16M|...>] xifenfei /dev/sdb{1,2,3…}
-s:指定扩展块(PE)的大小,默认为4MB;(相当与磁盘上的簇,扩展逻辑卷的基本单位。 后面的值可以是8M 16M 32M 64M …..最多65532个扩展块。
xifenfei:新创建的卷组的名字
2、查看卷组信息
[root@node1 ~]# vgscan
Reading all physical volumes. This may take a while…
Found volume group “xifenfei” using metadata type lvm2
[root@node1 ~]# pvscan
PV /dev/sdd1 VG xifenfei lvm2 [1.53 GB / 1.53 GB free]
PV /dev/sdd2 VG xifenfei lvm2 [1.53 GB / 1.53 GB free]
PV /dev/sdd3 VG xifenfei lvm2 [1.53 GB / 1.53 GB free]
PV /dev/sdd5 VG xifenfei lvm2 [1.53 GB / 1.53 GB free]
PV /dev/sdd6 VG xifenfei lvm2 [1.53 GB / 1.53 GB free]
PV /dev/sdd7 VG xifenfei lvm2 [1.53 GB / 1.53 GB free]
PV /dev/sdd8 VG xifenfei lvm2 [1.53 GB / 1.53 GB free]
PV /dev/sdd9 VG xifenfei lvm2 [1.54 GB / 1.54 GB free]
PV /dev/sdd10 VG xifenfei lvm2 [1.52 GB / 1.52 GB free]
PV /dev/sdd11 VG xifenfei lvm2 [1.53 GB / 1.53 GB free]
PV /dev/sdd12 VG xifenfei lvm2 [1.53 GB / 1.53 GB free]
PV /dev/sdd13 VG xifenfei lvm2 [1.53 GB / 1.53 GB free]
PV /dev/sdd14 VG xifenfei lvm2 [1.60 GB / 1.60 GB free]
Total: 13 [19.98 GB] / in use: 13 [19.98 GB] / in no VG: 0 [0 ]
3、查看卷组的详细参数
[root@node1 ~]# vgdisplay /dev/xifenfei
— Volume group —
VG Name xifenfei
System ID
Format lvm2
Metadata Areas 13
Metadata Sequence No 1
VG Access read/write
VG Status resizable
MAX LV 0
Cur LV 0
Open LV 0
Max PV 0
Cur PV 13
Act PV 13
VG Size 19.98 GB
PE Size 8.00 MB
Total PE 2557
Alloc PE / Size 0 / 0
Free PE / Size 2557 / 19.98 GB
VG UUID dGa5e5-Jjef-GhYN-jpP0-EHMj-pAK1-Y9jJG2
4、删除卷组中物理卷
[root@node1 ~]# vgreduce xifenfei /dev/sdd3
Removed “/dev/sdd3” from volume group “xifenfei”
5、添加物理卷到卷组中
[root@node1 ~]# vgextend xifenfei /dev/sdd3
Volume group “xifenfei” successfully extended
6、删除卷组
[root@node1 ~]# vgremove /dev/xifenfei
Volume group “xifenfei” successfully removed
四、逻辑卷的创建及管理
1、创建逻辑卷
1.1)
[root@node1 ~]# lvcreate -L 2G -n data xifenfei
Logical volume “data” created
-L 2G:设置逻辑卷的大小为512M。
-n data:设置逻辑卷的名字为data ;表示法:/dev/卷组名/data
xifenfei :设置是有那个卷组生成的逻辑卷
1.2)
[root@node1 ~]# lvcreate -l 128 -n xff xifenfei
Logical volume “xff” created
-l 128:生成的逻辑卷使用128个扩展块,逻辑卷大小=32*PE(默认4M,本处为8M)则为1024M
-n xff:逻辑卷的名字;
xifenfei :设置是有那个卷组生成的逻辑卷
2、删除逻辑卷
[root@node1 ~]# lvremove /dev/xifenfei/xff
Do you really want to remove active logical volume “xff”? [y/n]: y
Logical volume “xff” successfully removed
3、查看逻辑卷信息
[root@node1 ~]# lvscan
ACTIVE ‘/dev/xifenfei/data’ [2.00 GB] inherit
ACTIVE ‘/dev/xifenfei/xff’ [1.00 GB] inherit
4、逻辑卷详细信息
[root@node1 ~]# lvdisplay /dev/xifenfei/data
— Logical volume —
LV Name /dev/xifenfei/data
VG Name xifenfei
LV UUID 84pp2v-GnfP-X3cL-a3fj-q3Cs-FDjh-i2d02u
LV Write Access read/write
LV Status available
# open 0
LV Size 2.00 GB
Current LE 256
Segments 2
Allocation inherit
Read ahead sectors auto
– currently set to 256
Block device 253:0
5、格式化逻辑卷
[root@node1 ~]# mkfs.ext3 /dev/xifenfei/data
mke2fs 1.35 (28-Feb-2004)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
262144 inodes, 524288 blocks
26214 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=536870912
16 block groups
32768 blocks per group, 32768 fragments per group
16384 inodes per group
Superblock backups stored on blocks:
32768, 98304, 163840, 229376, 294912
Writing inode tables: done
Creating journal (8192 blocks): done
Writing superblocks and filesystem accounting information: done
This filesystem will be automatically checked every 29 mounts or
180 days, whichever comes first. Use tune2fs -c or -i to override.
6、挂载逻辑卷
[root@node1 ~]# mkdir /data
[root@node1 ~]# mount /dev/xifenfei/data /data
[root@node1 ~]# df -h
Filesystem 容量 已用 可用 已用% 挂载点
/dev/sda2 18G 9.3G 7.5G 56% /
/dev/sda1 99M 13M 82M 14% /boot
none 395M 0 395M 0% /dev/shm
/dev/drbd0 1.6G 656M 810M 45% /opt/mysql
/dev/mapper/xifenfei-data
2.0G 36M 1.9G 2% /data
7、增大逻辑卷大小(因为减小实际情况中很少用,所以不做实验lvreduce)
[root@node1 ~]# lvextend -L 5.5G /dev/xifenfei/data
Extending logical volume data to 5.50 GB
Logical volume data successfully resized
[root@node1 ~]# resize2fs /dev/xifenfei/data
resize2fs 1.35 (28-Feb-2004)
/dev/xifenfei/data is mounted; can’t resize a mounted filesystem!
[root@node1 ~]# umount /data
[root@node1 ~]# resize2fs /dev/xifenfei/data
resize2fs 1.35 (28-Feb-2004)
Please run ‘e2fsck -f /dev/xifenfei/data’ first.
[root@node1 ~]# e2fsck -f /dev/xifenfei/data
e2fsck 1.35 (28-Feb-2004)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
/dev/xifenfei/data: 11/655360 files (9.1% non-contiguous), 29800/1310720 blocks
[root@node1 ~]# resize2fs /dev/xifenfei/data
resize2fs 1.35 (28-Feb-2004)
Resizing the filesystem on /dev/xifenfei/data to 1441792 (4k) blocks.
The filesystem on /dev/xifenfei/data is now 1441792 blocks long.
[root@node1 ~]# mount /dev/xifenfei/data /data
[root@node1 ~]# df -h
Filesystem 容量 已用 可用 已用% 挂载点
/dev/sda2 18G 9.3G 7.5G 56% /
/dev/sda1 99M 13M 82M 14% /boot
none 395M 0 395M 0% /dev/shm
/dev/drbd0 1.6G 656M 810M 45% /opt/mysql
/dev/mapper/xifenfei-data
5.5G 37M 5.3G 1% /data
8、删除逻辑卷
[root@node1 ~]# lvremove /dev/xifenfei/xff
Do you really want to remove active logical volume “xff”? [y/n]: y
Logical volume “xff” successfully removed

DRBD配置说明

一、配置简介
DRBD运行时,会读取一个配置文件/etc/drbd.conf.这个文件里描述了DRBD设备与硬盘分区的映射关系
1、全局配置项(global)
基本上我们可以做的也就是配置usage-count是yes还是no了,usage-count参数其实只是为了让linbit公司收集目前drbd的使用情况。当drbd在安装和升级的时候会通过http协议发送信息到linbit公司的服务器上面
2、公共配置项(common)
这里的common,指的是drbd所管理的多个资源之间的common。配置项里面主要是配置drbd的所有resource可以设置为相同的参数项,比如protocol,syncer等等
3、资源配置项(resource)
resource项中配置的是drbd所管理的所有资源,包括节点的ip信息,底层存储设备名称,设备大小,meta信息存放方式,drbd对外提供的设 备名等等。每一个resource中都需要配置在每一个节点的信息,而不是单独本节点的信息。实际上,在drbd的整个集群中,每一个节点上面的 drbd.conf文件需要是完全一致的
4、另外,resource还有很多其他的内部配置项:
(1)net:网络配置相关的内容,可以设置是否允许双主节点(allow-two-primaries)等。
(2)startup:启动时候的相关设置,比如设置启动后谁作为primary(或者两者都是primary:become-primary-on both)
(3)syncer:同步相关的设置。
可以设置“重新”同步(re-synchronization)速度(rate)设置,也可以设置是否在线校验节点之间的 数据一致性(verify-alg 检测算法有md5,sha1以及crc32等)。数据校验可能是一个比较重要的事情,在打开在线校验功能后,我们可以通过相关命令(drbdadm verify resource_name)来启动在线校验。在校验过程中,drbd会记录下节点之间不一致的block,但是不会阻塞任何行为,即使是在该不一致的 block上面的io请求。当不一致的block发生后,drbd就需要有re-synchronization动作,而syncer里面设置的rate 项,主要就是用于re-synchronization的时候,因为如果有大量不一致的数据的时候,我们不可能将所有带宽都分配给drbd做re-synchronization,这样会影响对外提提供服务。rate的设置和还需要考虑IO能力的影响。如果我们会有一个千兆网络出口,但是我们的磁盘 IO能力每秒只有50M,那么实际的处理能力就只有50M,一般来说,设置网络IO能力和磁盘IO能力中最小者的30%的带宽给re-synchronization是比较合适的(官方说明)。另外,drbd还提供了一个临时的rate更改命令,可以临时性的更改syncer的rate 值:drbdsetup /dev/drbd0 syncer -r 100M。这样就临时的设置了re-synchronization的速度为100M。不过在re-synchronization结束之后,你需要通过 drbdadm adjust resource_name 来让drbd按照配置中的rate来工作
二、资源管理
1、增加resource的大小
当遇到我们的drbd resource设备容量不够的时候,而且我们的底层设备支持在线增大容量的时候(比如使用lvm的情况下),我们可以先增大底层设备的大小,然后再通过drbdadm resize resource_name来实现对resource的扩容。但是这里有一点需要注意的就是只有在单primary模式下可以这样做,而且需要先在所有节点上都增大底层设备的容量。然后仅在primary节点上执行resize命令。在执行了resize命令后,将触发一次当前primary节点到其他所有secondary节点的re-synchronization
如果我们在drbd非工作状态下对底层设备进行了扩容,然后再启动drbd,将不需要执行resize命令(当然前提是在配置文件中没有对disk参数项指定大小),drbd自己会知道已经增大了容量
在进行底层设备的增容操作的时候千万不要修改到原设备上面的数据,尤其是drbd的meta信息,否则有可能毁掉所有数据
2、收缩resource容量
容量收缩比扩容操作要危险得多,因为该操作更容易造成数据丢失。在收缩resource的容量之前,必须先收缩drbd设备之上的容量,也就是文件系统的大小。如果上层文件系统不支持收缩,那么resource也没办法收缩容量。
如果在配置drbd的时候将meta信息配置成internal的,那么在进行容量收缩的时候,千万别只计算自身数据所需要的空间大小,还要将drbd的meta信息所需要的空间大小加上。
当文件系统收缩好以后,就可以在线通过以下命令来重设resource的大小: drbdadm –size=***G resize resource_name。在收缩的resource的大小之后,你就可以自行收缩释放底层设备空间(如果支持的话)
如果打算停机状态下收缩容量,可以通过以下步骤进行:
(1)在线收缩文件系统
(2)停用drbd的resource:drbdadm down resourcec_name
(3)导出drbd的metadata信息(在所有节点都需要进行):drbdadm dump-md resource_name > /path_you_want_to_save/file_name
(4)在所有节点收缩底层设备
(5)更改上面dump出来的meta信息的la-size-sect项到收缩后的大小(是换算成sector的数量后的数值)
(6)如果使用的是internal来配置meta-data信息,则需要重新创建meta-data:drbdadm create-md resource_name
(7)将之前导出并修改好的meta信息重新导入drbd(摘录自linbit官方网站的一段导入代码): drbdmeta_cmd=$(drbdadm -d dump-md test-disk)
${drbdmeta_cmd/dump-md/restore-md} /path_you_want_to_save/file_name
(8)启动resource:drbdadm up resource_name
三、磁盘损坏
1、detach resource(分离资源)
如果在resource的disk配置项中配置了on_io_error为pass_on的话,那么drbd在遇到磁盘损坏后不会自己detach底层设备。也就是说需要我们手动执行detach的命令(drbdadm detach resource_name),然后再查看当前各节点的ds信息。可以通过cat /proc/drbd来查看,也可以通过专有命令来查看:drbdadm dstat resource_name。当发现损坏的那方已经是Diskless后,即可。如果我们没有配置on_io_error或者配置成detach的话,那 么上面的操作将会由自动进行。
另外,如果磁盘损坏的节点是当前主节点,那么我们需要进行节点切换的操作后再进行上面的操作
2、更换磁盘
当detach了resource之后,就是更换磁盘了。如果我们使用的是internal的meta-data,那么在换好磁盘后,只需要重新创建 mata-data (drbdadm create-md resource_name),再将resource attach上 (drbdadm attach resource_name),然后drbd就会马上开始从当前primary节点到本节点的re-synchronisation。数据同步的实时状况 可以通过 /proc/drbd文件的内容获得
不过,如果我们使用的不是internal的meta-data保存方式,也就是说我们的meta-data是保存在resource之外的地方的。那么 我们在完成上面的操作(重建meta-data)之后,还需要进行一项操作来触发re-synchnorisation,所需命令为:drbdadm invalidate resource_name
3、节点crash(或计划内维护)
(1)secondary节点
如果是secondary接待你crash,那么primary将临时性的与secondary断开连接,cs状态应该会变成WFConnection, 也就是等待连接的状态。这时候primary会继续对外提供服务,并在meta-data里面记录下从失去secondary连接后所有变化过的 block的信息。当secondary重新启动并连接上primary后,primary –> secondary的re-synchnorisation会自动开始。不过在re-synchnorisation过程中,primary和 secondary的数据是不一致状态的。也就是说,如果这个时候primary节点也crash了的话,secondary是没办法切换成 primary的。也就是说,如果没有其他备份的话,将丢失所有数据
(2)primary节点
一般情况下,primary的crash和secondary的crash所带来的影响对drbd来说基本上是差不多的。唯一的区别就是需要多操作一步将 secondary节点switch成primary节点先对外提供服务。这个switch的过程drbd自己是不会完成的,需要我们人为干预进行一些操作才能完成。当crash的原primary节点修复并重新启动连接到现在的primary后,会以secondary存在,并开始re-synchnorisation这段时间变化的数据
在primary节点crash的情况下,drbd可以保证同步到原secondary的数据的一致性,这样就避免了当primary节点crash之后,secondary因为数据的不一致性而无法wcitch成primary或者即使切换成primary后因为不一致的数据无法提供正常的服务的问题
4、节点永久性损坏(需要更换机器或重新安装相关软件的情况)
当某一个节点因为硬件(或软件)的问题,导致某一节点已经无法再轻易修复并提供服务,也就是说我们所面对的是需要更换主机(或从OS层开始重新安装)的问题。在遇到这样的问题后,我们所需要做的是重新提供一台和原节点差不多的机器,重新开始安装os,安装相关软件,从现有整提供服务的节点上copy出 drbd的配置文件(/etc/drbd.conf),创建meta-data信息,然后启动drbd服务,以一个secondary的身份连接到现有的 primary上面,后面就会自动开始re-synchnorisation
5、split brain的处理
split brain实际上是指在某种情况下,造成drbd的两个节点断开了连接,都以primary的身份来运行。当drbd某primary节点连接对方节点准 备发送信息的时候如果发现对方也是primary状态,那么会会立刻自行断开连接,并认定当前已经发生split brain了,这时候他会在系统日志中记录以下信息:“Split-Brain detected,dropping connection!”当发生split brain之后,如果查看连接状态,其中至少会有一个是StandAlone状态,另外一个可能也是StandAlone(如果是同时发现split brain状态),也有可能是WFConnection的状态
如果我们在配置文件中配置了自动解决split brain(好像linbit不推荐这样做),drbd会自行解决split brain问题,具体解决策略是根据配置中的设置来进行的
如果没有配置split brain自动解决方案,我们可以手动解决。首先我们必须要确定哪一边应该作为解决问题后的primary,一旦确定好这一点,那么我们同时也就确定接受 丢失在split brain之后另外一个节点上面所做的所有数据变更了。当这些确定下来后,我们就可以通过以下操作来恢复了:
(1)首先在确定要作为secondary的节点上面切换成secondary并放弃该资源的数据:
drbdadm secondary resource_name
drbdadm — –discard-my-data connect resource_name
(2)在要作为primary的节点重新连接secondary(如果这个节点当前的连接状态为WFConnection的话,可以省略)
drbdadm connect resource_name
当作完这些动作之后,从新的primary到secondary的re-synchnorisation会自动开始
四、meta data存放地点的比较
1、internal meta-data(meta-data和数据存放在同一个底层设备之上)
(1)优点:一旦meta-data创建之后,就和实际数据绑在了一起,在维护上会更简单方便,不用担心meta-data会因为某些操作而丢失。另外在硬盘损坏丢失数据的同时,meta-data也跟着一起丢失,当更换硬盘之后,只需要执行重建meta-data的命令即可,丢失的数据会很容易的从其他节点同步过来
(2)缺点:如果底层设备是单一的磁盘,没有做raid,也不是lvm等,那么可能会造成性能影响。因为每一次写io都需要更新meta-data里面的信息, 那么每次写io都会有两次,而且肯定会有磁头的较大寻道移动,因为meta-data都是记录在dice设备的最末端的,这样就会造成写io的性能降低。
2、external meta data(meta-data存放在独立的,与存放数据的设备分开的设备之上)
(1)优点:与internal meta-data的缺点完全相对,可以解决写io的争用问题
(2)缺点:由于meta-data存放在与数据设备分开的地方,就意味着当磁盘损坏更换磁盘之后,必须手动发起全量同步的操作。