Beware of a system bug: Linux bug 765720 on E5, E5 v2, and E7 v2 CPUs

Tonight a friend in the group brought up a Linux 6 bug: once a system has accumulated more than 200 days of hardware uptime and is then warm-rebooted, the bug may be triggered within minutes to hours, leaving the system in an abnormal state.

The conditions for being affected are:
Red Hat Enterprise Linux 6.1 (kernel-2.6.32-131.26.1.el6 and newer)
Red Hat Enterprise Linux 6.2 (kernel-2.6.32-220.4.2.el6 and newer)
Red Hat Enterprise Linux 6.3 (kernel-2.6.32-279 series)
Red Hat Enterprise Linux 6.4 (kernel-2.6.32-358 series)
Any Intel® Xeon® E5, Intel® Xeon® E5 v2, or Intel® Xeon® E7 v2 series processor
As the list shows, the problem mainly affects Red Hat 6.1-6.4 running on E5, E5 v2, and E7 v2 CPUs; it is fixed in 6.5 (for details see bug 765720).
As for Oracle Linux: with the Red Hat-compatible kernel the impact is identical to Red Hat's, while the Unbreakable Enterprise Kernel received the fix in release 6.2.
A similar MOS article: Oracle Linux 6 RHCK system hang: processes blocked in ext4_file_open(), pick_next_task_fair()

Additional notes:
1. The bug does not exist in Red Hat/OEL 5.x.
2. It can occur on both 32-bit and 64-bit systems.
3. Since the bug cannot be fixed in the short term, if it does fire, consider a cold restart of the host as a temporary workaround (see the exposure-check sketch below).
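
A quick way to gauge a host's exposure is to compare its kernel against the affected 2.6.32 series above and look at its uptime. This is only a rough sketch of my own, not from the advisory: the kernel patterns paraphrase the list above (and may over-match early builds within a series), and OS uptime is only a lower bound on hardware uptime, since a warm reboot resets the former but not the latter.

#!/bin/bash
# Rough exposure check for bug 765720 (a sketch; patterns follow the list above).
kernel=$(uname -r)
up_days=$(awk '{print int($1/86400)}' /proc/uptime)
cpu=$(grep -m1 'model name' /proc/cpuinfo | cut -d: -f2)
echo "kernel=$kernel os_uptime_days=$up_days cpu=$cpu"
case "$kernel" in
  2.6.32-131.*|2.6.32-220.*|2.6.32-279*|2.6.32-358*)
    # OS uptime is only a lower bound: a warm reboot resets it, hardware uptime persists.
    [ "$up_days" -ge 200 ] && echo "WARNING: likely exposed -- prefer a cold restart"
    ;;
  *) echo "kernel is not in the affected 2.6.32 series" ;;
esac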

One more reminder: the choice of OS version matters. When selecting a Linux release, try to steer clear of this bug (EL kernel 6.5 and later, UEK kernel 6.2 and later). My personal preference: for deploying Oracle Database on a Red Hat-family Linux, I lean toward OEL (less hassle, and I trust Oracle on this).

Notes on recovering data files deleted with rm -rf

A departure turned sour: after a system engineer left the company, his VPN access was not revoked in time and the account passwords on the database server (Linux 6.5, Oracle 11.2.0.1) were not changed, so he logged in and ran rm -rf on ORACLE_BASE. Unfortunately the ORADATA directory was inside it, and worse still, so were all the data files. In other words the database was wiped out completely, with no backup of any kind. A friend asked me for help, and we got lucky: nothing had been overwritten and the inodes were intact, so extundelete recovered every data file, control file, and redo log (see "Recovering deleted Linux files with extundelete" below), the database opened cleanly, and we achieved zero loss. As recoveries go, close to perfect.

[root@DB1 tmp]# tar xvf extundelete-0.2.4.tar
extundelete-0.2.4/
extundelete-0.2.4/acinclude.m4
extundelete-0.2.4/missing
extundelete-0.2.4/autogen.sh
extundelete-0.2.4/aclocal.m4
extundelete-0.2.4/configure
extundelete-0.2.4/LICENSE
extundelete-0.2.4/README
extundelete-0.2.4/install-sh
extundelete-0.2.4/config.h.in
extundelete-0.2.4/src/
extundelete-0.2.4/src/extundelete.cc
extundelete-0.2.4/src/block.h
extundelete-0.2.4/src/kernel-jbd.h
extundelete-0.2.4/src/insertionops.cc
extundelete-0.2.4/src/block.c
extundelete-0.2.4/src/cli.cc
extundelete-0.2.4/src/extundelete-priv.h
extundelete-0.2.4/src/extundelete.h
extundelete-0.2.4/src/jfs_compat.h
extundelete-0.2.4/src/Makefile.in
extundelete-0.2.4/src/Makefile.am
extundelete-0.2.4/configure.ac
extundelete-0.2.4/depcomp
extundelete-0.2.4/Makefile.in
extundelete-0.2.4/Makefile.am
[root@DB1 tmp]# cd extundelete-0.2.4
[root@DB1 extundelete-0.2.4]# ./configure
Configuring extundelete 0.2.4
Writing generated files to disk
[root@DB1 extundelete-0.2.4]# make && make install
make -s all-recursive
Making all in src
Making install in src
  /usr/bin/install -c extundelete '/usr/local/bin'
[root@DB1 extundelete-0.2.4]# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda3       244G   11G  221G   5% /
tmpfs            16G   72K   16G   1% /dev/shm
/dev/sda1       190M   62M  119M  35% /boot
/dev/sdb1       2.0T   71M  1.9T   1% /home
[root@DB1 extundelete-0.2.4]# umount /dev/sdb1
umount: /home: device is busy.
        (In some cases useful info about processes that use
         the device is found by lsof(8) or fuser(1))
[root@DB1 extundelete-0.2.4]# fuser -m -u /home
/home:                3914c(oracle)  8372c(oracle)
[root@DB1 extundelete-0.2.4]# kill -9 3914
[root@DB1 extundelete-0.2.4]# fuser -m -u /home
/home:                8372c(oracle)
[root@DB1 extundelete-0.2.4]# kill -9 8372
[root@DB1 extundelete-0.2.4]# fuser -m -u /home
[root@DB1 extundelete-0.2.4]# umount /dev/sdb1
[root@DB1 extundelete-0.2.4]# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda3       244G   11G  221G   5% /
tmpfs            16G   72K   16G   1% /dev/shm
/dev/sda1       190M   62M  119M  35% /boot
[root@DB1 extundelete-0.2.4]# extundelete /dev/sdb1 --restore-all
NOTICE: Extended attributes are not restored.
Loading filesystem metadata ... 16384 groups loaded.
Loading journal descriptors ... 26542 descriptors loaded.
Searching for recoverable inodes in directory / ...
18896 recoverable inodes found.
Looking through the directory structure for deleted files ...
2 recoverable inodes still lost.
Unable to restore inode 43778050 (file.43778050): Space has been reallocated.
[root@DB1 extundelete-0.2.4]# ls
acinclude.m4  autogen.sh  config.h.in  config.status  configure.ac  install-sh  Makefile     Makefile.in
aclocal.m4    config.h    config.log   configure      depcomp       LICENSE     Makefile.am  missing
[root@DB1 extundelete-0.2.4]# cd RECOVERED_FILES/
[root@DB1 RECOVERED_FILES]# ls
app  file.43778051  oracle  oraInventory
[root@DB1 RECOVERED_FILES]# cd app
[root@DB1 app]# ls
admin  cfgtoollogs  diag  oracle  oradata  orcl  ORCL
[root@DB1 app]# cd oradata
[root@DB1 oradata]# ls
orcl
[root@DB1 oradata]# cd orcl
[root@DB1 orcl]# ls
control01.ctl  redo01.log  redo02.log  redo03.log  sysaux01.dbf  system01.dbf  undotbs01.dbf  users01.dbf
[root@DB1 orcl]# ls -ltr
total 2908776
-rw-r--r--. 1 root root  734011392 Nov 18 02:06 system01.dbf
-rw-r--r--. 1 root root 1069555712 Nov 18 02:06 sysaux01.dbf
-rw-r--r--. 1 root root  120594432 Nov 18 02:06 undotbs01.dbf
-rw-r--r--. 1 root root  887365632 Nov 18 02:06 users01.dbf
-rw-r--r--. 1 root root    9748480 Nov 18 02:06 control01.ctl
-rw-r--r--. 1 root root   52429312 Nov 18 02:06 redo01.log
-rw-r--r--. 1 root root   52429312 Nov 18 02:06 redo02.log
-rw-r--r--. 1 root root   52429312 Nov 18 02:06 redo03.log
[root@DB1 orcl]#
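
Before attempting to open the database with the recovered files, it is worth checking that each datafile is internally consistent. A minimal sketch, not part of the original session, using Oracle's dbv utility (run as the oracle user or with $ORACLE_HOME/bin in PATH; the 8K block size is an assumption):

$ for f in *.dbf; do
>   dbv file="$f" blocksize=8192 | grep -i corrupt
> done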

Once again: database backups matter more than anything else. Guard against human disasters as well as natural ones, and I hope never to hear of a similar incident in this circle again.

Setting device owner and group with multipath

Many current Linux systems use the bundled multipath software for multipathing. In earlier releases we usually paired multipath with udev or with rc.local to get both path aggregation and the right permissions, but in Red Hat 5.3-5.11 multipath by itself can handle path aggregation, persistent device naming, and owner/group settings.
Operating system version

[root@rac1 dev]# uname -r
2.6.39-300.26.1.el5uek
[root@rac1 dev]# more /etc/issue
Oracle Linux Server release 5.9
Kernel \r on an \m

fdisk output

[root@rac1 dev]# fdisk -l
…………
Disk /dev/sdh: 134.2 GB, 134217728000 bytes
255 heads, 63 sectors/track, 16317 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
   Device Boot      Start         End      Blocks   Id  System
Disk /dev/sdi: 33.5 GB, 33554432000 bytes
64 heads, 32 sectors/track, 32000 cylinders
Units = cylinders of 2048 * 512 = 1048576 bytes
   Device Boot      Start         End      Blocks   Id  System

multipath packages
Check that the multipath-related packages are installed (they are installed by default in this release)

[root@rac1 dev]# rpm -aq|grep mapper
device-mapper-multipath-libs-0.4.9-56.0.3.el5
device-mapper-event-1.02.67-2.el5
device-mapper-1.02.67-2.el5
device-mapper-multipath-0.4.9-56.0.3.el5

Get the WWIDs

[root@rac1 dev]# /sbin/scsi_id -g -u -s /block/sdh
14f504e46494c45527049754962662d395751372d68356743
[root@rac1 dev]# /sbin/scsi_id -g -u -s /block/sdi
14f504e46494c4552484d486249782d464471382d354f4b58

Get the uid and gid

[root@rac1 dev]# id grid
uid=1100(grid) gid=54321(oinstall) groups=54321(oinstall),1020(asmadmin),1021(asmdba)

multipath.conf configuration

[root@rac1 dev]# vi /etc/multipath.conf
defaults {
user_friendly_names no
queue_without_daemon no
flush_on_last_del yes
max_fds max
}
blacklist {
devnode "^hd[a-z]"
devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
devnode "^cciss.*"
}
devices {
        device {
                vendor                  "OPNFILER "
                product                 "LUN"
                path_grouping_policy    group_by_prio
                features             "3 queue_if_no_path pg_init_retries 50"
                getuid_callout          "/sbin/scsi_id -g -u -s /block/%n"
                path_checker            tur
                path_selector           "round-robin 0"
                hardware_handler        "1 alua"
                failback               immediate
                rr_weight              uniform
                rr_min_io             128
        }
}
multipaths {
        multipath {
                wwid                    14f504e46494c45527049754962662d395751372d68356743  #wwid
                alias                   xifenfei128
                uid                     1100                                               #uid
                gid                     1020                                               #gid
        }
        multipath {
                wwid                    14f504e46494c4552484d486249782d464471382d354f4b58   #wwid
                alias                   xifenfei32
                uid                     1100                                                #uid
                gid                     1020                                                #gid
         }
}

Start multipath

[root@rac1 dev]# modprobe dm-multipath
[root@rac1 dev]# modprobe dm-round-robin
[root@rac1 dev]# chkconfig multipathd on
[root@rac1 dev]# service multipathd start
Starting multipathd daemon: [  OK  ]
[root@rac1 dev]# multipath -F
[root@rac1 dev]# multipath -v2
create: xifenfei128 (14f504e46494c45527049754962662d395751372d68356743) undef OPNFILER,VIRTUAL-DISK
size=125G features='0' hwhandler='0' wp=undef
`-+- policy='round-robin 0' prio=1 status=undef
  `- 3:0:0:9  sdh 8:112 undef ready  running
create: xifenfei32 (14f504e46494c4552484d486249782d464471382d354f4b58) undef OPNFILER,VIRTUAL-DISK
size=31G features='0' hwhandler='0' wp=undef
`-+- policy='round-robin 0' prio=1 status=undef
  `- 3:0:0:10 sdi 8:128 undef ready  running

Check the generated multipath devices
Note the device names, group, and owner

[root@rac1 dev]# ls -l /dev/mapper/xifenfei*
brw-rw---- 1 grid asmadmin 252, 2 Jan  7 21:21 /dev/mapper/xifenfei128
brw-rw---- 1 grid asmadmin 252, 3 Jan  7 21:21 /dev/mapper/xifenfei32
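
As an extra check, not shown in the original session, multipath -ll lists the full topology for each mapped device (alias, WWID, path groups, and path states):

[root@rac1 dev]# multipath -ll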

Addendum: setting owner and group with udev on Linux 6.x
On Linux 6.x, multipath can no longer set the disk owner, group, and permissions; udev can do it instead, with a configuration along these lines

[root@bxrac03 mapper]#cat 99-diskownership.rules
SUBSYSTEM!="block", GOTO="quickexit"
KERNEL!="dm-*", GOTO="quickexit"
PROGRAM=="/sbin/dmsetup info -c --noheadings -o name -m %m -j %M"
RESULT=="*ocr*", OWNER="grid", GROUP="oinstall", MODE="0660"
RESULT=="*oradata", OWNER="grid", GROUP="oinstall", MODE="0660"
RESULT=="*backup", OWNER="grid", GROUP="oinstall", MODE="0660"
LABEL="quickexit"

Here each RESULT pattern is matched against the dm alias.
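
To make the rule take effect without a reboot and confirm the ownership, reload the rules and re-trigger events for the dm devices. A short sketch (assuming the rules file lives in /etc/udev/rules.d):

[root@bxrac03 mapper]# udevadm control --reload-rules
[root@bxrac03 mapper]# udevadm trigger --subsystem-match=block --sysname-match='dm-*'
[root@bxrac03 mapper]# ls -l /dev/mapper/   # owner/group should now match the rules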

A shell script to obtain the extent distribution

Anyone who has dug into the dba_extents view knows that it gets its extent information not from ordinary base tables stored in the database but from x$ tables (which are built in memory at startup and do not live in the data files), which is why, when the database cannot start normally, we cannot directly tell whether a given block belongs to a given object. The extent information is, however, also recorded in the segment header block: dumping that block yields the rdba of each extent, which can then be converted into a file_id and block_id. The shell script below turns a segment header dump into output resembling dba_extents, making it easy to decide, in a database that will not open, whether a suspect block belongs to a particular table.

#!/bin/bash
# Convert a value (0x-prefixed hex or decimal) to a binary string.
# $val and $bin are deliberately global: the recursion doubles $base up to
# the most significant bit, then emits one bit per level on the way back.
dec2bin(){
  ((val=$1))
  local base=$2
  [ $val -eq 0 ] && bin=0
  if [ $val -ge $base ]; then
    dec2bin $val $((base*2))
    if [ $val -ge $base ]; then
      val=$(($val-$base))
      bin=${bin}1
    else
      bin=${bin}0
    fi
  fi
  [ $base -eq 1 ] && printf $bin
}
# Each "0x00402e10  length: 8" line in the segment header dump is one extent:
# its starting rdba and its length in blocks. In an rdba the high 10 bits
# hold the relative file number and the low 22 bits the block number.
for i in `grep "length:" $1 | awk '{print $1 $3}'`
do
  rdba=${i:0:10}
  blocks=${i:10}
  echo -n "rdba:"$rdba"    "
  bin2=`dec2bin $rdba 1`
  len=`expr length $bin2`
  len_gd=22                      # low 22 bits: block number
  len_jg=`expr $len - $len_gd`   # remaining high bits: file number
  file_no_2=${bin2:0:$len_jg}
  ((file_no=2#$file_no_2))
  echo -n "file_id:"$file_no"    "
  block_no_2=${bin2:$len_jg}
  ((block_no=2#$block_no_2))
  echo -n "block_id:"$block_no"    "
  echo "blocks:"$blocks
done
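
Since bash can do the bit arithmetic directly, the decoding is easy to spot-check with a one-liner; a sketch using the first rdba from the trace excerpt below:

$ printf "file_id:%d block_id:%d\n" $(( 0x00400901 >> 22 )) $(( 0x00400901 & 0x3FFFFF ))
file_id:1 block_id:2305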

Excerpt from the trace file

  -----------------------------------------------------------------
   0x00400901  length: 7
   0x00402e10  length: 8
   0x00402e60  length: 8
   0x00402e68  length: 8
   0x00402ea0  length: 8
   0x00402f20  length: 8
   0x00402f48  length: 8
   0x00403050  length: 8
   0x00403180  length: 8
   0x00403b38  length: 8
   0x00404c48  length: 8
   0x00404c78  length: 8
   0x00404cf8  length: 8
   0x00404da8  length: 8
   0x00404db8  length: 8
   0x00404de8  length: 8
   0x00404e80  length: 128
   0x00405900  length: 128
   0x00406500  length: 128
   0x00406980  length: 128
   0x00407480  length: 128
   0x00407500  length: 128
   0x00407680  length: 128
   0x00407800  length: 128
   0x00407880  length: 128
   0x00407a00  length: 128
   0x00407a80  length: 128
   0x00407c80  length: 128
…………

Script output

[oracle@xifenfei tmp]$ ./get_extent.sh /u01/app/oracle/diag/rdbms/cdb/cdb/trace/cdb_ora_29565.trc
rdba:0x00400901    file_id:1    block_id:2305     blocks:7
rdba:0x00402e10    file_id:1    block_id:11792     blocks:8
rdba:0x00402e60    file_id:1    block_id:11872     blocks:8
rdba:0x00402e68    file_id:1    block_id:11880     blocks:8
rdba:0x00402ea0    file_id:1    block_id:11936     blocks:8
rdba:0x00402f20    file_id:1    block_id:12064     blocks:8
rdba:0x00402f48    file_id:1    block_id:12104     blocks:8
rdba:0x00403050    file_id:1    block_id:12368     blocks:8
rdba:0x00403180    file_id:1    block_id:12672     blocks:8
rdba:0x00403b38    file_id:1    block_id:15160     blocks:8
rdba:0x00404c48    file_id:1    block_id:19528     blocks:8
rdba:0x00404c78    file_id:1    block_id:19576     blocks:8
rdba:0x00404cf8    file_id:1    block_id:19704     blocks:8
rdba:0x00404da8    file_id:1    block_id:19880     blocks:8
rdba:0x00404db8    file_id:1    block_id:19896     blocks:8
rdba:0x00404de8    file_id:1    block_id:19944     blocks:8
rdba:0x00404e80    file_id:1    block_id:20096     blocks:128
rdba:0x00405900    file_id:1    block_id:22784     blocks:128
rdba:0x00406500    file_id:1    block_id:25856     blocks:128
rdba:0x00406980    file_id:1    block_id:27008     blocks:128
rdba:0x00407480    file_id:1    block_id:29824     blocks:128
rdba:0x00407500    file_id:1    block_id:29952     blocks:128
rdba:0x00407680    file_id:1    block_id:30336     blocks:128
rdba:0x00407800    file_id:1    block_id:30720     blocks:128
…………

A new shell script for deleting Data Guard archived logs

I have written before about deleting Data Guard archived logs (see my earlier post on the subject), but that method was neither flexible nor simple enough. Here is the latest version of the archived-log deletion script, as recently deployed at a customer site.

#!/bin/sh
source ~/.bash_profile
# Every archived log the standby applies leaves a "Media Recovery Log" line
# in its alert log; turn those lines into rm commands.
grep "Media Recovery Log" $ORACLE_BASE/admin/$ORACLE_SID/bdump/alert_${ORACLE_SID}.log | \
  awk '{print $4}' | sed -e 's/^/rm /' > /tmp/rmarchlog.sh
chmod +x /tmp/rmarchlog.sh
/tmp/rmarchlog.sh
# Roll the alert log over so processed entries are not picked up again.
cd $ORACLE_BASE/admin/$ORACLE_SID/bdump
cat alert_${ORACLE_SID}.log >> alert_${ORACLE_SID}.log.bak
echo '' > alert_${ORACLE_SID}.log
rm -f /tmp/rmarchlog.sh
# Let RMAN drop the deleted files from its catalog ("YES" answers the prompt).
$ORACLE_HOME/bin/rman target / <<XIFENFEI
crosscheck archivelog all;
delete expired archivelog all;
YES
exit;
XIFENFEI

The script deletes archived logs based on the "Media Recovery Log" entries the standby writes to its alert log as it applies them, which guarantees that every archived log it removes has already been applied, while unapplied logs are left in place.
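
To run this regularly, schedule it from the oracle user's crontab. A sketch only: the script path and the daily 02:30 slot are my assumptions.

30 2 * * * /home/oracle/scripts/rm_dg_archlog.sh > /tmp/rm_dg_archlog.out 2>&1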

A shell script to monitor whether a Data Guard standby is applying logs

I have long looked for a way to monitor Data Guard that is both convenient and practical. A few candidate approaches:
1. Compare the alert logs of the primary and the standby; this requires ssh so that one node can fetch the other's alert log.
2. Query v$archived_log or related views on both sides and compare; this requires connecting to the other database and holding its credentials.
3. Query v$archived_log on the standby alone and make a rough judgment on whether Data Guard is healthy.
I chose option 3. Most of the time, DG monitoring exists so that someone notices log-apply problems quickly and intervenes, reducing the odds of having to resolve a gap or rebuild the standby. This check catches the large majority of archive-apply failures, so problems can be handled promptly, saving a great deal of work and sometimes your data.

#!/bin/bash
source ~/.bash_profile
# Alert when the most recently applied archived log is older than CHECK_M minutes.
export CHECK_M=120
export RESULT_FILE=/tmp/dg_switch_check.log
# Minutes since next_time of the highest applied sequence on the standby.
$ORACLE_HOME/bin/sqlplus -silent "/ as sysdba" <<XFF >/tmp/check_dg.log
set pagesize 0 feedback off verify off heading off echo off
select ceil((sysdate-next_time)*24*60) "M"
  from v\$archived_log
 where applied='YES' and sequence#=(select max(sequence#)
  from v\$archived_log where applied='YES');
exit;
XFF
GET_M=`cat /tmp/check_dg.log`
rm /tmp/check_dg.log
if [ ${CHECK_M} -lt ${GET_M} ]; then
    echo "check dataguard time: `date`" > $RESULT_FILE
    echo "The last archived log was applied $GET_M minutes ago" >> $RESULT_FILE
else
    echo '' > $RESULT_FILE
fi

Set CHECK_M, the allowed gap between the last log apply and the current time, according to how frequently your primary switches logs. Because the script writes its verdict to a result file, a monitoring framework can inspect that file and raise an email, SMS, or other alert.
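
A follow-up sketch (the schedule, script path, and mail address are all assumptions): run the check every 10 minutes from cron and mail the result file whenever it reports a lag:

*/10 * * * * /home/oracle/scripts/check_dg.sh; grep -q applied /tmp/dg_switch_check.log && mail -s "DG apply lag" dba@example.com < /tmp/dg_switch_check.log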

Fixing a system that cannot boot because of a bad /etc/fstab

If your luck is bad, even routine experiments go wrong: tonight I mis-edited /etc/fstab again and the system would not boot. I did not take screenshots while fixing it (the illustrative images originally attached were found online afterwards), so here is just an outline of the fix.

The boot fails while mounting filesystems; enter the root password to drop into repair filesystem mode.

An attempt to edit /etc/fstab fails because the root filesystem is mounted read-only.

Remount it read-write: mount -n -o remount,rw /

Edit /etc/fstab to remove the bad entry, then restart the system with init 6 or reboot.
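
To avoid a repeat, a freshly edited fstab can be dry-checked before rebooting: mount -a attempts each not-yet-mounted entry and reports errors without a reboot. A small sketch:

mount -n -o remount,rw /
vi /etc/fstab
mount -a      # any error printed here means fstab is still wrong; fix it before init 6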

Recovering deleted Linux files with extundelete

Today a friend in the group had a customer who, apparently with nothing better to do, rm-ed some data files. In the discussion that followed we used extundelete for the recovery and salvaged the files that had not yet been overwritten. The official site: http://extundelete.sourceforge.net/
Install extundelete

[root@xifenfei tmp]# bunzip2 extundelete-0.2.0.tar.bz2
[root@xifenfei tmp]# tar xvf extundelete-0.2.0.tar
extundelete-0.2.0/
extundelete-0.2.0/README
extundelete-0.2.0/acinclude.m4
extundelete-0.2.0/configure.ac
extundelete-0.2.0/aclocal.m4
extundelete-0.2.0/Makefile.am
extundelete-0.2.0/Makefile.in
extundelete-0.2.0/config.h.in
extundelete-0.2.0/configure
extundelete-0.2.0/compile
extundelete-0.2.0/depcomp
extundelete-0.2.0/install-sh
extundelete-0.2.0/missing
extundelete-0.2.0/LICENSE
extundelete-0.2.0/autogen.sh
extundelete-0.2.0/src/
extundelete-0.2.0/src/Makefile.am
extundelete-0.2.0/src/Makefile.in
extundelete-0.2.0/src/extundelete.cc
extundelete-0.2.0/src/block.c
extundelete-0.2.0/src/insertionops.cc
extundelete-0.2.0/src/block.h
extundelete-0.2.0/src/extundelete.h
extundelete-0.2.0/src/extundelete-priv.h
extundelete-0.2.0/src/jfs_compat.h
extundelete-0.2.0/src/kernel-jbd.h
[root@xifenfei tmp]# cd extundelete-0.2.0
[root@xifenfei extundelete-0.2.0]#  ./configure
Configuring extundelete 0.2.0
Writing generated files to disk
[root@xifenfei extundelete-0.2.0]# make && make install
make -s all-recursive
Making all in src
Making install in src
  /usr/bin/install -c 'extundelete' '/usr/local/bin/extundelete'

Recovery commands

--umount the partition, or remount it read-only (if umount fails, something like fuser -km /www will deal with the processes holding it)
 umount /dev/partition
 mount -o remount,ro /dev/partition
--recover files
extundelete /dev/partition --restore-all
extundelete /dev/partition --restore-directory /backup/gnutool-delete
extundelete /dev/partition --restore-files /etc/passwd

Test recoveries

--recovery by directory
[root@xifenfei tmp]#extundelete /dev/sdb1  --restore-directory /u01/xifenfei
Loading filesystem metadata ... 160 groups loaded.
Loading journal descriptors ... 23 descriptors loaded.
Writing output to directory RECOVERED_FILES/
Failed to restore file /u01/xifenfei
Could not find correct inode number past inode 2.
[root@xifenfei tmp]# cd RECOVERED_FILES/
[root@xifenfei RECOVERED_FILES]# ll
total 0
--failed
--recovery of the whole filesystem
[root@xifenfei RECOVERED_FILES]#extundelete /dev/sdb1  --restore-all
Loading filesystem metadata ... 160 groups loaded.
Loading journal descriptors ... 23 descriptors loaded.
Writing output to directory RECOVERED_FILES/
Searching for recoverable inodes in directory / ...
4 recoverable inodes found.
Looking through the directory structure for deleted files ...
Failed to restore inode 1966081 to file RECOVERED_FILES/xifenfei:Inode does not correspond to a regular file.
Restored inode 1966082 to file RECOVERED_FILES/xifenfei/xff.txt
Failed to restore inode 1966083 to file RECOVERED_FILES/xifenfei/xff:Inode does not correspond to a regular file.
Restored inode 1966084 to file RECOVERED_FILES/xifenfei/xff/xff.txt132
0 recoverable inodes still lost.
[root@xifenfei RECOVERED_FILES]# ll
total 4
drwxr-xr-x 3 root root 4096 Aug 31 21:36 RECOVERED_FILES
[root@xifenfei RECOVERED_FILES]# cd RECOVERED_FILES/
[root@xifenfei RECOVERED_FILES]# ll
total 4
drwxr-xr-x 3 root root 4096 Aug 31 21:36 xifenfei
[root@xifenfei RECOVERED_FILES]# cd xifenfei/
[root@xifenfei xifenfei]# ll
total 20
drwxr-xr-x 2 root root  4096 Aug 31 21:36 xff
-rw-r--r-- 1 root root 13231 Aug 31 21:36 xff.txt
[root@xifenfei xifenfei]# cd xff
[root@xifenfei xff]# ll
total 16
-rw-r--r-- 1 root root 13231 Aug 31 21:36 xff.txt132
--succeeded
--recovery by file
[root@xifenfei xff]# extundelete /dev/sdb1  --restore-files /u01/xifenfei/xff.txt
Loading filesystem metadata ... 160 groups loaded.
Loading journal descriptors ... 23 descriptors loaded.
Writing output to directory RECOVERED_FILES/
[root@xifenfei xff]# cd RECOVERED_FILES/
[root@xifenfei RECOVERED_FILES]# ll
total 0
--failed

These tests show that extundelete is still far from complete: whole-filesystem recovery is quite strong, while recovery by directory or by file remains weak. Even so, the tool can be a real firefighter, especially when some hapless operator has deleted part of a database's files.
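
When recovery by directory or by file fails, as it did above, one fallback is to recover by inode number, since the --restore-all run prints the inode of every candidate. A sketch reusing an inode from the output above (the recovered copy lands in RECOVERED_FILES/ under a file.<inode> name):

[root@xifenfei tmp]# extundelete /dev/sdb1 --restore-inode 1966082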

Main iscsiadm commands

Disks currently present

[root@xifenfei ~]# fdisk -l
Disk /dev/sda: 21.4 GB, 21474836480 bytes
255 heads, 63 sectors/track, 2610 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1        2355    18916506   83  Linux
/dev/sda2            2356        2610     2048287+  82  Linux swap / Solaris
Disk /dev/sdb: 21.4 GB, 21474836480 bytes
255 heads, 63 sectors/track, 2610 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1               1        2610    20964793+  83  Linux
Disk /dev/sdc: 2147 MB, 2147483648 bytes
255 heads, 63 sectors/track, 261 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk /dev/sdc doesn't contain a valid partition table
Disk /dev/sdd: 21.4 GB, 21474836480 bytes
255 heads, 63 sectors/track, 2610 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
   Device Boot      Start         End      Blocks   Id  System
/dev/sdd1               1         100      803218+  83  Linux
/dev/sdd2             101        1000     7229250   83  Linux

Check the iSCSI service status

[root@xifenfei ~]# rpm -aq|grep iscsi
iscsi-initiator-utils-6.2.0.872-10.0.1.el5
[root@xifenfei ~]#  chkconfig --list |grep iscsi
iscsi           0:off   1:off   2:on    3:on    4:on    5:on    6:off
iscsid          0:off   1:off   2:off   3:on    4:on    5:on    6:off
[root@xifenfei ~]# ps -ef|grep iscs
root      2753     2  0 Jun21 ?        00:00:00 [iscsi_eh]
root     15793     1  0 09:08 ?        00:00:00 brcm_iscsiuio
root     15800     1  0 09:08 ?        00:00:00 iscsid
root     15802     1  0 09:08 ?        00:00:00 iscsid
root     19533 15269  0 10:11 pts/1    00:00:00 grep iscs

Configure the iSCSI storage

[root@xifenfei ~]# iscsiadm -m discovery -t sendtargets -p 192.168.1.254:3260
192.168.1.254:3260,1 iqn.2006-01.com.openfiler:tsn.32b32087937b
[root@xifenfei ~]# iscsiadm -m node -T iqn.2006-01.com.openfiler:tsn.32b32087937b -p 192.168.1.254:3260 -l
Logging in to [iface: default, target: iqn.2006-01.com.openfiler:tsn.32b32087937b, portal: 192.168.1.254,3260]
Login to [iface: default, target: iqn.2006-01.com.openfiler:tsn.32b32087937b, portal: 192.168.1.254,3260] successful.
[root@xifenfei ~]# iscsiadm -m node -T iqn.2006-01.com.openfiler:tsn.32b32087937b -p 192.168.1.254:3260 \
> --op update -n node.startup -v automatic

Disks now present (the three new iSCSI LUNs appear as sde, sdf, and sdg)

[root@xifenfei ~]# fdisk -l
Disk /dev/sda: 21.4 GB, 21474836480 bytes
255 heads, 63 sectors/track, 2610 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1        2355    18916506   83  Linux
/dev/sda2            2356        2610     2048287+  82  Linux swap / Solaris
Disk /dev/sdb: 21.4 GB, 21474836480 bytes
255 heads, 63 sectors/track, 2610 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1               1        2610    20964793+  83  Linux
Disk /dev/sdc: 2147 MB, 2147483648 bytes
255 heads, 63 sectors/track, 261 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk /dev/sdc doesn't contain a valid partition table
Disk /dev/sdd: 21.4 GB, 21474836480 bytes
255 heads, 63 sectors/track, 2610 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
   Device Boot      Start         End      Blocks   Id  System
/dev/sdd1               1         100      803218+  83  Linux
/dev/sdd2             101        1000     7229250   83  Linux
Disk /dev/sde: 1073 MB, 1073741824 bytes
34 heads, 61 sectors/track, 1011 cylinders
Units = cylinders of 2074 * 512 = 1061888 bytes
Disk /dev/sde doesn't contain a valid partition table
Disk /dev/sdf: 1073 MB, 1073741824 bytes
34 heads, 61 sectors/track, 1011 cylinders
Units = cylinders of 2074 * 512 = 1061888 bytes
Disk /dev/sdf doesn't contain a valid partition table
Disk /dev/sdg: 1073 MB, 1073741824 bytes
34 heads, 61 sectors/track, 1011 cylinders
Units = cylinders of 2074 * 512 = 1061888 bytes
Disk /dev/sdg doesn't contain a valid partition table

Log out of the iSCSI storage

[root@xifenfei ~]# iscsiadm -m node --logoutall=all
Logging out of session [sid: 3, target: iqn.2006-01.com.openfiler:tsn.32b32087937b, portal: 192.168.1.254,3260]
Logout of [sid: 3, target: iqn.2006-01.com.openfiler:tsn.32b32087937b, portal: 192.168.1.254,3260] successful.
[root@xifenfei ~]# iscsiadm -m node --op delete --targetname iqn.2006-01.com.openfiler:tsn.32b32087937b

Disks after logout (the iSCSI LUNs are gone)

[root@xifenfei ~]# fdisk -l
Disk /dev/sda: 21.4 GB, 21474836480 bytes
255 heads, 63 sectors/track, 2610 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1        2355    18916506   83  Linux
/dev/sda2            2356        2610     2048287+  82  Linux swap / Solaris
Disk /dev/sdb: 21.4 GB, 21474836480 bytes
255 heads, 63 sectors/track, 2610 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1               1        2610    20964793+  83  Linux
Disk /dev/sdc: 2147 MB, 2147483648 bytes
255 heads, 63 sectors/track, 261 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk /dev/sdc doesn't contain a valid partition table
Disk /dev/sdd: 21.4 GB, 21474836480 bytes
255 heads, 63 sectors/track, 2610 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
   Device Boot      Start         End      Blocks   Id  System
/dev/sdd1               1         100      803218+  83  Linux
/dev/sdd2             101        1000     7229250   83  Linux

Summary of iSCSI operations

Adding iSCSI storage
(1) Discover targets: iscsiadm -m discovery -t st -p ISCSI_IP
(2) List discovery records: iscsiadm -m node
(3) Log in to a target: iscsiadm -m node -T LUN_NAME -p ISCSI_IP -l
(4) Log in automatically at boot: iscsiadm -m node -T LUN_NAME -p ISCSI_IP --op update -n node.startup -v automatic
Removing iSCSI storage
(1) Log out of a target: iscsiadm -m node -T LUN_NAME -p ISCSI_IP -u
(2) Log out of all sessions: iscsiadm -m node --logoutall=all
(3) Delete a discovery record: iscsiadm -m node -o delete -T LUN_NAME -p ISCSI_IP
Logging in to a target that requires CHAP authentication
(1) Enable authentication:
iscsiadm -m node -T LUN_NAME -o update --name node.session.auth.authmethod --value=CHAP
(note: -o is equivalent to --op)
(2) Set the username:
iscsiadm -m node -T LUN_NAME --op update --name node.session.auth.username --value=<username>
(3) Set the password:
iscsiadm -m node -T LUN_NAME --op update --name node.session.auth.password --value=<password>
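
Two further read-only checks, not part of the original list, that are often handy when debugging: listing the active sessions and dumping the stored settings of one node record:

iscsiadm -m session                          # list active iSCSI sessions
iscsiadm -m node -T LUN_NAME -p ISCSI_IP     # print the stored settings for one node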

A conjecture about Oracle user process memory usage on Linux

This post is a conjectural explanation of why Oracle user processes on Linux appear to occupy large amounts of memory; I would welcome further discussion from other experts.
Private memory used by an Oracle user process
top output

[oracle@ora02 31500]$ top -c
top - 12:13:16 up 254 days, 12:14,  2 users,  load average: 1.53, 1.62, 1.33
Tasks: 293 total,   3 running, 290 sleeping,   0 stopped,   0 zombie
Cpu(s):  3.4% us,  0.8% sy,  0.0% ni, 94.7% id,  1.1% wa,  0.0% hi,  0.0% si
Mem:   4147172k total,  4129724k used,    17448k free,    20348k buffers
Swap:  4192956k total,   217772k used,  3975184k free,  2575320k cached
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
12505 oracle    17   0 1561m 972m 970m R  100 24.0   3:45.89 oracletxzldb (LOCAL=NO)
12475 oracle    16   0 1561m 931m 929m S    4 23.0   4:08.65 oracletxzldb (LOCAL=NO)
12477 oracle    16   0 1561m 945m 942m S    2 23.3   3:36.92 oracletxzldb (LOCAL=NO)
12479 oracle    16   0 1561m 944m 941m S    2 23.3   4:27.29 oracletxzldb (LOCAL=NO)
12483 oracle    16   0 1561m 939m 937m R    2 23.2   3:49.00 oracletxzldb (LOCAL=NO)
12493 oracle    16   0 1561m 958m 955m S    2 23.7   3:35.35 oracletxzldb (LOCAL=NO)

Column meanings:

VIRT	total virtual memory used by the process
RES	resident (not swapped out) physical memory used by the process
SHR	shared memory size

Several figures follow for process 12505:
actual private memory: 972 - 970 = 2M
%MEM: 972*1024/4147172 = 0.24000162, i.e. the 24.0% shown by top
memory the process has allocated inside the database:

SQL> select PGA_ALLOC_MEM/1024/1024 MEM from v$process where spid=12505;
       MEM
----------
1.90997028

So process 12505 in fact occupies about 970M of shared memory and only about 2M of PGA.

Shared memory used by an Oracle user process
Examine the current status of process 12505

[oracle@ora02 31500]$ cd /proc/12505
[oracle@ora02 12505]$ cat status
Name:   oracle
State:  S (sleeping)
SleepAVG:       92%
Tgid:   12505
Pid:    12505
PPid:   1
TracerPid:      0
Uid:    501     501     501     501
Gid:    502     502     502     502
FDSize: 32
Groups: 501 502
VmSize:  1599004 kB    <-- total memory (incl. virtual): 1599004/1024 = 1561.5M, matching VIRT in top
VmLck:         0 kB
VmRSS:    996132 kB    <-- resident memory: 996132/1024 = 972.8M, matching RES in top
VmData:      832 kB
VmStk:       120 kB
VmExe:     37307 kB
VmLib:      4641 kB
StaBrk: 0ad6e000 kB
Brk:    0adf2000 kB
StaStk: bffff850 kB
ExecLim:        ffffffff
Threads:        1
SigPnd: 0000000000000000
ShdPnd: 0000000000000000
SigBlk: 0000000000000000
SigIgn: 0000000006005203
SigCgt: 00000001c9802cfc
CapInh: 0000000000000000
CapPrm: 0000000000000000
CapEff: 0000000000000000

pmap analysis

[oracle@ora02 12505]$ pmap -d 12505
12505:   oracletxzldb (LOCAL=NO)
Address   Kbytes Mode  Offset           Device    Mapping
0013f000      88 r-x-- 0000000000000000 008:00002 ld-2.3.4.so
00155000       4 r-x-- 0000000000015000 008:00002 ld-2.3.4.so
00156000       4 rwx-- 0000000000016000 008:00002 ld-2.3.4.so
00159000    1176 r-x-- 0000000000000000 008:00002 libc-2.3.4.so
0027f000       8 r-x-- 0000000000125000 008:00002 libc-2.3.4.so
00281000       8 rwx-- 0000000000127000 008:00002 libc-2.3.4.so
00283000       8 rwx-- 0000000000283000 000:00000   [ anon ]
00287000     132 r-x-- 0000000000000000 008:00002 libm-2.3.4.so
002a8000       4 r-x-- 0000000000020000 008:00002 libm-2.3.4.so
002a9000       4 rwx-- 0000000000021000 008:00002 libm-2.3.4.so
002ac000       8 r-x-- 0000000000000000 008:00002 libdl-2.3.4.so
002ae000       4 r-x-- 0000000000001000 008:00002 libdl-2.3.4.so
002af000       4 rwx-- 0000000000002000 008:00002 libdl-2.3.4.so
003b5000      56 r-x-- 0000000000000000 008:00002 libpthread-2.3.4.so
003c3000       4 r-x-- 000000000000d000 008:00002 libpthread-2.3.4.so
003c4000       4 rwx-- 000000000000e000 008:00002 libpthread-2.3.4.so
003c5000       8 rwx-- 00000000003c5000 000:00000   [ anon ]
00ba4000      72 r-x-- 0000000000000000 008:00002 libnsl-2.3.4.so
00bb6000       4 r-x-- 0000000000011000 008:00002 libnsl-2.3.4.so
00bb7000       4 rwx-- 0000000000012000 008:00002 libnsl-2.3.4.so
00bb8000       8 rwx-- 0000000000bb8000 000:00000   [ anon ]
08048000   37308 r-x-- 0000000000000000 0fd:00001 oracle
0a4b7000    8804 rwx-- 000000000246f000 0fd:00001 oracle
0ad50000     648 rwx-- 000000000ad50000 000:00000   [ anon ]
50000000 1540096 rwxs- 0000000000000000 000:00006   [ shmid=0x9000e ]
ae000000       4 r-xs- 000000005e000000 000:00006   [ shmid=0x9000e ]
ae001000    1156 rwxs- 000000005e001000 000:00006   [ shmid=0x9000e ]
ae122000       4 r-xs- 000000005e122000 000:00006   [ shmid=0x9000e ]
ae123000    2932 rwxs- 000000005e123000 000:00006   [ shmid=0x9000e ]
b79d4000    1024 rwx-- 00000000000f4000 000:0000d zero
b7ad4000     512 rwx-- 0000000000074000 000:0000d zero
b7b54000     512 rwx-- 0000000000000000 000:0000d zero
b7bd4000      36 r-x-- 0000000000000000 008:00002 libnss_files-2.3.4.so
b7bdd000       4 r-x-- 0000000000008000 008:00002 libnss_files-2.3.4.so
b7bde000       4 rwx-- 0000000000009000 008:00002 libnss_files-2.3.4.so
b7bdf000     148 rwx-- 00000000b7bdf000 000:00000   [ anon ]
b7c04000    2940 r-x-- 0000000000000000 0fd:00001 libjox9.so
b7ee3000    1088 rwx-- 00000000002de000 0fd:00001 libjox9.so
b7ff3000       8 rwx-- 00000000b7ff3000 000:00000   [ anon ]
b7ff5000       4 r-x-- 0000000000000000 0fd:00001 libskgxn9.so
b7ff6000       8 rwx-- 0000000000000000 0fd:00001 libskgxn9.so
b7ff8000       4 r-x-- 0000000000000000 0fd:00001 libskgxp9.so
b7ff9000       4 --x-- 0000000000001000 0fd:00001 libskgxp9.so
b7ffa000       4 rwx-- 0000000000001000 0fd:00001 libskgxp9.so
b7ffb000       4 r-x-- 0000000000000000 0fd:00001 libodmd9.so
b7ffc000       4 rwx-- 0000000000000000 0fd:00001 libodmd9.so
b7ffd000       4 r-x-- 0000000000000000 008:00002 libcwait.so
b7ffe000       4 rwx-- 0000000000000000 008:00002 libcwait.so
b7fff000       4 rwx-- 00000000b7fff000 000:00000   [ anon ]
bffe2000     120 rwx-- 00000000bffe2000 000:00000   [ stack ]
ffffe000       4 ----- 0000000000000000 000:00000   [ anon ]
mapped: 1599008K    writeable/private: 12944K    shared: 1544192K

Footer fields of pmap -d:

mapped: total address space mapped by the process
writeable/private: private address space used by the process
shared: address space shared with other processes

ipcs output

[oracle@ora02 12505]$ ipcs -m
------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status
0x0000cace 65536      root      666        2          0
0x4d4e5251 98305      root      644        330752     0
0x55315352 131074     root      666        4096       0
0x44525354 163843     root      644        632832     0
0x53494152 196612     root      644        1024       0
0x00005643 229381     root      666        1024       1
0x00005654 262150     root      666        1024       1
0x992ad3dc 589838     oracle    640        1581252608 595

Correlating pmap with ipcs (shmid=0x9000e)

SQL> select to_number('9000e','xxxxxxxx') from dual;
TO_NUMBER('9000E','XXXXXXXX')
-----------------------------
                       589838
SQL> select 1540096+4+1156+4+2932 from dual;
1540096+4+1156+4+2932
---------------------
              1544192
SQL> select 1581252608/1024 from dual;
1581252608/1024
---------------
        1544192

This shows that the shared memory in process 12505 is essentially the SGA shared memory segment of the database.
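
The same correlation can be scripted. A sketch assuming the pmap -d and ipcs -m output formats shown above (the field positions are simply what those commands print on this system):

# sum the Kbytes of all SysV shm mappings of the process ...
pmap -d 12505 | awk '/shmid=/ {sum+=$2} END {print sum " K mapped via SysV shm"}'
# ... and compare with the size of the SGA segment reported by ipcs
ipcs -m | awk '$1=="0x992ad3dc" {print $5/1024 " K segment size"}'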

A further conjecture

SQL> show sga;
Total System Global Area 1561926292 bytes
Fixed Size                   453268 bytes
Variable Size             603979776 bytes
Database Buffers          956301312 bytes
Redo Buffers                1191936 bytes
SQL> select 1561926292/1024 from dual;
1561926292/1024
---------------
     1525318.64

Here the SGA configured in the database is smaller than the shared memory segment shown by ipcs. When a process attaches, it maps the full segment at the size ipcs reports, but the memory actually used is presumably only the configured SGA size (I have found no authoritative source for this).