Using udev on Linux 7 (Red Hat, Oracle Linux, CentOS)

Linux 7 is steadily gaining users, but it changes quite a lot compared with versions 5 and 6. This post describes how to use udev on Linux 7 to get persistent device names and to set device permissions and ownership.
Linux version

Oracle Linux Server release 7.1
[root@www.xifenfei.com ~]# uname -a
Linux www.xifenfei.com 3.10.0-229.el7.x86_64 #1 SMP Fri Mar 6 04:05:24 PST 2015 x86_64 x86_64 x86_64 GNU/Linux

To make disk UUIDs visible inside a VMware Workstation guest, add the following to the VM's .vmx file:

disk.enableUUID = "TRUE"

View the disk partitions

[root@www.xifenfei.com ~]# fdisk -l
Disk /dev/sdb: 21.5 GB, 21474836480 bytes, 41943040 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk label type: dos
Disk identifier: 0xf60fe217
   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1            2048     2099199     1048576   83  Linux
Disk /dev/sda: 42.9 GB, 42949672960 bytes, 83886080 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk label type: dos
Disk identifier: 0x000bce7c
   Device Boot      Start         End      Blocks   Id  System
/dev/sda1            2048     4204543     2101248   8e  Linux LVM
/dev/sda2   *     4204544    79702015    37748736   83  Linux
Disk /dev/sdc: 32.2 GB, 32212254720 bytes, 62914560 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk /dev/mapper/ol-swap: 2147 MB, 2147483648 bytes, 4194304 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

Get the disk UUIDs

[root@www.xifenfei.com ~]# /usr/lib/udev/scsi_id -g -u /dev/sdb1
36000c29e91831cedbe69afe6cc08daf7
[root@www.xifenfei.com ~]# /usr/lib/udev/scsi_id -g -u /dev/sdc
36000c292495e9d9de6f21640cc7b53b9

udev bindings

[root@www.xifenfei.com ~]# more /etc/udev/rules.d/99-my-asmdevices.rules
KERNEL=="sd*[!0-9]", ENV{DEVTYPE}=="disk", SUBSYSTEM=="block", PROGRAM=="/usr/lib/udev/scsi_id -g -u -d $devnode",
 RESULT=="36000c292495e9d9de6f21640cc7b53b9", RUN+="/bin/sh -c 'mknod /dev/xifenfei-sdc b $major $minor;
chown oracle:dba /dev/xifenfei-sdc; chmod 0660 /dev/xifenfei-sdc'"
KERNEL=="sd?1", SUBSYSTEM=="block", PROGRAM=="/lib/udev/scsi_id -g -u -d /dev/$parent",
RESULT=="36000c29e91831cedbe69afe6cc08daf7", SYMLINK+="xifenfei-sdb1", OWNER="oracle", GROUP="dba", MODE="0660"

Binding results

[root@www.xifenfei.com ~]# ls -l /dev/xifenfei-*
lrwxrwxrwx. 1 root   root     4 Aug  7 22:49 /dev/xifenfei-sdb1 -> sdb1
brw-rw----. 1 oracle dba  8, 32 Aug  7 22:36 /dev/xifenfei-sdc
[root@www.xifenfei.com ~]# ls -l /dev/sdb1
brw-rw----. 1 oracle dba 8, 17 Aug  7 22:49 /dev/sdb1

Using udev to change disk permissions only

[root@www.xifenfei.com ~]# fdisk /dev/sdb
Welcome to fdisk (util-linux 2.23.2).
Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.
Command (m for help): n
Partition type:
   p   primary (1 primary, 0 extended, 3 free)
   e   extended
Select (default p): p
Partition number (2-4, default 2):
First sector (2099200-41943039, default 2099200):
Using default value 2099200
Last sector, +sectors or +size{K,M,G} (2099200-41943039, default 41943039): +1G
Partition 2 of type Linux and of size 1 GiB is set
Command (m for help): w
The partition table has been altered!
Calling ioctl() to re-read partition table.
Syncing disks.
[root@www.xifenfei.com ~]# more /etc/udev/rules.d/99-my-asmdevices.rules
KERNEL=="sd?2", SUBSYSTEM=="block", PROGRAM=="/lib/udev/scsi_id -g -u -d /dev/$parent",
 RESULT=="36000c29e91831cedbe69afe6cc08daf7",  OWNER="oracle", GROUP="dba", MODE="0660"
[root@www.xifenfei.com ~]# /sbin/udevadm trigger --type=devices --action=change
[root@www.xifenfei.com ~]# ls -l /dev/sdb2
brw-rw----. 1 oracle dba 8, 18 Aug  7 23:14 /dev/sdb2

As shown above, Linux 7 offers two ways to bind devices with udev: one creates a real device node (via mknod), the other works through a symlink. Thanks to Lunar (Lunar's Oracle lab) for the help while I was learning Linux 7.

Using losetup to turn ordinary Linux files into ASM disks

The previous post, "Using _asm_allow_only_raw_disks to make ordinary files ASM disks", showed how the _asm_allow_only_raw_disks parameter lets Oracle ASM use plain files as ASM disks. This post shows that on Linux the same effect can also be achieved with losetup, which presents a plain file as a block device.
Create the files with dd

[oracle@xifenfei ~]$ mkdir /u01/oracle/oradata/asmdisk
[oracle@xifenfei ~]$ dd if=/dev/zero of=/u01/oracle/oradata/asmdisk/xifenfei01.dd bs=10240k count=100
100+0 records in
100+0 records out
1048576000 bytes (1.0 GB) copied, 21.9158 seconds, 47.8 MB/s
[oracle@xifenfei ~]$ dd if=/dev/zero of=/u01/oracle/oradata/asmdisk/xifenfei02.dd bs=10240k count=100
100+0 records in
100+0 records out
1048576000 bytes (1.0 GB) copied, 22.392 seconds, 46.8 MB/s
[oracle@xifenfei ~]$ ls -lh /u01/oracle/oradata/asmdisk/
total 3.0G
-rw-r--r-- 1 oracle oinstall 1000M Feb 27 22:58 xifenfei01.dd
-rw-r--r-- 1 oracle oinstall 1000M Feb 27 23:00 xifenfei02.dd

Present the files as disks with losetup

[root@xifenfei asmdisk]# ls /dev/lo*
log    loop0  loop1  loop2  loop3  loop4  loop5  loop6  loop7
[root@xifenfei asmdisk]# losetup /dev/loop1 xifenfei01.dd
[root@xifenfei asmdisk]# losetup /dev/loop2 xifenfei02.dd
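
A couple of related losetup invocations for reference (see losetup(8) on your release). Note that loop bindings do not survive a reboot, so they must be re-created at boot, e.g. from rc.local:

losetup -a               # list active loop device bindings
losetup -d /dev/loop1    # detach a loop device when it is no longer needed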

Bind the loop devices to raw devices

[root@xifenfei asmdisk]# raw  /dev/raw/raw10 /dev/loop1
/dev/raw/raw10: bound to major 7, minor 1
[root@xifenfei asmdisk]# raw  /dev/raw/raw11 /dev/loop2
/dev/raw/raw11: bound to major 7, minor 2
[root@xifenfei asmdisk]# ls -l /dev/raw/raw1[0-1]
crw------- 1 root root 162, 10 Feb 27 23:16 /dev/raw/raw10
crw------- 1 root root 162, 11 Feb 27 23:16 /dev/raw/raw11
[root@xifenfei asmdisk]# chown oracle.dba /dev/raw/raw1[0-1]
[root@xifenfei asmdisk]# ls -l /dev/raw/raw1[0-1]
crw------- 1 oracle dba 162, 10 Feb 27 23:16 /dev/raw/raw10
crw------- 1 oracle dba 162, 11 Feb 27 23:16 /dev/raw/raw11
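
The raw bindings above are also lost at reboot. On EL5 the raw(8) man page documents making them persistent via /etc/udev/rules.d/60-raw.rules; a sketch (assuming the loop devices themselves are re-attached earlier in boot):

ACTION=="add", KERNEL=="loop1", RUN+="/bin/raw /dev/raw/raw10 %N"
ACTION=="add", KERNEL=="loop2", RUN+="/bin/raw /dev/raw/raw11 %N"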

Create the disk group

[oracle@xifenfei ~]$ export ORACLE_SID=+ASM
[oracle@xifenfei ~]$ sqlplus / as sysdba
SQL*Plus: Release 10.2.0.4.0 - Production on Thu Feb 27 23:19:28 2014
Copyright (c) 1982, 2007, Oracle.  All Rights Reserved.
Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
SQL>  create diskgroup xff external redundancy disk '/dev/raw/raw10','/dev/raw/raw11';
Diskgroup created.
SQL> select group_number,name from v$asm_diskgroup;
GROUP_NUMBER NAME
------------ ------------------------------------------------------------
           1 DATA
           2 XFF
SQL> select path,TOTAL_MB from v$asm_disk where group_number=2;
PATH                   TOTAL_MB
-------------------- ----------
/dev/raw/raw11             1000
/dev/raw/raw10             1000

Verify with kfed that the ASM disks are the data files

[oracle@xifenfei tmp]$ kfed read /dev/raw/raw10|grep XFF
kfdhdb.dskname:                XFF_0000 ; 0x028: length=8
kfdhdb.grpname:                     XFF ; 0x048: length=3
kfdhdb.fgname:                 XFF_0000 ; 0x068: length=8
[oracle@xifenfei tmp]$ kfed read /dev/raw/raw11|grep XFF
kfdhdb.dskname:                XFF_0001 ; 0x028: length=8
kfdhdb.grpname:                     XFF ; 0x048: length=3
kfdhdb.fgname:                 XFF_0001 ; 0x068: length=8
[oracle@xifenfei tmp]$ kfed read /u01/oracle/oradata/asmdisk/xifenfei01.dd |grep XFF
kfdhdb.dskname:                XFF_0000 ; 0x028: length=8
kfdhdb.grpname:                     XFF ; 0x048: length=3
kfdhdb.fgname:                 XFF_0000 ; 0x068: length=8
[oracle@xifenfei tmp]$ kfed read /u01/oracle/oradata/asmdisk/xifenfei02.dd |grep XFF
kfdhdb.dskname:                XFF_0001 ; 0x028: length=8
kfdhdb.grpname:                     XFF ; 0x048: length=3
kfdhdb.fgname:                 XFF_0001 ; 0x068: length=8

The kfed output confirms that ASM is indeed using the dd-created files as its ASM disks.

New commands in Linux 7: lscpu and systemctl

The Red Hat 7 series has been out for a while. I recently went through the official documentation, compared a number of commands, and took notes on the ones I care about most. This post covers the improved service management, namely systemctl, plus lscpu, a very convenient command for viewing CPU information.
System version

[root@em12cdb ~]# uname -a
Linux em12cdb 3.8.13-55.1.6.el7uek.x86_64 #2 SMP Wed Feb 11 14:18:22 PST 2015 x86_64 x86_64 x86_64 GNU/Linux
[root@em12cdb ~]# more /etc/oracle-release
Oracle Linux Server release 7.1

View CPU information

[root@em12cdb ~]# lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                2
On-line CPU(s) list:   0,1
Thread(s) per core:    1
Core(s) per socket:    2
Socket(s):             1
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 60
Model name:            Intel(R) Core(TM) i7-4790K CPU @ 4.00GHz
Stepping:              3
CPU MHz:               3997.739
BogoMIPS:              7995.47
Hypervisor vendor:     VMware
Virtualization type:   full
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              8192K
NUMA node0 CPU(s):     0,1

Managing services and boot-time startup with systemctl
Previously, services were managed through /etc/init.d/ scripts or the service command, and boot-time startup through chkconfig.

--check the status of a service
[root@em12cdb ~]# systemctl status crond.service
crond.service - Command Scheduler
   Loaded: loaded (/usr/lib/systemd/system/crond.service; enabled)
   Active: active (running) since Mon 2015-07-27 11:59:16 CST; 1 months 7 days ago
 Main PID: 1245 (crond)
   CGroup: /system.slice/crond.service
           └─1245 /usr/sbin/crond -n
Jul 27 11:59:16 em12cdb systemd[1]: Started Command Scheduler.
Jul 27 11:59:16 em12cdb crond[1245]: (CRON) INFO (RANDOM_DELAY will be scaled with factor 33% if used.)
Jul 27 11:59:17 em12cdb crond[1245]: (CRON) INFO (running with inotify support)
--stop a service
[root@em12cdb ~]# systemctl stop crond.service
[root@em12cdb ~]# systemctl status crond.service
crond.service - Command Scheduler
   Loaded: loaded (/usr/lib/systemd/system/crond.service; enabled)
   Active: inactive (dead) since Thu 2015-09-03 00:27:55 CST; 2s ago
 Main PID: 1245 (code=exited, status=0/SUCCESS)
Jul 27 11:59:16 em12cdb systemd[1]: Started Command Scheduler.
Jul 27 11:59:16 em12cdb crond[1245]: (CRON) INFO (RANDOM_DELAY will be scaled with factor 33% if used.)
Jul 27 11:59:17 em12cdb crond[1245]: (CRON) INFO (running with inotify support)
Sep 03 00:27:55 em12cdb systemd[1]: Stopping Command Scheduler...
Sep 03 00:27:55 em12cdb systemd[1]: Stopped Command Scheduler.
--start a service
[root@em12cdb ~]# systemctl start crond.service
[root@em12cdb ~]# systemctl status crond.service
crond.service - Command Scheduler
   Loaded: loaded (/usr/lib/systemd/system/crond.service; enabled)
   Active: active (running) since Thu 2015-09-03 00:28:08 CST; 1s ago
 Main PID: 7294 (crond)
   CGroup: /system.slice/crond.service
           └─7294 /usr/sbin/crond -n
Sep 03 00:28:08 em12cdb systemd[1]: Started Command Scheduler.
Sep 03 00:28:08 em12cdb crond[7294]: (CRON) INFO (RANDOM_DELAY will be scaled with factor 56% if used.)
Sep 03 00:28:09 em12cdb crond[7294]: (CRON) INFO (running with inotify support)
Sep 03 00:28:09 em12cdb crond[7294]: (CRON) INFO (@reboot jobs will be run at computer's startup.)
--restart a service
[root@em12cdb ~]# systemctl restart crond.service
[root@em12cdb ~]# systemctl status crond.service
crond.service - Command Scheduler
   Loaded: loaded (/usr/lib/systemd/system/crond.service; enabled)
   Active: active (running) since Thu 2015-09-03 00:28:24 CST; 2s ago
 Main PID: 7323 (crond)
   CGroup: /system.slice/crond.service
           └─7323 /usr/sbin/crond -n
Sep 03 00:28:24 em12cdb systemd[1]: Starting Command Scheduler...
Sep 03 00:28:24 em12cdb systemd[1]: Started Command Scheduler.
Sep 03 00:28:24 em12cdb crond[7323]: (CRON) INFO (RANDOM_DELAY will be scaled with factor 61% if used.)
Sep 03 00:28:24 em12cdb crond[7323]: (CRON) INFO (running with inotify support)
Sep 03 00:28:24 em12cdb crond[7323]: (CRON) INFO (@reboot jobs will be run at computer's startup.)
--check whether a service is enabled at boot
[root@em12cdb ~]# systemctl is-enabled crond.service
enabled
--disable a service at boot
[root@em12cdb ~]# systemctl disable crond.service
rm '/etc/systemd/system/multi-user.target.wants/crond.service'
[root@em12cdb ~]# systemctl is-enabled crond.service
disabled
--list the boot-startup state of all services (grep is used here to keep the output short)
[root@em12cdb ~]# systemctl list-unit-files --type service|grep cron
crond.service                               disabled
[root@em12cdb ~]# systemctl enable crond.service
ln -s '/usr/lib/systemd/system/crond.service' '/etc/systemd/system/multi-user.target.wants/crond.service'
[root@em12cdb ~]# systemctl list-unit-files --type service|grep cron
crond.service                               enabled

systemctl vs. chkconfig
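
The old commands map to the new ones roughly as follows (using crond as the example):

service crond start     ->  systemctl start crond.service
service crond stop      ->  systemctl stop crond.service
service crond restart   ->  systemctl restart crond.service
service crond status    ->  systemctl status crond.service
chkconfig crond on      ->  systemctl enable crond.service
chkconfig crond off     ->  systemctl disable crond.service
chkconfig --list        ->  systemctl list-unit-files --type service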


Changing the default boot target with systemctl
In earlier releases this was done by editing /etc/inittab directly.

--graphical multi-user mode
[root@em12cdb ~]# systemctl get-default
graphical.target
--text-mode multi-user
[root@em12cdb ~]# systemctl set-default multi-user.target
rm '/etc/systemd/system/default.target'
ln -s '/usr/lib/systemd/system/multi-user.target' '/etc/systemd/system/default.target'
[root@em12cdb ~]# systemctl get-default
multi-user.target

systemd runlevels
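
The old runlevels correspond to systemd targets roughly as follows:

runlevel 0      ->  poweroff.target
runlevel 1      ->  rescue.target
runlevel 2,3,4  ->  multi-user.target
runlevel 5      ->  graphical.target
runlevel 6      ->  reboot.target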


Beware of OS bug 765720: Linux on E5, E5 v2, and E7 v2 CPUs

Tonight a group member brought up a Linux 6 bug: once a system has been up for more than 200 days (hardware uptime) and is then warm-rebooted, the bug may fire within minutes to hours and leave the system in an abnormal state.

The affected configurations are:
Red Hat Enterprise Linux 6.1 (kernel-2.6.32-131.26.1.el6 and newer)
Red Hat Enterprise Linux 6.2 (kernel-2.6.32-220.4.2.el6 and newer)
Red Hat Enterprise Linux 6.3 (kernel-2.6.32-279 series)
Red Hat Enterprise Linux 6.4 (kernel-2.6.32-358 series)
Any Intel® Xeon® E5, Intel® Xeon® E5 v2, or Intel® Xeon® E7 v2 series processor
So the problem mainly affects Red Hat 6.1 through 6.4 on E5, E5 v2, and E7 v2 CPUs, and it is fixed in 6.5; see bug 765720 for details.
As for Oracle Linux: with the Red Hat compatible kernel the impact is the same as on Red Hat, while with the Unbreakable Enterprise Kernel the problem was fixed in release 6.2.
A related MOS note: Oracle Linux 6 RHCK system hang: processes blocked in ext4_file_open(), pick_next_task_fair()

Additional notes:
1. The bug does not exist in Red Hat/OEL 5.x.
2. It can occur on both 32-bit and 64-bit systems.
3. Since the bug cannot be patched on short notice, if it does fire, consider a cold restart of the host as a temporary workaround.
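
A quick way to gauge exposure on a running host (a sketch; compare the kernel release against the list above):

uname -r
awk '{printf "up %d days\n", $1/86400}' /proc/uptime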

A reminder: picking the right OS version matters. When choosing a Linux release, try to steer clear of this bug (RHCK 6.5 and later, UEK 6.2 and later). My personal preference: for an Oracle database on a Red Hat family Linux, I lean toward OEL (less hassle; trust Oracle).

Notes on an emergency recovery after rm -rf deleted the data files

A system engineer left the company on bad terms. His VPN access was not shut down in time and the accounts on the database server (Linux 6.5, Oracle 11.2.0.1) were not changed, so he logged in and ran rm -rf on ORACLE_BASE. Unfortunately the oradata directory lived inside it, and worse, so did every data file: the database was wiped out completely, with no backup whatsoever. A friend asked me for help. Luckily the files had not been overwritten and the inodes were still intact, so extundelete recovered all the data files, control files, and redo logs (see the earlier post on recovering deleted Linux files with extundelete), the database opened cleanly, and nothing was lost. A perfect recovery.

[root@DB1 tmp]# tar xvf extundelete-0.2.4.tar
extundelete-0.2.4/
extundelete-0.2.4/acinclude.m4
extundelete-0.2.4/missing
extundelete-0.2.4/autogen.sh
extundelete-0.2.4/aclocal.m4
extundelete-0.2.4/configure
extundelete-0.2.4/LICENSE
extundelete-0.2.4/README
extundelete-0.2.4/install-sh
extundelete-0.2.4/config.h.in
extundelete-0.2.4/src/
extundelete-0.2.4/src/extundelete.cc
extundelete-0.2.4/src/block.h
extundelete-0.2.4/src/kernel-jbd.h
extundelete-0.2.4/src/insertionops.cc
extundelete-0.2.4/src/block.c
extundelete-0.2.4/src/cli.cc
extundelete-0.2.4/src/extundelete-priv.h
extundelete-0.2.4/src/extundelete.h
extundelete-0.2.4/src/jfs_compat.h
extundelete-0.2.4/src/Makefile.in
extundelete-0.2.4/src/Makefile.am
extundelete-0.2.4/configure.ac
extundelete-0.2.4/depcomp
extundelete-0.2.4/Makefile.in
extundelete-0.2.4/Makefile.am
[root@DB1 tmp]# cd extundelete-0.2.4
[root@DB1 extundelete-0.2.4]# ./configure
Configuring extundelete 0.2.4
Writing generated files to disk
[root@DB1 extundelete-0.2.4]# make && make install
make -s all-recursive
Making all in src
Making install in src
  /usr/bin/install -c extundelete '/usr/local/bin'
[root@DB1 extundelete-0.2.4]# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda3       244G   11G  221G   5% /
tmpfs            16G   72K   16G   1% /dev/shm
/dev/sda1       190M   62M  119M  35% /boot
/dev/sdb1       2.0T   71M  1.9T   1% /home
[root@DB1 extundelete-0.2.4]# umount /dev/sdb1
umount: /home: device is busy.
        (In some cases useful info about processes that use
         the device is found by lsof(8) or fuser(1))
[root@DB1 extundelete-0.2.4]# fuser -m -u /home
/home:                3914c(oracle)  8372c(oracle)
[root@DB1 extundelete-0.2.4]# kill -9 3914
[root@DB1 extundelete-0.2.4]# fuser -m -u /home
/home:                8372c(oracle)
[root@DB1 extundelete-0.2.4]# kill -9 8372
[root@DB1 extundelete-0.2.4]# fuser -m -u /home
[root@DB1 extundelete-0.2.4]# umount /dev/sdb1
[root@DB1 extundelete-0.2.4]# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda3       244G   11G  221G   5% /
tmpfs            16G   72K   16G   1% /dev/shm
/dev/sda1       190M   62M  119M  35% /boot
[root@DB1 extundelete-0.2.4]# extundelete /dev/sdb1 --restore-all
NOTICE: Extended attributes are not restored.
Loading filesystem metadata ... 16384 groups loaded.
Loading journal descriptors ... 26542 descriptors loaded.
Searching for recoverable inodes in directory / ...
18896 recoverable inodes found.
Looking through the directory structure for deleted files ...
2 recoverable inodes still lost.
Unable to restore inode 43778050 (file.43778050): Space has been reallocated.
[root@DB1 extundelete-0.2.4]# ls
acinclude.m4  autogen.sh  config.h.in  config.status  configure.ac  install-sh  Makefile     Makefile.in
aclocal.m4    config.h    config.log   configure      depcomp       LICENSE     Makefile.am  missing
[root@DB1 extundelete-0.2.4]# cd RECOVERED_FILES/
[root@DB1 RECOVERED_FILES]# ls
app  file.43778051  oracle  oraInventory
[root@DB1 RECOVERED_FILES]# cd app
[root@DB1 app]# ls
admin  cfgtoollogs  diag  oracle  oradata  orcl  ORCL
[root@DB1 app]# cd oradata
[root@DB1 oradata]# ls
orcl
[root@DB1 oradata]# cd orcl
[root@DB1 orcl]# ls
control01.ctl  redo01.log  redo02.log  redo03.log  sysaux01.dbf  system01.dbf  undotbs01.dbf  users01.dbf
[root@DB1 orcl]# ls -ltr
total 2908776
-rw-r--r--. 1 root root  734011392 Nov 18 02:06 system01.dbf
-rw-r--r--. 1 root root 1069555712 Nov 18 02:06 sysaux01.dbf
-rw-r--r--. 1 root root  120594432 Nov 18 02:06 undotbs01.dbf
-rw-r--r--. 1 root root  887365632 Nov 18 02:06 users01.dbf
-rw-r--r--. 1 root root    9748480 Nov 18 02:06 control01.ctl
-rw-r--r--. 1 root root   52429312 Nov 18 02:06 redo01.log
-rw-r--r--. 1 root root   52429312 Nov 18 02:06 redo02.log
-rw-r--r--. 1 root root   52429312 Nov 18 02:06 redo03.log
[root@DB1 orcl]#
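
For reference, extundelete can also restore a single directory tree instead of everything; a hedged sketch (the --restore-directory option takes a path relative to the root of the /dev/sdb1 filesystem):

extundelete /dev/sdb1 --restore-directory app/oradata/orcl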

Once more: backups matter more than anything, and you must guard against human disasters as well as natural ones. I hope never to hear of an incident like this in our circle again.

Setting device ownership with multipath

Many current Linux systems use the bundled multipath software. In older releases we typically combined multipath with udev, or with rc.local, to handle multipathing and permissions. On Red Hat 5.3 through 5.11, however, multipath alone can provide path aggregation, persistent device names, and ownership settings.
Operating system version

[root@rac1 dev]# uname -r
2.6.39-300.26.1.el5uek
[root@rac1 dev]# more /etc/issue
Oracle Linux Server release 5.9
Kernel \r on an \m

fdisk output

[root@rac1 dev]# fdisk -l
…………
Disk /dev/sdh: 134.2 GB, 134217728000 bytes
255 heads, 63 sectors/track, 16317 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
   Device Boot      Start         End      Blocks   Id  System
Disk /dev/sdi: 33.5 GB, 33554432000 bytes
64 heads, 32 sectors/track, 32000 cylinders
Units = cylinders of 2048 * 512 = 1048576 bytes
   Device Boot      Start         End      Blocks   Id  System

multipath packages
Check that the multipath packages are installed (they are installed by default on this release)

[root@rac1 dev]# rpm -aq|grep mapper
device-mapper-multipath-libs-0.4.9-56.0.3.el5
device-mapper-event-1.02.67-2.el5
device-mapper-1.02.67-2.el5
device-mapper-multipath-0.4.9-56.0.3.el5

Get the WWIDs

[root@rac1 dev]# /sbin/scsi_id -g -u -s /block/sdh
14f504e46494c45527049754962662d395751372d68356743
[root@rac1 dev]# /sbin/scsi_id -g -u -s /block/sdi
14f504e46494c4552484d486249782d464471382d354f4b58

Get the uid and gid

[root@rac1 dev]# id grid
uid=1100(grid) gid=54321(oinstall) groups=54321(oinstall),1020(asmadmin),1021(asmdba)

multipath.conf configuration

[root@rac1 dev]# vi /etc/multipath.conf
defaults {
user_friendly_names no
queue_without_daemon no
flush_on_last_del yes
max_fds max
}
blacklist {
devnode "^hd[a-z]"
devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
devnode "^cciss.*"
}
devices {
        device {
                vendor                  "OPNFILER "
                product                 "LUN"
                path_grouping_policy    group_by_prio
                features             "3 queue_if_no_path pg_init_retries 50"
                getuid_callout          "/sbin/scsi_id -g -u -s /block/%n"
                path_checker            tur
                path_selector           "round-robin 0"
                hardware_handler        "1 alua"
                failback               immediate
                rr_weight              uniform
                rr_min_io             128
        }
}
multipaths {
        multipath {
                wwid                    14f504e46494c45527049754962662d395751372d68356743  #wwid
                alias                   xifenfei128
                uid                     1100                                               #uid
                gid                     1020                                               #gid
        }
        multipath {
                wwid                    14f504e46494c4552484d486249782d464471382d354f4b58   #wwid
                alias                   xifenfei32
                uid                     1100                                                #uid
                gid                     1020                                                #gid
         }
}

Start multipath

[root@rac1 dev]# modprobe dm-multipath
[root@rac1 dev]# modprobe dm-round-robin
[root@rac1 dev]# chkconfig multipathd on
[root@rac1 dev]# service multipathd start
Starting multipathd daemon: [  OK  ]
[root@rac1 dev]# multipath -F
[root@rac1 dev]# multipath -v2
create: xifenfei128 (14f504e46494c45527049754962662d395751372d68356743) undef OPNFILER,VIRTUAL-DISK
size=125G features='0' hwhandler='0' wp=undef
`-+- policy='round-robin 0' prio=1 status=undef
  `- 3:0:0:9  sdh 8:112 undef ready  running
create: xifenfei32 (14f504e46494c4552484d486249782d464471382d354f4b58) undef OPNFILER,VIRTUAL-DISK
size=31G features='0' hwhandler='0' wp=undef
`-+- policy='round-robin 0' prio=1 status=undef
  `- 3:0:0:10 sdi 8:128 undef ready  running

View the generated multipath devices
Note the device names, owner, and group

[root@rac1 dev]# ls -l /dev/mapper/xifenfei*
brw-rw---- 1 grid asmadmin 252, 2 Jan  7 21:21 /dev/mapper/xifenfei128
brw-rw---- 1 grid asmadmin 252, 3 Jan  7 21:21 /dev/mapper/xifenfei32

Addendum: setting the owning group and permissions with udev on Linux 6.x
On Linux 6.x, multipath cannot set disk ownership and permissions; udev can be used instead, with a configuration like the following

[root@bxrac03 mapper]# cat 99-diskownership.rules
SUBSYSTEM!="block", GOTO="quickexit"
KERNEL!="dm-*", GOTO="quickexit"
PROGRAM=="/sbin/dmsetup info -c --noheadings -o name -m %m -j %M"
RESULT=="*ocr*", OWNER="grid", GROUP="oinstall", MODE="0660"
RESULT=="*oradata", OWNER="grid", GROUP="oinstall", MODE="0660"
RESULT=="*backup", OWNER="grid", GROUP="oinstall", MODE="0660"
LABEL="quickexit"

Each RESULT pattern is matched against the corresponding dm alias.
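
After placing the rules file, the rules can be applied without a reboot; a sketch for Linux 6.x:

udevadm control --reload-rules
udevadm trigger --type=devices --action=change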

A shell script to get the extent distribution

Anyone who has dug into the dba_extents view knows that it does not read extent information from ordinary base tables stored in the database, but from x$ tables (which are built in memory at instance startup and do not live in the data files). So when the database cannot start, we have no direct way to tell whether a given block belongs to a given object. The extent information, however, is also recorded in the segment header block. By dumping the rdba entries recorded there and converting them to file_id and block_id, the shell script below turns a segment header dump into dba_extents-like output, which makes it possible to check whether a corrupted block belongs to a particular table even when the database cannot be opened.

#!/bin/bash
# dec2bin <value> <base>: print <value> (e.g. 0x00400901) as a binary string.
# Call it with base=1; it recurses up to the highest bit and builds the
# string on the way back. It runs inside command substitution, so its
# globals do not leak between loop iterations.
dec2bin(){
  val_16=$1
  ((num=$val_16))          # let bash arithmetic parse the 0x... hex value
  val=$num
  local base=$2
  [ $val -eq 0 ] && bin=0
  if [ $val -ge $base ]; then
    dec2bin $val $((base*2))
    if [ $val -ge $base ]; then
      val=$(($val-$base))
      bin=${bin}1
    else
      bin=${bin}0
    fi
  fi
  [ $base -eq 1 ] && printf '%s' $bin
}
# read "rdba length" pairs from the segment header dump passed as $1
for i in `grep "length:" $1 | awk '{print $1 $3}'`;
do
  rdba=${i:0:10}                  # first 10 chars: 0x-prefixed rdba
  blocks=${i:10}                  # rest: extent size in blocks
  echo -n "rdba:"$rdba"    "
  bin2=`dec2bin $rdba 1`
  len=`expr length $bin2`
  len_gd=22                       # the low 22 bits are the block number
  len_jg=`expr $len - $len_gd`
  file_no_2=${bin2:0:$len_jg}     # leading bits: relative file number
  ((file_no=2#$file_no_2))
  echo -n "file_id:"$file_no"    "
  block_no_2=${bin2:$len_jg}
  ((block_no=2#$block_no_2))
  echo -n "block_id:"$block_no"    "
  echo "blocks:"$blocks
done
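
The bit layout the script decodes (low 22 bits of an rdba are the block number, the remaining high bits the relative file number) can be sanity-checked with plain bash arithmetic; for example, for the first rdba in the trace excerpt below:

rdba=0x00400901
echo "file_id:  $(( rdba >> 22 ))"             # -> 1
echo "block_id: $(( rdba & ((1 << 22) - 1) ))" # -> 2305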

Excerpt from the trace file

  -----------------------------------------------------------------
   0x00400901  length: 7
   0x00402e10  length: 8
   0x00402e60  length: 8
   0x00402e68  length: 8
   0x00402ea0  length: 8
   0x00402f20  length: 8
   0x00402f48  length: 8
   0x00403050  length: 8
   0x00403180  length: 8
   0x00403b38  length: 8
   0x00404c48  length: 8
   0x00404c78  length: 8
   0x00404cf8  length: 8
   0x00404da8  length: 8
   0x00404db8  length: 8
   0x00404de8  length: 8
   0x00404e80  length: 128
   0x00405900  length: 128
   0x00406500  length: 128
   0x00406980  length: 128
   0x00407480  length: 128
   0x00407500  length: 128
   0x00407680  length: 128
   0x00407800  length: 128
   0x00407880  length: 128
   0x00407a00  length: 128
   0x00407a80  length: 128
   0x00407c80  length: 128
…………

Output

[oracle@xifenfei tmp]$ ./get_extent.sh /u01/app/oracle/diag/rdbms/cdb/cdb/trace/cdb_ora_29565.trc
rdba:0x00400901    file_id:1    block_id:2305     blocks:7
rdba:0x00402e10    file_id:1    block_id:11792     blocks:8
rdba:0x00402e60    file_id:1    block_id:11872     blocks:8
rdba:0x00402e68    file_id:1    block_id:11880     blocks:8
rdba:0x00402ea0    file_id:1    block_id:11936     blocks:8
rdba:0x00402f20    file_id:1    block_id:12064     blocks:8
rdba:0x00402f48    file_id:1    block_id:12104     blocks:8
rdba:0x00403050    file_id:1    block_id:12368     blocks:8
rdba:0x00403180    file_id:1    block_id:12672     blocks:8
rdba:0x00403b38    file_id:1    block_id:15160     blocks:8
rdba:0x00404c48    file_id:1    block_id:19528     blocks:8
rdba:0x00404c78    file_id:1    block_id:19576     blocks:8
rdba:0x00404cf8    file_id:1    block_id:19704     blocks:8
rdba:0x00404da8    file_id:1    block_id:19880     blocks:8
rdba:0x00404db8    file_id:1    block_id:19896     blocks:8
rdba:0x00404de8    file_id:1    block_id:19944     blocks:8
rdba:0x00404e80    file_id:1    block_id:20096     blocks:128
rdba:0x00405900    file_id:1    block_id:22784     blocks:128
rdba:0x00406500    file_id:1    block_id:25856     blocks:128
rdba:0x00406980    file_id:1    block_id:27008     blocks:128
rdba:0x00407480    file_id:1    block_id:29824     blocks:128
rdba:0x00407500    file_id:1    block_id:29952     blocks:128
rdba:0x00407680    file_id:1    block_id:30336     blocks:128
rdba:0x00407800    file_id:1    block_id:30720     blocks:128
…………

An updated shell script for deleting Data Guard archived logs

I wrote earlier about deleting Data Guard archived logs (see the older post on the topic), but that method was neither flexible nor simple. Below is the latest version of the standby archive-deletion shell script, as deployed at a customer site.

#!/bin/sh
source ~/.bash_profile
grep "Media Recovery Log" $ORACLE_BASE/admin/$ORACLE_SID/bdump/alert_${ORACLE_SID}.log|
\ awk '{print $4}'|sed -e 's/^/rm /' >/tmp/rmarchlog.sh
chmod +x /tmp/rmarchlog.sh
/tmp/rmarchlog.sh
cd $ORACLE_BASE/admin/$ORACLE_SID/bdump
cat alert_${ORACLE_SID}.log >>alert_${ORACLE_SID}.log.bak
echo ''>alert_${ORACLE_SID}.log
rm -f /tmp/rmarchlog.sh
$ORACLE_HOME/bin/rman target / <<XIFENFEI
crosscheck archivelog all;
delete expired archivelog all;
YES
exit;
XIFENFEI
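
If the interactive YES answer is undesirable, RMAN's noprompt variant should work just as well:

delete noprompt expired archivelog all;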

The script deletes archived logs based on the "Media Recovery Log" lines the standby writes to its alert log as it applies each log. This guarantees that every archived log it removes has already been applied, while unapplied logs are kept.
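
A hypothetical cron deployment for the oracle user (the script path is an assumption; adjust to wherever you saved it):

00 2 * * * /home/oracle/scripts/del_dg_arch.sh >/dev/null 2>&1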

A shell script to monitor whether a Data Guard standby is applying logs

I have long been looking for a way to monitor Data Guard that is both convenient and practical. A few options come to mind:
1. Compare the alert logs of the primary and the standby; this requires ssh so one node can fetch the other's alert log.
2. Query v$archived_log (or related views) on both sides and compare; this requires login credentials for the other database.
3. Query v$archived_log on the standby alone and roughly judge whether the DG is working.
I went with option 3. Most DG monitoring exists so that someone notices log apply problems early and intervenes manually, which reduces the odds of having to resolve a gap or rebuild the DG. This check catches the large majority of archive apply problems; handled promptly, it can save a great deal of work, and maybe even your data.

#!/bin/bash
source ~/.bash_profile
#check time(M)
export CHECK_M=120
export RESULT_FILE=/tmp/dg_switch_check.log
$ORACLE_HOME/bin/sqlplus -silent "/ as sysdba" <<XFF>/tmp/check_dg.log
set pagesize 0 feedback off verify off heading off echo off
 select ceil((sysdate-next_time)*24*60) "M"
 from v\$archived_log
 where applied='YES' AND SEQUENCE#=(SELECT MAX(SEQUENCE#)
 FROM V\$ARCHIVED_LOG WHERE applied='YES');
exit;
XFF
GET_M=`cat /tmp/check_dg.log`
rm /tmp/check_dg.log
if [ ${CHECK_M} -lt ${GET_M} ];
then
    echo "check dataguard time:`date`">$RESULT_FILE
    echo "The last time application archivelog happened in $GET_M minutes ago">>$RESULT_FILE
else
 echo ''>$RESULT_FILE
fi

Set CHECK_M according to how often your primary switches logs: the script compares the time of the most recently applied archived log with the current time to decide whether the DG is healthy. Given how the check works, you can watch the result file and send alerts by mail, SMS, or other means.
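
A hypothetical deployment: run the check from cron and mail when the result file contains an alarm (the script path and address are placeholders; the grep pattern matches the alarm text the script writes):

*/30 * * * * /home/oracle/scripts/check_dg.sh && grep -q "minutes ago" /tmp/dg_switch_check.log && mail -s "DG apply lag" dba@example.com < /tmp/dg_switch_check.log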

Fixing a system that cannot boot because of an /etc/fstab error

When your luck is off, even lab work goes wrong: tonight I once again wrote a bad entry into /etc/fstab and the system would not come up. I did not take screenshots while fixing it, so I later found a few pictures on the web to illustrate the approach.
Booting fails at the filesystem check; enter the root password to drop into repair filesystem mode.

Trying to edit /etc/fstab shows that the filesystem is mounted read-only.

Remount the root filesystem read-write with mount -n -o remount,rw /

Fix /etc/fstab, removing the bad entry, then restart the system with init 6 or reboot.
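
The repair sequence, condensed (run from the repair filesystem shell):

mount -n -o remount,rw /      # make the root filesystem writable again
vi /etc/fstab                 # remove or comment out the bad entry
reboot                        # or: init 6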