11.2 CRS startup timeout: working around it with dd on npohasd



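Author: 惜分飞 © All rights reserved [May not be reproduced in any form without the author's consent; the author reserves the right to pursue legal action otherwise.]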

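A fibre link failure at a customer site made the voting disks unavailable, which caused the host to reboot. After the reboot, the cluster did not start normally.
Operating system and CRS version: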

[root@rac1 ~]# cat /etc/redhat-release 
CentOS release 6.9 (Final)
[root@rac1 ~]# sqlplus -v

SQL*Plus: Release 11.2.0.4.0 Production

Starting crs manually hung for a while and then failed with an error:

[root@rac1 ~]# crsctl start crs
CRS-4640: Oracle High Availability Services is already active
CRS-4000: Command Start failed, or completed with errors.

Checking the startup processes:

[grid@rac1 ~]$ ps -ef|grep d.bin
root       7043      1  0 11:48 ?        00:00:00 /u01/app/grid/product/11.2.0/bin/ohasd.bin reboot
root       8311      1  0 11:53 ?        00:00:00 /u01/app/grid/product/11.2.0/bin/ohasd.bin reboot
grid      10984  10954  0 12:10 pts/2    00:00:00 grep d.bin
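
The listing above shows two "ohasd.bin reboot" processes, started five minutes apart. As a minimal sketch (the function name count_ohasd is hypothetical and not part of any Oracle tooling), the same check can be expressed as a filter over `ps -ef` style output:

```shell
#!/bin/sh
# count_ohasd: hypothetical helper for this sketch.
# Reads `ps -ef` style output on stdin and prints how many lines
# mention ohasd.bin. The dot is escaped in the pattern, so the awk
# command line itself (which contains "ohasd\.bin") does not match.
count_ohasd() {
    awk '/ohasd\.bin/ { n++ } END { print n + 0 }'
}

# Usage on a live node:
#   ps -ef | count_ohasd
```

Reading from stdin keeps the helper easy to try against a saved process listing as well as on a live node.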

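Based on experience, this failure is very likely BUG:17229230 - DURING REBOOT, "OHASD.BIN REBOOT" REMAINS SLEEPING. As a temporary workaround, start crs in one session, then run the following in a second session: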

/bin/dd if=/var/tmp/.oracle/npohasd of=/dev/null bs=1024 count=1
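
The dd command above reads from /var/tmp/.oracle/npohasd. Assuming that file is a named pipe (FIFO), as the "np" prefix suggests, the read blocks until some process opens the pipe for writing, which is why dd has to run in its own session. The sketch below wraps the same command in a hypothetical helper, read_npohasd, that first checks the path really is a FIFO; the helper name and the NPOHASD variable are illustrative assumptions only.

```shell
#!/bin/sh
# Path taken from the article; override NPOHASD if your layout differs.
NPOHASD=${NPOHASD:-/var/tmp/.oracle/npohasd}

# read_npohasd: hypothetical wrapper around the dd workaround.
# Returns 1 without touching anything if the path is not a named pipe.
read_npohasd() {
    pipe=$1
    if [ ! -p "$pipe" ]; then
        echo "not a named pipe: $pipe" >&2
        return 1
    fi
    # Same command as in the article. It blocks until a writer opens
    # the pipe, so run it in a separate session from `crsctl start crs`.
    /bin/dd if="$pipe" of=/dev/null bs=1024 count=1
}

# Usage in the second session (as root):
#   read_npohasd "$NPOHASD"
```

The FIFO check only guards against a wrong or missing path; it does not change what the workaround does.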

After that, crs started normally:

[root@rac1 ~]# crsctl start crs
CRS-4123: Oracle High Availability Services has been started.
[root@rac1 ~]# crsctl status res -t -init
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS       
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
      1        ONLINE  OFFLINE                               Instance Shutdown   
ora.cluster_interconnect.haip
      1        ONLINE  OFFLINE                                                   
ora.crf
      1        ONLINE  ONLINE       rac1                                         
ora.crsd
      1        ONLINE  OFFLINE                                                   
ora.cssd
      1        ONLINE  OFFLINE                               STARTING            
ora.cssdmonitor
      1        ONLINE  ONLINE       rac1                                         
ora.ctssd
      1        ONLINE  OFFLINE                                                   
ora.diskmon
      1        OFFLINE OFFLINE                                                   
ora.evmd
      1        ONLINE  OFFLINE                                                   
ora.gipcd
      1        ONLINE  ONLINE       rac1                                         
ora.gpnpd
      1        ONLINE  ONLINE       rac1                                         
ora.mdnsd
      1        ONLINE  ONLINE       rac1                                         
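
In the output above, several resources still have TARGET ONLINE but STATE OFFLINE while the stack is coming up. As a rough sketch, assuming the two-line-per-resource layout shown here (a name line starting with "ora." followed by one state line), a hypothetical filter named pending_resources can list the resources that have not yet reached their target:

```shell
#!/bin/sh
# pending_resources: hypothetical helper for this sketch.
# Reads `crsctl status res -t -init` output on stdin and prints the
# names of resources whose TARGET is ONLINE but whose STATE is not.
pending_resources() {
    awk '
        /^ora\./ { name = $1; next }   # remember the resource name line
        name != "" {
            # State line fields: instance number, TARGET, STATE, ...
            if ($2 == "ONLINE" && $3 != "ONLINE") print name
            name = ""
        }
    '
}

# Usage on a live node (as root):
#   crsctl status res -t -init | pending_resources
```

Only the first state line after each resource name is inspected, which matches the single-instance layout of the -init listing shown here.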

After the dd command was terminated, the cluster started normally.