kdsgrp1 – Database SOS

ORA-600 kdsgrp1 error description

Contact: phone/WeChat (+86 17813235971), QQ (107644445)

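ORA-600 [kdsgrp1] is raised when a fetch operation cannot find the row it expects. The error is detected in memory, so it can be a memory-only problem or a symptom of corruption on disk.

This error can indicate (but is not limited to) any of the following:

Lost writes
Parallel DML issues
Index corruption
Data block corruption
Consistent read [CR] issues
Buffer cache corruption

Note 285586.1 - ORA-600 [kdsgrp1] provides the complete list of known issues.
Each bug has a short description of the circumstances in which it is hit. Selecting your database version shortens the bug list to the bugs that could affect you.

The problem can be intermittent, or it can persist until the underlying disk-level corruption is repaired. Intermittent problems are likely memory-based (although intermittent access to a corrupt block can be mistaken for an intermittent memory problem).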

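Common workarounds

If the problem exists only in memory, flushing the buffer cache may resolve it immediately, but keep in mind the performance impact on a production system:

alter system flush buffer_cache;

If the problem is an intermittent consistent read issue, try disabling rowCR, an optimization that reduces consistent read rollbacks during queries, by setting _row_cr=FALSE in the initialization file. This can degrade query performance, so check the ratio of the two statistics "RowCR hits" / "RowCR attempts" before deciding whether to use the workaround.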
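
As a small illustration of the ratio check, here is a minimal Python sketch. It assumes the two counters have already been read from a statistics view such as v$sysstat; the sample numbers are hypothetical. The source gives no threshold. My reading, which is an assumption, is that a high ratio means rowCR is saving a lot of rollback work, so disabling it is likely to cost more.

```python
def rowcr_hit_ratio(hits: int, attempts: int) -> float:
    """Return RowCR hits / RowCR attempts, or 0.0 when nothing was attempted."""
    if hits < 0 or attempts < 0:
        raise ValueError("statistics cannot be negative")
    if attempts == 0:
        return 0.0
    return hits / attempts


# Hypothetical sample values, not taken from a real system.
sample = {"RowCR attempts": 200_000, "RowCR hits": 150_000}
ratio = rowcr_hit_ratio(sample["RowCR hits"], sample["RowCR attempts"])
print(f"RowCR hit ratio: {ratio:.2%}")
```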

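If the error is the result of index corruption, drop and recreate the index. Note that this requires a maintenance window on a production system.

Root cause determination
The first step in finding the root cause is to examine the trace files that were generated. An ORA-600 writes a trace file to the trace directory and an incident file to the incident directory under its incident ID.
The top of the trace file shows the SQL that was running when the error was hit:

----- Current SQL Statement for this session (sql_id=9mamr7xn4wg7x) -----

This immediately shows which data objects were accessed. Searching the trace file for the string 'Plan Table' finds the SQL execution plan dumped in the trace. For a persistent problem, the plan identifies the indexes that were accessed, and therefore the indexes to validate for block corruption:

SQL> analyze index <OWNER>.<INDEX NAME> validate structure online;

Index analyzed.

Another approach is to use the file and block information in the trace file. Near the top of the trace file is the information about the block where the corruption was found:

SESSION ID:(3202.5644) 2011-03-19 04:12:16.910
row 07c7c8c7.a continuation at
file# 31 block# 510151 slot 11 not found

This information can be used to look up the object details in dba_extents:

select owner, segment_name, segment_type, partition_name, tablespace_name
from dba_extents
where relative_fno = <file id>
and <block#> between block_id and (block_id+blocks-1);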
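
When there are many trace files to go through, the file#, block# and slot values can be pulled out with a script instead of by eye. The following Python sketch is only an illustration: the regular expression is based on the two trace excerpts quoted on this page (with and without the 0x prefix), and other Oracle versions may format the line differently. The function name parse_kdsgrp1 is made up for this example.

```python
import re

# Trace excerpt quoted above.
TRACE = """\
row 07c7c8c7.a continuation at
file# 31 block# 510151 slot 11 not found
"""

# Matches "row <rdba>.<slot> continuation at [0x<rdba>.<slot>] file# N block# N slot N not found".
PATTERN = re.compile(
    r"row\s+(?:0x)?(?P<rdba>[0-9a-fA-F]+)\.(?P<row_slot>[0-9a-fA-F]+)\s+continuation at\s+"
    r"(?:0x[0-9a-fA-F]+\.[0-9a-fA-F]+\s+)?"
    r"file#\s+(?P<file>\d+)\s+block#\s+(?P<block>\d+)\s+slot\s+(?P<slot>\d+)\s+not found"
)


def parse_kdsgrp1(text):
    """Return rdba, file#, block# and slot from a kdsgrp1 trace excerpt, or None."""
    m = PATTERN.search(text)
    if m is None:
        return None
    return {
        "rdba": m["rdba"].lower(),
        "file": int(m["file"]),
        "block": int(m["block"]),
        "slot": int(m["slot"]),
    }


info = parse_kdsgrp1(TRACE)
print(info)
```

The file and block values are what go into the relative_fno and block range predicates of the dba_extents query.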

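The object can then be validated, for example a table and all of its indexes:

analyze table <OWNER>.<TABLE NAME> validate structure cascade online;

Keep in mind that the permanent corruption may not be in the object's own blocks. Examples include:

Dictionary corruption caused by transportable tablespace operations: check dba_tablespaces to see whether the tablespace was plugged in.
Lost writes in an ASM disk group mirror, most likely seen during heavy IO and disk resync activity. To check for this, run dbms_diskgroup.checkfile to detect mirror differences.

If analyze reports no corruption, check whether the table has any chained rows. If it does, there may be undetected corruption, and the problem will recur whenever the SQL runs. Exporting the table also detects this.

If both analyze and an export of the table (when chained rows exist) report no errors, treat the problem as a consistent read issue.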
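
The decision steps in the last few paragraphs can be written down as a small function. This is only a sketch of the reasoning as I read it, with made-up names and labels. The three flags are assumed to come from running analyze, checking for chained rows, and exporting the table by hand.

```python
def classify_kdsgrp1(analyze_reports_corruption: bool,
                     has_chained_rows: bool,
                     export_reports_error: bool) -> str:
    """Map the three manual checks to the likely nature of the problem."""
    if analyze_reports_corruption:
        # analyze ... validate structure found a table/index mismatch.
        return "object corruption"
    if has_chained_rows and export_reports_error:
        # analyze is clean, but reading every chained row fails.
        return "undetected chained-row corruption"
    # Neither analyze nor export complains.
    return "consistent read problem"


print(classify_kdsgrp1(False, True, True))
```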

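Once you understand the nature of the problem, review the list of known bugs and determine which one matches your conditions. If you cannot tell which issue is affecting you, file a service request with Oracle Support and upload the RDBMS and ASM (if applicable) instance alert logs from all nodes, any trace and incident files that were generated, and a full description of the nature of the problem.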

Bug	Fixed	Description
32311758	23.1.0.0.0	ORA-600: internal error code, arguments: [kdsgrp1] on spatial physical standby database
32065006	23.1.0.0.0	Sdo_filter() fails with ORA-600: internal error code, arguments: [kdsgrp1]
32022223	19.12, 21.3.0.0.0	Sdo_filter fails with ORA-600: internal error code, arguments: [kdsgrp1]
28392179	19.11, 21.1.0.0.0	ORA-00600 [kdsgrp1] error on standby after intensive insert on the primary DB
29506942	18.11, 18.18, 19.8, 20.1	sdo_filter fails with ORA-600: internal error code, arguments: [kdsgrp1]
29311927	18.11, 18.18, 19.8, 20.1	sdo_filter fails with ORA-600: internal error code, arguments: [kdsgrp1]
28547478	12.2.0.1.DBRU:200714, 18.11, 18.18, 19.2, 20.1	ORA-600 [kdsgrp1] When Running Workload
27869764	19.1	Sdo_filter() call coredumps with [kdsgrp1] exception [optimized mbrs]
27397048	12.1.0.2.190115, 12.2.0.1.DBRU:190115, 18.18, 18.5, 19.1	Intermittent ORA-600[kdsgrp1] Raised By Query Using Index
26203182	11.2.0.4.200114, 12.1.0.2.190716, 12.2.0.1.DBRU:190115, 18.1	Lost Writes on ZFS if DNFS is enabled causing several Internal Errors. ORA-600 [kdsgrp1] ORA-8103 ORA-600 [3020] ORA-752 ORA-756
22581771	12.2.0.1.DBRU:180417, 18.1	ORA-600 [kdsgrp1] On Domain Index With Concurrent Insert And Select (With Clause)
21180699	18.1	ORA-7445/ ORA-00600 argument [kdibowrite()] / [kdibc3position()+78] / [20003] / [kcfrbd_3] / [25027] [kdsgrp1] with Execution plan ‘BITMAP’ access
22267274	12.2.0.1	CDB: Hit ORA-600 [kdsgrp1] and ORA-600 [4042]
17273253	12.1.0.1.1, 12.2.0.1	Various ORA-600 corruption errors with ASM
16195231	11.2.0.3.BP21, 11.2.0.4, 12.1.0.2, 12.2.0.1	ORA-7445 / ORA-600 from COMPRESSED table with LONG column
14576755	12.1.0.1.4, 12.1.0.2, 12.2.0.1	Corruption type ORA-600 errors from heavy concurrent DML on index cluster table
33005241	19.16, 21.7	ORA-00600 [kdsgrp1] error when using row CR
33599665	19.17	ORA-600 [kdsgrp_lost_piece] / ORA-600 [kdsgrp1-kdsgrp] While Running Flashback Query on FDA Enabled Table
31843845	19.13, 21.5	ORA-600 [kdsgrp1] Error or Wrong / Duplicate Results When Advanced Compressed Index Skip Scan Used to Access Rows
32417227	19.12	OLTP Compression Lock Bit Not Respected In Uncompressed Blocks
31228670	12.1.0.2.201020, 12.2.0.1.DBRU:201020, 18.12, 19.9	Corruption LOST Write : Rebalance disk resync causing lost write, mirror mismatches , several errors can be reported
31192039	12.2.0.1.DBRU:201020, 18.12, 18.18, 19.9	ORA-1554 and/or ORA-600 [kdsgrp1] While Deleting From A Compressed Index
31642462	19.14	ORA-600 [kdsgrp1-kdsgrp] when doing a version query using rowid having large row data with hybrid columnar compression enabled.
30651570	18.14, 18.18, 19.10	ORA-600: [kdsgrp1] After INSERT With APPEND Hint In Compressed Partitioned Table
32596207	21.0	ORA-600[kdsgrp1] failure using sdo_filter() function
29428230	18.11, 18.18, 19.8, 20.1	sdo_filter fails with ORA-600: internal error code, arguments: [kdsgrp1]
29362596	18.11.0.0.200714DBRU, 18.11, 18.18, 19.8.0.0.200714DBRU, 19.8, 20.1	sdo_filter fails with ORA-600: internal error code, arguments: [kdsgrp1]
29350868	18.11.0.0.200714DBRU, 18.11, 18.18, 19.8.0.0.200714DBRU, 19.8, 20.1	sdo_filter fails with ORA-600: internal error code, arguments: [kdsgrp1]
29139070	18.11.0.0.200714DBRU, 18.11, 18.18, 19.8.0.0.200714DBRU, 19.8, 20.1	very small adjacent insert causes index corruption ORA-600[kdsgrp1]
29048605	19.3.0.0.190416DBRU, 19.3, 20.1	index truncation causes index corruption ORA-600[kdsgrp1]
28881035	19.2, 19.2.0.0.181005R, 20.1	very small update causes index corruption ORA-600[kdsgrp1]
28802077	19.2, 19.2.0.0.181005R, 20.1	sdo_filter() fails with ORA-600[kdsgrp1]
27063461	19.11, 20.1	Physical Standby Hits ORA-600[kdbdmp_full:non-KDDBTDATA block. Use kcbtdu for it.]
28511632	23.4	Corruption LOST Write : Incomplete RMAN DUPLICATE can allow data file overwrites at Source database
27394954	19.1	sdo_filter fails with ORA-600 [kdsgrp1] after delete,insert,delete,insert,commit
27658186	12.2.0.1, 12.2.0.1.DBRU:190115, 18.5	ORA-600 [kdsgrp1] / Some rows not indexed in Text index in highly concurrent environment
24699619	12.2.0.1.171121DBRU, 12.2.0.1.171130WINDBBP, 12.2.0.1.DBRU:171121, 18.1	xdbstress hit ora 600 [kdsgrp1]
12690729	18.1	ORA-600 [kdsgrp1] errors when the active standby database recovery is enabled using CURRENT LOGFILE
22575209	12.2.0.1	ORA-600 [kdsgrp1] ORA-600 [25027] ORA-8103 ORA-1578 ORA-3254 in ADG Standby Database for Full Scan on ASSM segment – superseded
22519146	12.1.0.2.171017, 12.2.0.1	ORA-600 [kdsgrp1] or ORA-600 [kdsgrpcalcblockcount: hwmbno<=dbabno] or ORA-8103 in 12c on HCC Table in EXADATA
22241601	12.2.0.1	ORA-600 [kdsgrp1] ORA-1555 / ORA-600 [ktbdchk1: bad dscn] due to Invalid Commit SCN in INDEX block
21973601	12.2.0.0, 12.2.0.1	Querying a partitioned table may fail with ORA-00600 [kdsgrp1]
21634686	12.2.0.1	ORA-600 [kdsgrp1] / ORA-600 [ktfbhget:clsviol_kcbgcur_9] With Hybrid Columnar Compression (HCC)
21532755	11.2.0.4.171017, 12.1.0.2.171017, 12.2.0.1	ORA-600 [25027] By Concurrent queries while Create Index Online or ORA-8102 Table/Index mistmatch after Create Index Online or ONLINE_INDEX_CLEAN wait for DMLs
21096955	12.2.0.1	ORA-600 [kdsgrp1] / ORA-600 [ktfbhget:clsviol_kcbgcur_9] With Hybrid Columnar Compression (HCC)
19689979	11.2.0.4.170718, 12.1.0.2.160119, 12.1.0.2.DBBP07, 12.2.0.1	ORA-8103 or ORA-600 [ktecgsc:kcbz_objdchk] or Wrong Results on PARTITION table after TRUNCATE in 11.2.0.4 or above
19630914	12.2.0.1	ORA-600 [kdsgrp1] And Other Errors ORA-600 [6033] When BigSCN Testing Is Enabled
19614585	11.2.0.4.BP17, 12.1.0.2.DBBP03, 12.2.0.1	Wrong Results / ORA-600 [kksgaGetNoAlloc_Int0] / ORA-600 [12406] / ORA-7445 / ORA-8103 / ORA-1555 from query on RAC ADG Physical Standby Database
18607546	11.2.0.4.6, 11.2.0.4.BP16, 12.1.0.2.3, 12.1.0.2.DBBP06, 12.2.0.1	ORA-600 [kdblkcheckerror]..[6266] corruption with self-referenced chained row. ORA-600 [kdsgrp1] / Wrong Results / ORA-8102
18311351	12.2.0.1	ORA-1/ORA-10388 ORA-7445 [kdzsbuffercupiece_col] ORA-600 [kdsgrp1]/ORA-1499 Wrong Results, Index Inconsistency after Parallel Direct Path Insert of HCC table in EXADATA
17779978	12.2.0.1	ORA-00600 [kdsgrp1] & ORA-7445 [hshhsv] & ORA-7445 [pkrcd] errors on CDB
17761775	11.2.0.3.9, 11.2.0.3.BP22, 11.2.0.4.2, 11.2.0.4.BP03, 12.1.0.1.3, 12.1.0.2, 12.2.0.1	ORA-600 [kclchkblkdma_3] ORA-600 [3020] or ORA-600 [kcbchg1_16] Join of temp and permanent table in RAC might lead to corruption – superseded
17357359	12.1.0.2, 12.2.0.1	ORA-600 [kdsgrp1] during fetch by rowid
17160362	12.1.0.2, 12.2.0.1	ORA-600 [kdsgrp1] & [kclchkblk_3] & [kclchkblkdma_3] in rdbms
16849623	12.1.0.2, 12.2.0.1	ORA-600 [kdsgrp1] While Running Workload On Tables With Chained Rows
16698629	12.1.0.2, 12.2.0.1	ORA-600 [kdsgrp1] executing SELECT on table modified by a loosely coupled clusterwide global transaction
16555614	12.1.0.2, 12.2.0.1	mdidxridchk() causes buffer overrun problem when more than 4000 rows selected
16345143	12.2.0.1	Event 10231 does not skip row for IOT with non-existent nrid
14044260	12.1.0.2, 12.2.0.1	Update DML with long bind LOB that moves row to new partition fails with ORA-600 [kdsgrp1] – superseded
17449815	11.2.0.4.4, 11.2.0.4.BP11, 12.1.0.2, 12.2.0.1	ORA-8102 ORA-1499 after ORA-1/ORA-2291 by MERGE with DML ERROR LOGGING
17204397	12.1.0.2, 12.2.0.1	ORA-8005 ORA-8103 ORA-1410 ORA-600 [kdsgrp1] on Bitmap Index. Root Block may be repeatedly pinned/unpinned
16844448	11.2.0.3.9, 11.2.0.3.BP22, 11.2.0.4, 12.1.0.2, 12.2.0.1	ORA-600 [3020] after flashback database in a RAC
16563781	12.1.0.2, 12.1.0.2.180116, 12.2.0.1	version query may return wrong result on a table in TTS tablespace
21425496	11.2.0.4.190416, 12.1.0.1, 12.1.0.2.190716	ORA-752 or ORA-600 [3020] on recovery of Block Cleanout Operation OP:4.6
17518816	12.1.0.0, 12.1.0.1	ORA-600 [kdsgrp1] on select statements on a Active Dataguard Standby database
14790903	11.2.0.4, 12.1.0.1	ora 600 [kdsgrp1]
14527172	12.1.0.1	ORA-600 [4097] And [kdsgrp1] After unplugging and plugging the PDB In RAC Environment
13614906	12.1.0.1	ORA-600 [kdsgrp1] due to missing weak changes from an XA transaction in RAC – superceded
13399500	11.2.0.3.BP15, 11.2.0.4, 12.1.0.1	ORA-600 [kdsgrp1] when updating a chained rows on a ehcc table
13146182	11.2.0.2.11, 11.2.0.2.BP17, 11.2.0.3.10, 11.2.0.3.BP07, 11.2.0.4, 12.1.0.1	ORA-1499 ORA-8102 ORA-600 [kdsgrp1] Bitmap Index / Table mismatch
12821418	11.2.0.3.8, 11.2.0.3.BP18, 11.2.0.4, 12.1.0.1	Direct NFS appears to be sending zero length windows to storage device. It may also cause Lost Writes
12619529	11.2.0.3.BP18, 11.2.0.4, 12.1.0.1	ORA-600[kdsgrp1] from SELECT on plugged in tablespace with FLASHBACK
12330911	12.1.0.1	EXADATA LSI firmware for lost writes
10633840	11.2.0.2.7, 11.2.0.2.BP17, 11.2.0.3, 12.1.0.1	ORA-1502 on insert statement on INTERVAL partitioned table. ORA-8102 / ORA-1499 Index inconsistency
10245259	11.2.0.2.BP03, 11.2.0.3, 12.1.0.1	PARALLEL INSERT with +NOAPPEND hint or if PARALLEL INSERT plan is executed in SERIAL corrupts index and causes wrong results
10209232	11.1.0.7.7, 11.2.0.1.BP08, 11.2.0.2.1, 11.2.0.2.BP02, 11.2.0.2.GIBUNDLE01, 11.2.0.3, 12.1.0.1	ORA-1578 / ORA-600 [3020] Corruption. Misplaced Blocks and Lost Write in ASM
10205230	11.2.0.1.6, 11.2.0.1.BP09, 11.2.0.2.2, 11.2.0.2.BP04, 11.2.0.3, 12.1.0.1	ORA-600 / corruption possible during shutdown in RAC
9770451	10.2.0.5.3, 11.2.0.2.1, 11.2.0.2.BP02, 11.2.0.3, 12.1.0.1	ORA-600 [20022] with bitmap indexes
9734539	11.2.0.2, 12.1.0.1	ORA-8102 / ORA-1499 corrupt index after update/merge using QUERY REWRITE
9469117	10.2.0.5.4, 11.2.0.1.BP04, 11.2.0.2, 12.1.0.1	Corrupt index after PDML executed in serial. Wrong results. OERI[kdsgrp1]/ORA-1499 by analyze
9457185	11.2.0.1.BP12, 11.2.0.2, 12.1.0.1	Intermittent ORA-600 [kdsgrp1] during CR read
9231605	11.1.0.7.4, 11.2.0.1.3, 11.2.0.1.BP02, 11.2.0.2, 12.1.0.1	Block corruption with missing row on a compressed table after DELETE
9145541	11.1.0.7.4, 11.2.0.1.2, 11.2.0.2, 12.1.0.1	OERI[25027]/OERI[4097]/OERI[4000]/ORA-1555 in plugged datafile after CREATE CONTROLFILE in 11g
9061269	11.2.0.2, 12.1.0.1	ORA-600 [kdsgrp1] executing CTX_QUERY.COUNT_HITS during concurrent sync Text index
8951812	11.2.0.2, 12.1.0.1	Corrupt index by rebuild online. Possible OERI [kddummy_blkchk] by SMON
8837919	11.2.0.2, 12.1.0.1	DBV / RMAN enhanced to detect ASSM blocks with ktbfbseg but not ktbfexthd flag set as in Bug 8803762
8803762	11.1.0.7.6, 11.2.0.1.2, 11.2.0.1.BP06, 11.2.0.2, 12.1.0.1	ORA-600[kdsgrp1], ORA-600[25027] or wrong results on 11g database upgrade from 9i
8771916	10.2.0.5.3, 11.1.0.7.6, 11.2.0.1.BP12, 11.2.0.2, 12.1.0.1	OERI [kdsgrp1] during CR read
8635179	10.2.0.5, 11.2.0.2, 12.1.0.1	Solaris: directio may be disabled for RAC file access. Corruption / Lost Write
8597106	11.2.0.1.BP06, 11.2.0.2, 12.1.0.1	Lost Write in ASM when normal redundancy is used
8546356	10.2.0.5.1, 11.2.0.1.3, 11.2.0.1.BP07, 11.2.0.2, 12.1.0.1	ORA-8102/ORA-1499/OERI[kdsgrp1] Composite Partitioned Index corruption after rebuild ONLINE in RAC
7710827	11.2.0.2, 12.1.0.1	Index rebuild or Merge partition causes wrong results in concurrent reads instead of ORA-8103
7705591	10.2.0.5, 11.2.0.1.1, 11.2.0.1.BP04, 11.2.0.2, 12.1.0.1	Corruption with self-referenced row in MSSM tablespace. Wrong Results / OERI[6749] / ORA-8102
7251049	11.2.0.1.BP08, 11.2.0.2, 12.1.0.1	Corruption in bitmap index introduced when using transportable tablespaces
16579042	11.2.0.4	ORA-600 [kjbmpocr:alh] ORA-600 [kclchkblkdma_3] by LMS in RAC which may lead to corruption
9527635	11.2.0.1.BP04, 11.2.0.2, 12.1.0.1	ORA-00600 [kdsgrp1] On Exadata
8650661	11.1.0.7.2, 11.2.0.1	OERI / corruption type errors using global transactions in RAC
8588540	11.1.0.7.2, 11.2.0.1	Corruption / ORA-8102 in RAC with loopback DB links between instances
7682186	11.2.0.1	ORA-600[kdsgrp1] on consistent read in RAC with global transaction
7329252	10.2.0.4.4, 10.2.0.5, 11.1.0.7.5, 11.2.0.1	ORA-8102/ORA-1499/OERI[kdsgrp1] Index corruption after rebuild index ONLINE
7289224	11.2.0.1	ORA-600 [kdsgrp1] on CR read with parallel query
6791996	11.2.0.1	ORA-600 errors for a DELETE with self referencing FK constraint and BITMAP index
6772911	10.2.0.5, 11.1.0.7.3, 11.2.0.1	OERI[12700] OERI[qertbFetchByRowID] OERI[kdsgrp1] due to bad CR rollback of INDEX block
6445948	10.2.0.4.4, 10.2.0.5, 11.1.0.7.8, 11.2.0.1	Intermitent ORA-600 [kdsgrp1] accessing table with a LONG
6404058	10.2.0.5, 11.1.0.7, 11.2.0.1	OERI:12700 OERI:kdsgrp1 OERI:qertbFetchByRowID wrong results from CR rollback of split index leaf
6129296	11.2.0.1	ORA-600 [kdsgrp1] by PARALLEL select for update with LOB
5621677	10.2.0.4, 11.1.0.6	Logical corruption with PARALLEL update
5374225	10.2.0.4, 11.1.0.6	SDO_FILTER query fails with OERI[kdsgrp1]
5368945	10.2.0.5, 11.1.0.6	ORA-600 [kdsgrp1] on Index Organized Table with Overflow
4883635	10.2.0.4, 11.1.0.6	MERGE (with DELETE) can produce wrong results or Logical corruption in chained rows
3408192	9.2.0.6, 10.1.0.3, 10.2.0.1	Heavy concurrent DML scenarios can cause $R table to contain deleted rowids

Resolving ORA-600 kdsgrp1 on CON$

The database reports ORA-00600: internal error code, arguments: [kdsgrp1], [], [], [], [], [], [], [], [], [], [], []

Thread 1 advanced to log sequence 23861 (LGWR switch)
  Current log# 7 seq# 23861 mem# 0: /oradata/easdb/redo07.log
Tue Nov 15 10:00:42 2016
Errors in file /u01/oracle/diag/rdbms/easdb/easdb/trace/easdb_dw00_3165.trc  (incident=908262):
ORA-00600: internal error code, arguments: [kdsgrp1], [], [], [], [], [], [], [], [], [], [], []
Incident details in: /u01/oracle/diag/rdbms/easdb/easdb/incident/incdir_908262/easdb_dw00_3165_i908262.trc
Tue Nov 15 10:00:55 2016
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Tue Nov 15 10:00:56 2016
Errors in file /u01/oracle/diag/rdbms/easdb/easdb/trace/easdb_dw00_3165.trc  (incident=908263):
ORA-00600: internal error code, arguments: [kdsgrp1], [], [], [], [], [], [], [], [], [], [], []
ORA-06512: at "SYS.KUPW$WORKER", line 1751
ORA-06512: at line 2
Incident details in: /u01/oracle/diag/rdbms/easdb/easdb/incident/incdir_908263/easdb_dw00_3165_i908263.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
DW00 terminating with fatal err=600, pid=40, wid=1, job SYSTEM.
Tue Nov 15 10:01:01 2016
Thread 1 advanced to log sequence 23862 (LGWR switch)
  Current log# 2 seq# 23862 mem# 0: /oradata/easdb/redo02.log
Tue Nov 15 10:01:23 2016
Errors in file /u01/oracle/diag/rdbms/easdb/easdb/trace/easdb_dm00_3163.trc  (incident=908254):
ORA-31671: Worker process DW00 had an unhandled exception.
ORA-00600: internal error code, arguments: [kdsgrp1], [], [], [], [], [], [], [], [], [], [], []
ORA-06512: at "SYS.KUPW$WORKER", line 1751
ORA-06512: at line 2
Incident details in: /u01/oracle/diag/rdbms/easdb/easdb/incident/incdir_908254/easdb_dm00_3163_i908254.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Tue Nov 15 10:01:26 2016
Tue Nov 15 10:01:28 2016
Thread 1 advanced to log sequence 23863 (LGWR switch)
  Current log# 4 seq# 23863 mem# 0: /oradata/easdb/redo04.log
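
For a long alert log, the trace file, incident number and ORA-600 first argument can be collected with a short script. The Python sketch below is only an illustration, based on the 11g-style lines shown above. The function name incidents and the dictionary layout are made up for this example.

```python
import re

# Three lines copied from the alert log above.
ALERT = """\
Errors in file /u01/oracle/diag/rdbms/easdb/easdb/trace/easdb_dw00_3165.trc  (incident=908262):
ORA-00600: internal error code, arguments: [kdsgrp1], [], [], [], [], [], [], [], [], [], [], []
Incident details in: /u01/oracle/diag/rdbms/easdb/easdb/incident/incdir_908262/easdb_dw00_3165_i908262.trc
"""

ERR = re.compile(r"Errors in file (\S+)\s+\(incident=(\d+)\):")
ORA600 = re.compile(r"ORA-00600: .*?\[([^\]]*)\]")


def incidents(alert_text):
    """Pair each 'Errors in file' line with the first ORA-00600 argument after it."""
    found = []
    current = None
    for line in alert_text.splitlines():
        m = ERR.search(line)
        if m:
            current = {"trace": m.group(1), "incident": int(m.group(2)), "ora600": None}
            found.append(current)
            continue
        m = ORA600.search(line)
        if m and current is not None and current["ora600"] is None:
            current["ora600"] = m.group(1)
    return found


for item in incidents(ALERT):
    print(item["incident"], item["ora600"], item["trace"])
```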

Information in the trace file

*** 2016-11-15 10:00:35.977
* kdsgrp1-1: *************************************************
            row 0x004459e6.26 continuation at
            0x004459e6.26 file# 1 block# 285158 slot 38 not found
KDSTABN_GET: 0 ..... ntab: 1
curSlot: 38 ..... nrows: 208
kdsgrp - dump CR block dba=0x004459e6
Block header dump:  0x004459e6
 Object id on Block? Y
 seg/obj: 0x1c  csc: 0x01.c712f743  itc: 3  flg: -  typ: 1 - DATA
     fsl: 0  fnx: 0x0 ver: 0x01
 Itl           Xid                  Uba         Flag  Lck        Scn/Fsc
0x01   0x000b.015.0036d715  0x00c01bba.0fbd.02  C---    0  scn 0x0001.c6b4cb1a
0x02   0x000c.004.00044d36  0x04c0dd93.3eec.33  C---    0  scn 0x0001.c6d2c65b
0x03   0x000d.008.00008eb9  0x04c0777a.10e3.02  --U-    2  fsc 0x0056.c7346f21
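
The hex values in this dump can be checked by hand. A relative data block address (rdba) for a smallfile tablespace packs the relative file number into the top 10 bits and the block number into the low 22 bits, which is the same split that DBMS_UTILITY.DATA_BLOCK_ADDRESS_FILE and DATA_BLOCK_ADDRESS_BLOCK perform. A minimal Python sketch of that arithmetic (bigfile tablespaces use a different layout and are not covered here):

```python
FILE_SHIFT = 22                      # low 22 bits hold the block number
BLOCK_MASK = (1 << FILE_SHIFT) - 1   # 0x3FFFFF


def decode_rdba(rdba: int):
    """Split a smallfile rdba into (relative file#, block#)."""
    return rdba >> FILE_SHIFT, rdba & BLOCK_MASK


file_no, block_no = decode_rdba(0x004459e6)   # dba from "dump CR block dba=0x004459e6"
object_id = int("0x1c", 16)                   # from "seg/obj: 0x1c"
print(file_no, block_no, object_id)
```

The result agrees with the "file# 1 block# 285158" line in the trace, and seg/obj 0x1c is 28 in decimal, which is the object_id to look up in dba_objects.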

Identifying the affected object and confirming the inconsistency

SQL> select object_name from dba_objects where object_id=28;
OBJECT_NAME
---------------------------------------------------------
CON$
SQL> ANALYZE TABLE sys.CON$ VALIDATE STRUCTURE CASCADE online;
ANALYZE TABLE sys.CON$ VALIDATE STRUCTURE CASCADE online
*
ERROR at line 1:
ORA-01499: table/index cross reference failure - see trace file
SQL> SET LINES 122
SQL> COL INDEX_OWNER FOR A20
SQL> COL INDEX_NAME FOR A30
SQL> COL TABLE_OWNER FOR A20
SQL> COL COLUMN_NAME FOR A25
SQL> SELECT TABLE_OWNER,INDEX_NAME,COLUMN_NAME,COLUMN_POSITION
2  FROM Dba_Ind_Columns
3  WHERE table_name = upper('&TABLE_NAME') order by TABLE_OWNER,INDEX_OWNER,INDEX_NAME,COLUMN_POSITION;
Enter value for table_name: CON$
old   3:  WHERE table_name = upper('&TABLE_NAME') order by TABLE_OWNER,INDEX_OWNER,INDEX_NAME,COLUMN_POSITION
new   3:  WHERE table_name = upper('CON$') order by TABLE_OWNER,INDEX_OWNER,INDEX_NAME,COLUMN_POSITION
TABLE_OWNER	     INDEX_NAME 		    COLUMN_NAME 	      COLUMN_POSITION
-------------------- ------------------------------ ------------------------- ---------------
SYS		     I_CON1			    OWNER#				    1
SYS		     I_CON1			    NAME				    2
SYS		     I_CON2			    CON#				    1
SQL> select owner#,name from con$
2    minus
3   select /*+ full(t) */owner#,name from con$ t;
no rows selected
SQL> select /*+ full(t) */owner#,name from con$ t
2    minus
3   select owner#,name from con$  ;
no rows selected
SQL> select /*+ full(t) */ con# from con$ t
2    minus
3   select con# from con$ ;
no rows selected
SQL> select con# from con$
2    minus
3   select /*+ full(t) */ con# from con$ t   ;
      CON#
----------
   1037224
   1037225
   1037386
   1037387
   1037388
   ……
   1037846
62 rows selected.

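The analysis above shows that CON$ and I_CON2 are inconsistent: the index contains 62 entries that are not in the table. In this situation, rebuilding the index is the obvious fix to consider.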
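
The MINUS queries above are a set difference run in both directions: one side reads the column through the index, the other is forced to full-scan the table. A minimal Python sketch of the same comparison, using hypothetical key values in place of the real CON$ contents:

```python
def compare_access_paths(via_index, via_full_scan):
    """Return keys seen only through the index and keys seen only in the table."""
    index_keys = set(via_index)
    table_keys = set(via_full_scan)
    return {
        "index_only": sorted(index_keys - table_keys),
        "table_only": sorted(table_keys - index_keys),
    }


# Hypothetical sample values, not the real CON$ contents.
full_scan = [1037000, 1037001]
index_scan = [1037000, 1037001, 1037224, 1037225]
result = compare_access_paths(index_scan, full_scan)
print(result)
```

A non-empty index_only list corresponds to the case here: the index points at rows the table no longer has.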

Attempting to rebuild the index

SQL> alter index I_CON2 rebuild online;
alter index I_CON2 rebuild online
*
ERROR at line 1:
ORA-00701: object necessary for warmstarting database cannot be altered
SQL>
SQL>
SQL>
SQL>
SQL> shutdown immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup upgrade
ORACLE instance started.
Total System Global Area 2421825536 bytes
Fixed Size                  2215744 bytes
Variable Size            1828716736 bytes
Database Buffers          570425344 bytes
Redo Buffers               20467712 bytes
Database mounted.
Database opened.
SQL> alter index I_CON2 rebuild;
alter index I_CON2 rebuild
*
ERROR at line 1:
ORA-00701: object necessary for warmstarting database cannot be altered

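Because this is a core database index, it cannot be fixed with a direct rebuild. It can only be resolved with the approach described in the post on recovering damaged bootstrap$ core indexes (I_OBJ1, I_USER1, I_FILE#_BLOCK#, I_IND1, I_TS#, I_CDEF1, etc.) and resolving ORA-00701.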

ORA-600 kdsgrp1

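ORA-600 [kdsgrp1] is a fairly common error in recovery cases involving hardware failure, power loss, redo problems and the like. Below is the official explanation of the error and how to handle it.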

ERROR:
  Format: ORA-600 [kdsgrp1]
VERSIONS:
  versions 10.1 and above
DESCRIPTION:
 This error was introduced in 10g with the fix to Bug 2442351, it provides
 for an extra health check on a block, we detected a null row header,
 see Note:2442351.9 for more information.
 Error may be caused by:
 Case 1. A row referenced in an index that does not exist in the table.
 Case 2. An non-existent rowid pointed to by a chained row.
 Trace Examples:
 Case 1. Mismatch between table and index:
====================================================
 Trace file has:
 row 02433566.13 continuation at
 file# 9 block# 210278 slot 20 not found
 The file=9 block=210278 is rdba=0x02433566 which was taken from an index:
 row#3[7549] flag: ------, lock: 0, len=85, data:(6):  02 43 35 66 00 14
 But the slot 20 does not exist in the table block:
 tab 0, row 1, @0x1e62
 tl: 2 fb: --HDFL-- lb: 0x3
 tab 0, row 12, @0x191a
 tl: 2 fb: --HDFL-- lb: 0x1
 tab 0, row 17, @0x1675
 tl: 2 fb: --HDFL-- lb: 0x2
 tab 0, row 21, @0x1459
 tl: 2 fb: --HDFL-- lb: 0x4
 ORA-1499 may be produced by analyze:
 analyze table <table name> validate structure cascade;
 Case 2. A row points to another rowid which does not exist (Chained row does not exist).
============================================================================================
 Trace file has:
 row 1186b11a.ffffffff continuation at
 file# 70 block# 441621 slot 1 not found
 It means that row with rdba 0x1186b11a continues in file# 70 block# 441621 slot 1.
 But the information in file# 70 block# 441621 slot 1 does not exist.  It is:
 tab 0, row 16, @0xd7f    ---> This is the slot with the problem.
 tl: 29 fb: -------- lb: 0x0  cc: 11
 nrid:  0x1186bd15.1      ---> It points to rdba=0x1186bd15 slot 1
(file# 70 block# 441621 slot 1) but that row does not exist in that block.
 For this case ANALYZE TABLE .. VALIDATE STRUCTURE is not detecting this logical corruption
 Reference Bug 6858313
Run an export (exp) or Full Table Scan to identify if there is a permanent invalid chained row.
 Workaround for Case 2:
 The row producing the ORA-600 [kdsgrp1] can be skipped by setting the Event 10231
 Note that a testcase has concluded that event 10231 does not skip rows in an Index Organized Table (IOT)
 when there is an invalid nrid as explained in Case 2.  It only works for regular tables.
 Neither Event 43810 (skip corrupt block in IOTs, 10.2.0.4) nor the parameter _index_scan_check_skip_corrupt (11g)
 works for Case 2 on IOTs.
FUNCTIONALITY:
  Kernel Data layer Seek/Scan
IMPACT:
  PROCESS FAILURE
  POSSIBLE PHYSICAL CORRUPTION
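
In Case 1 of the note, the six data bytes of the index entry (02 43 35 66 00 14) are the rowid the index points to: a 4-byte rdba followed by a 2-byte slot number, both big-endian. The Python sketch below unpacks them and splits the rdba into file and block, assuming the smallfile layout of 10 bits for the file and 22 bits for the block. The function name is made up for this example.

```python
import struct


def decode_index_rowid(hex_bytes: str):
    """Decode a 6-byte rowid (4-byte rdba + 2-byte slot) as dumped from an index entry."""
    raw = bytes.fromhex(hex_bytes)
    if len(raw) != 6:
        raise ValueError("expected exactly 6 bytes")
    rdba, slot = struct.unpack(">IH", raw)  # big-endian unsigned int + unsigned short
    return {
        "rdba": f"0x{rdba:08x}",
        "file": rdba >> 22,          # top 10 bits: relative file number
        "block": rdba & 0x3FFFFF,    # low 22 bits: block number
        "slot": slot,
    }


print(decode_index_rowid("02 43 35 66 00 14"))
```

The output matches the note's "file# 9 block# 210278 slot 20". The same shift and mask applied to the nrid in Case 2 (0x1186bd15) gives file 70, block 441621.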

Loss of redo and undo in a corporate group's EBS database ends in disaster

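A corporate group's EBS system ran short of disk space, so its redo and undo were placed on RAID 0, and the database had no backups at all. Eventually disaster struck: the RAID 0 failed, all redo and undo were lost, and the database could not start normally (by the time I took over, a resetlogs had already been attempted on the database, without success).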

Sun Jul 27 11:31:27 2014
SMON: enabling cache recovery
SMON: enabling tx recovery
Sun Jul 27 11:31:27 2014
Database Characterset is ZHS16GBK
Sun Jul 27 11:31:27 2014
Errors in file /prod/oracle/PROD/db/tech_st/10.2.0/admin/PROD_erpserver/bdump/prod_smon_454754.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-00376: file 42 cannot be read at this time
ORA-01110: data file 42: '/prod/oracle/PROD/logdata/undo/undo1.dbf'
Sun Jul 27 11:31:27 2014
Errors in file /prod/oracle/PROD/db/tech_st/10.2.0/admin/PROD_erpserver/bdump/prod_smon_454754.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-00376: file 42 cannot be read at this time
ORA-01110: data file 42: '/prod/oracle/PROD/logdata/undo/undo1.dbf'
Sun Jul 27 11:31:27 2014
Errors in file /prod/oracle/PROD/db/tech_st/10.2.0/admin/PROD_erpserver/bdump/prod_smon_454754.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-00376: file 42 cannot be read at this time
ORA-01110: data file 42: '/prod/oracle/PROD/logdata/undo/undo1.dbf'
Sun Jul 27 11:31:27 2014
Errors in file /prod/oracle/PROD/db/tech_st/10.2.0/admin/PROD_erpserver/udump/prod_ora_663670.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-00376: file 41 cannot be read at this time
ORA-01110: data file 41: '/prod/oracle/PROD/logdata/undo/undo2.dbf'
Error 604 happened during db open, shutting down database
USER: terminating instance due to error 604
Instance terminated by USER, pid = 663670
ORA-1092 signalled during: ALTER DATABASE OPEN...

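Querying the file status showed that the undo tablespace datafiles were lost and had been taken offline.
[Screenshot: df_status]
Because the earlier alert log had been cleared, my best guess is that the lost undo files were taken offline and the database was then opened with resetlogs. The approach now is to mask the rollback segments with _corrupted_rollback_segments and then try to start the database.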

Tue Jul 29 11:40:39 2014
SMON: enabling cache recovery
SMON: enabling tx recovery
Tue Jul 29 11:40:39 2014
Database Characterset is ZHS16GBK
Tue Jul 29 11:40:39 2014
Errors in file /prod/oracle/PROD/db/tech_st/10.2.0/admin/PROD_erpserver/bdump/prod_smon_569378.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-01555: snapshot too old: rollback segment number  with name "" too small
Tue Jul 29 11:40:39 2014
Errors in file /prod/oracle/PROD/db/tech_st/10.2.0/admin/PROD_erpserver/bdump/prod_smon_569378.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-01555: snapshot too old: rollback segment number  with name "" too small
Tue Jul 29 11:40:39 2014
Errors in file /prod/oracle/PROD/db/tech_st/10.2.0/admin/PROD_erpserver/bdump/prod_smon_569378.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-01555: snapshot too old: rollback segment number  with name "" too small
Tue Jul 29 11:40:39 2014
Errors in file /prod/oracle/PROD/db/tech_st/10.2.0/admin/PROD_erpserver/udump/prod_ora_585786.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-01555: snapshot too old: rollback segment number  with name "" too small
Error 604 happened during db open, shutting down database
USER: terminating instance due to error 604
Instance terminated by USER, pid = 585786
ORA-1092 signalled during: alter database open...

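This error occurs because database startup needs to find the corresponding rollback segment, but the segment cannot be found because the undo is damaged. The fix is to modify the database SCN so that the rollback segment is no longer looked up, which suppresses the error. After the database is open, drop the old undo tablespace and create a new one.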

Tue Jul 29 15:59:22 2014
drop tablespace undo2 including contents and datafiles
Tue Jul 29 15:59:23 2014
Errors in file /prod/oracle/PROD/db/tech_st/10.2.0/admin/PROD_erpserver/udump/prod_ora_782490.trc:
ORA-01122: database file 41 failed verification check
ORA-01110: data file 41: '/prod/oracle/PROD/logdata/undo/undo2.dbf'
ORA-01565: error in identifying file '/prod/oracle/PROD/logdata/undo/undo2.dbf'
ORA-27037: unable to obtain file status
IBM AIX RISC System/6000 Error: 2: No such file or directory
Additional information: 3
Tue Jul 29 15:59:23 2014
Errors in file /prod/oracle/PROD/db/tech_st/10.2.0/admin/PROD_erpserver/udump/prod_ora_782490.trc:
ORA-01259: unable to delete datafile /prod/oracle/PROD/logdata/undo/undo2.dbf
Tue Jul 29 15:59:23 2014
Errors in file /prod/oracle/PROD/db/tech_st/10.2.0/admin/PROD_erpserver/udump/prod_ora_782490.trc:
ORA-01122: database file 42 failed verification check
ORA-01110: data file 42: '/prod/oracle/PROD/logdata/undo/undo1.dbf'
ORA-01565: error in identifying file '/prod/oracle/PROD/logdata/undo/undo1.dbf'
ORA-27037: unable to obtain file status
IBM AIX RISC System/6000 Error: 2: No such file or directory
Additional information: 3
ORA-01259: unable to delete datafile /prod/oracle/PROD/logdata/undo/undo2.dbf
Tue Jul 29 15:59:23 2014
Errors in file /prod/oracle/PROD/db/tech_st/10.2.0/admin/PROD_erpserver/udump/prod_ora_782490.trc:
ORA-01259: unable to delete datafile /prod/oracle/PROD/logdata/undo/undo1.dbf
Tue Jul 29 15:59:23 2014
Completed: drop tablespace undo2 including contents and datafiles
Tue Jul 29 15:59:56 2014
create undo tablespace undotbs1 datafile '/prod/oracle/PROD/logdata/undo_new01.dbf' size 100M autoextend on next 128M maxsize 30G
Tue Jul 29 15:59:57 2014
Completed: create undo tablespace undotbs1 datafile '/prod/oracle/PROD/logdata/undo_new01.dbf' size 100M autoextend on next 128M maxsize 30G
Tue Jul 29 16:00:03 2014
alter tablespace undotbs1 add datafile '/prod/oracle/PROD/logdata/undo_new02.dbf' size 100M autoextend on next 128M maxsize 30G
Completed: alter tablespace undotbs1 add datafile '/prod/oracle/PROD/logdata/undo_new02.dbf' size 100M autoextend on next 128M maxsize 30G

While the business workload was running, the database reported a large number of ORA-600 4097, ORA-600 kdsgrp1 and ORA-600 kcfrbd_3 errors.

Tue Jul 29 16:07:03 2014
Errors in file /prod/oracle/PROD/db/tech_st/10.2.0/admin/PROD_erpserver/udump/prod_ora_950484.trc:
ORA-00600: internal error code, arguments: [4097], [], [], [], [], [], [], []
Tue Jul 29 16:07:06 2014
Errors in file /prod/oracle/PROD/db/tech_st/10.2.0/admin/PROD_erpserver/udump/prod_ora_950484.trc:
ORA-00600: internal error code, arguments: [kdsgrp1], [], [], [], [], [], [], []
Tue Jul 29 16:10:06 2014
Errors in file /prod/oracle/PROD/db/tech_st/10.2.0/admin/PROD_erpserver/udump/prod_ora_917702.trc:
ORA-00600: internal error code, arguments: [4097], [], [], [], [], [], [], []
Tue Jul 29 16:10:07 2014
Errors in file /prod/oracle/PROD/db/tech_st/10.2.0/admin/PROD_erpserver/udump/prod_ora_917702.trc:
ORA-00600: internal error code, arguments: [kdsgrp1], [], [], [], [], [], [], []
Tue Jul 29 16:12:45 2014
Errors in file /prod/oracle/PROD/db/tech_st/10.2.0/admin/PROD_erpserver/bdump/prod_m000_880692.trc:
ORA-00600: internal error code, arguments: [4097], [], [], [], [], [], [], []
Tue Jul 29 16:21:23 2014
Errors in file /prod/oracle/PROD/db/tech_st/10.2.0/admin/PROD_erpserver/udump/prod_ora_1040638.trc:
ORA-00600: 内部错误代码, 参数: [kcfrbd_3], [41], [231381], [1], [12800], [12800], [], []
Tue Jul 29 16:21:37 2014
Errors in file /prod/oracle/PROD/db/tech_st/10.2.0/admin/PROD_erpserver/udump/prod_ora_1040638.trc:
ORA-00600: 内部错误代码, 参数: [kcfrbd_3], [41], [231381], [1], [12800], [12800], [], []
Tue Jul 29 16:21:56 2014
Errors in file /prod/oracle/PROD/db/tech_st/10.2.0/admin/PROD_erpserver/udump/prod_ora_1040638.trc:
ORA-00600: 内部错误代码, 参数: [kcfrbd_3], [41], [231381], [1], [12800], [12800], [], []
Tue Jul 29 16:22:18 2014
Errors in file /prod/oracle/PROD/db/tech_st/10.2.0/admin/PROD_erpserver/udump/prod_ora_1040638.trc:
ORA-00600: 内部错误代码, 参数: [kcfrbd_3], [41], [231381], [1], [12800], [12800], [], []
Tue Jul 29 16:22:28 2014
Errors in file /prod/oracle/PROD/db/tech_st/10.2.0/admin/PROD_erpserver/udump/prod_ora_1105950.trc:
ORA-00600: 内部错误代码, 参数: [4097], [], [], [], [], [], [], []
Tue Jul 29 16:22:33 2014
Errors in file /prod/oracle/PROD/db/tech_st/10.2.0/admin/PROD_erpserver/udump/prod_ora_1159232.trc:
ORA-00600: 内部错误代码, 参数: [kcfrbd_3], [42], [61235], [1], [12800], [12800], [], []

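These errors have the following causes and fixes:
ORA-600 kdsgrp1: caused by a bad block (table, index, memory, CR block, etc.). Use the logs to work out why the object is damaged, identify the object, and choose a fix that suits the case (see NOTE:1332252.1).
ORA-600 4097: after the database was shut down abnormally, opened, and new rollback segments were created, a bug may be triggered (it is said to be fixed in this version, but in practice I did resolve it by following NOTE:1030620.6).
ORA-600 kcfrbd_3: when a block with a transaction is accessed, the rollback slot information is used to locate the rollback segment. The newly created rollback segments happen to have the same names and numbers as the old ones, so the lookup lands in the new datafile, which is reported as being too small (see NOTE:601798.1).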
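
The kcfrbd_3 arguments in the alert log fit that explanation. The Python sketch below reads them as file#, block#, number of blocks, and file size in blocks (twice). That reading is my assumption and is not stated in the text above. It does agree with the new undo datafiles: 100M at an 8 KB block size (also an assumption) is 12800 blocks, and the requested block 231381 is far beyond that.

```python
import re

# One of the alert log lines above (this instance logs messages in Chinese).
LINE = "ORA-00600: 内部错误代码, 参数: [kcfrbd_3], [41], [231381], [1], [12800], [12800], [], []"


def kcfrbd3_args(line):
    """Return the numeric kcfrbd_3 arguments, or None for any other error line."""
    args = re.findall(r"\[([^\]]*)\]", line)
    if not args or args[0] != "kcfrbd_3":
        return None
    file_no, block_no, nblocks, size_a, _size_b = (int(a) for a in args[1:6])
    return {"file": file_no, "block": block_no, "nblocks": nblocks, "file_blocks": size_a}


def blocks_in_file(size_mb: int, block_size: int = 8192) -> int:
    """Number of database blocks in a datafile of size_mb megabytes."""
    return size_mb * 1024 * 1024 // block_size


info = kcfrbd3_args(LINE)
print(info, blocks_in_file(100), info["block"] > info["file_blocks"])
```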
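
In the end the database was recovered and a large amount of data was rescued, but for an EBS system the damage from losing redo and undo was still severe. A friendly reminder once more: redo and undo matter, and database backups matter even more.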
