因使用OEM引起ORA-00600[12761]

alert日志报ORA-00600[12761]错

Sun Jun 10 13:52:56 2012
Errors in file e:\oracle\product\10.2.0\admin\interlib\udump\interlib_ora_19840.trc:
ORA-04030: 在尝试分配 82444 字节 (pga heap,control file i/o buffer) 时进程内存不足
ORA-00600: 内部错误代码, 参数: [12761], [], [], [], [], [], [], []
ORA-00604: 递归 SQL 级别 2 出现错误
ORA-04030: 在尝试分配 123404 字节 (QERHJ hash-joi,kllcqas:kllsltba) 时进程内存不足

数据库版本信息

Sun Jun 10 13:52:56 2012
ORACLE V10.2.0.1.0 - Production vsnsta=0
vsnsql=14 vsnxtr=3
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production
With the Partitioning, OLAP and Data Mining options
Windows NT Version V5.2 Service Pack 2
CPU                 : 8 - type 586, 2 Physical Cores
Process Affinity    : 0x00000000
Memory (Avail/Total): Ph:1263M/4095M, Ph+PgF:2716M/5976M, VA:19M/2047M
Instance name: interlib

trace信息

*** 2012-06-10 13:52:56.763
ksedmp: internal or fatal error
ORA-00600: 内部错误代码, 参数: [12761], [], [], [], [], [], [], []
ORA-00604: 递归 SQL 级别 2 出现错误
ORA-04030: 在尝试分配 123404 字节 (QERHJ hash-joi,kllcqas:kllsltba) 时进程内存不足
Current SQL statement for this session:
BEGIN EM_PING.RECORD_BATCH_HEARTBEAT(:1, :2, :3); END;
----- PL/SQL Call Stack -----
  object      line  object
  handle    number  name
0BC35C44         1  anonymous block
----- Call Stack Trace -----
calling              call     entry                argument values in hex
location             type     point                (? means dubious value)
-------------------- -------- -------------------- ----------------------------
00404686             CALLrel  00404694             0 1
0040307E             CALLrel  00404660             0
0043AB6A             CALLrel  00402CFC             3
603A816A             CALLreg  00000000             6190E2E0 3
603A8550             CALLrel  603A80D8             6190E2E0 5E340020 31D9 0
                                                   5ED6CDF8
031B7197             CALLrel  025FA21E             6190E2E0 5E340020 31D9 2 0
02C92859             CALLrel  02C92360             5ED6D2B4 5ED6D3CC 2 61BA71E4
                                                   0 5ED6CEFA
60BAD7C6             CALL???  00000000             5ED6D2B4 5ED6D3CC 3 61BA71E4
                                                   0 5ED6CEFA
60C41C40             CALLrel  60BAD758             5D9356A0 F1 3 5E344888
60C3C780             CALL???  00000000             5D9356A0 951F190 5D9356DC
60C3D1BD             CALLrel  60C3C748             5D9356A0 95A97F0 5D9356DC
60BB0392             CALLrel  60C3CEB0             5D9356A0
60B89393             CALLrel  60BB00B0             5D9356A0 1 0
02600CD9             CALLrel  0260F22C
0140AF2C             CALLrel  02600B3C             4E8EC08
013CBFEC             CALLrel  01409984             49 3 5ED6DB14
0085174B             CALLreg  00000000             5E 17 5ED6F6F8
60FEFF8D             CALLreg  00000000             5E 17 5ED6F6F8 0
00850A69             CALL???  00000000
0122134B             CALLrel  00850670             0 0
0085174B             CALLreg  00000000             3C 4 5ED6FC90
00420E53             CALLrel  00851300             3C 4 5ED6FC90 0
00421645             CALLrel  00420B20             3C 4 5ED6FC90
0040116C             CALLrel  00421618             5ED6FC84 3C 4 5ED6FC90
0040105C             CALLrel  004010FC             2 5ED6FCBC
00401900             CALLrel  00401000
7C82482C             CALLreg  00000000
--------------------- Binary Stack Dump ---------------------
--会话信息
    (session) sid: 525 trans: 00000000, creator: 7AE024D8, flag: (41) USR/- BSY/-/-/-/-/-
              DID: 0001-001B-00000004, short-term DID: 0000-0000-00000000
              txn branch: 00000000
              oct: 47, prv: 0, sql: 7A0F0A38, psql: 7A0A2430, user: 51/SYSMAN
    O/S info: user: , term: , ospid: 1234, machine: tushuguan01
              program: OMS
    client info: tushuguan01_Management_Service
    application name: OEM.SystemPool, hash value=2960518376

通过这里我们可以得到几个信息
1.数据库先发生了ORA-00600[12761],然后引发了ORA-04030
2.引发ORA-00600[12761]错误的原因是因为OEM的某种操作导致
3.未知因某种原因导致Call Stack Trace信息不完善,无法准确评估bug情况
4.查询数据库当前最大使用使用pga为250M,数据库配置pga为500M,原则上讲不是pga消耗完导致4030错误,可能是这个会话在执行某个基表的查询时候的hash-jion运算时pga不足导致.
5.查询dba_users发现EM_PING不是数据库用户,查询dba_source发现RECORD_BATCH_HEARTBEAT不是plsql名称,从这里可以看出OEM调用程序有一定特殊性

对于该问题的解决方案
1.因为OEM功能不太使用,建议直接关闭该进程,并设置为开机不自动启动
2.因为信息不完善,无法确定具体bug,但目前数据库版本为10.2.0.1,强烈建议升级到新版本

autotrace显示Statistics很多信息为0

一朋友使用autotrace查看数据库执行计划发现结果如下,Statistics中很多信息为0,这个肯定是不正常现象,什么都可以为0,consistent gets也不可能为0.

SQL> set autot on
SQL> select count(*) from RACV_DATA.PARTY_DUMMY;
  COUNT(*)
----------
        47
Elapsed: 00:00:00.00
Execution Plan
----------------------------------------------------------
Plan hash value: 3621440939
--------------------------------------------------------------------------
| Id  | Operation          | Name        | Rows  | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |             |     1 |     3   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |             |     1 |            |          |
|   2 |   TABLE ACCESS FULL| PARTY_DUMMY |    47 |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          0  consistent gets
          0  physical reads
          0  redo size
          0  bytes sent via SQL*Net to client
          0  bytes received via SQL*Net from client
          0  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

解决方法

SQL>  select owner,object_name from dba_objects where object_name='PLAN_TABLE';
no rows selected
SQL> @?/rdbms/admin/utlxplan.sql
Table created.

重新查看Statistics信息

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          7  consistent gets
          0  physical reads
          0  redo size
        522  bytes sent via SQL*Net to client
        524  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

记录一次ORA-00600[2252]故障解决

数据库alert日志报ORA-00600[2252]

Wed Jun 06 08:56:02 2012
Thread 1 cannot allocate new log, sequence 552
Checkpoint not complete
  Current log# 1 seq# 551 mem# 0: D:\ORACLE\PRODUCT\10.2.0\ORADATA\ORCL\REDO01.LOG
Thread 1 advanced to log sequence 552
  Current log# 2 seq# 552 mem# 0: D:\ORACLE\PRODUCT\10.2.0\ORADATA\ORCL\REDO02.LOG
Sun Jun 06 09:39:19 2010
Errors in file d:\oracle\product\10.2.0\admin\orcl\bdump\orcl_m000_3344.trc:
ORA-00600: 内部错误代码, 参数: [2252], [2834], [4076554712], [], [], [], [], []
Errors in file d:\oracle\product\10.2.0\admin\orcl\bdump\orcl_m000_3344.trc:
ORA-00600: 内部错误代码, 参数: [2252], [2834], [4076554712], [], [], [], [], []
Sun Jun 06 10:19:49 2010
Errors in file d:\oracle\product\10.2.0\admin\orcl\bdump\orcl_m000_4904.trc:
ORA-00600: 内部错误代码, 参数: [2252], [2834], [4076555573], [], [], [], [], []
Sun Jun 06 10:20:49 2010
Errors in file d:\oracle\product\10.2.0\admin\orcl\bdump\orcl_m000_5984.trc:
ORA-00600: 内部错误代码, 参数: [2252], [2834], [4076555594], [], [], [], [], []
Sun Jun 06 10:21:49 2010
Errors in file d:\oracle\product\10.2.0\admin\orcl\bdump\orcl_m000_4204.trc:
ORA-00600: 内部错误代码, 参数: [2252], [2834], [4076555614], [], [], [], [], []
Sun Jun 06 10:22:49 2010
Errors in file d:\oracle\product\10.2.0\admin\orcl\bdump\orcl_m000_5896.trc:
ORA-00600: 内部错误代码, 参数: [2252], [2834], [4076555634], [], [], [], [], []
Sun Jun 06 10:23:49 2010
Errors in file d:\oracle\product\10.2.0\admin\orcl\bdump\orcl_m000_4612.trc:
ORA-00600: 内部错误代码, 参数: [2252], [2834], [4076555654], [], [], [], [], []
Sun Jun 06 10:24:49 2010
Errors in file d:\oracle\product\10.2.0\admin\orcl\bdump\orcl_m000_4696.trc:
ORA-00600: 内部错误代码, 参数: [2252], [2834], [4076555676], [], [], [], [], []
Sun Jun 06 10:25:50 2010
Errors in file d:\oracle\product\10.2.0\admin\orcl\bdump\orcl_m000_5568.trc:
ORA-00600: 内部错误代码, 参数: [2252], [2834], [4076555696], [], [], [], [], []
Sun Jun 06 10:26:50 2010
Errors in file d:\oracle\product\10.2.0\admin\orcl\bdump\orcl_m000_5776.trc:
ORA-00600: 内部错误代码, 参数: [2252], [2834], [4076555716], [], [], [], [], []
--启动数据库
Mon Jun 07 09:18:39 2010
Starting ORACLE instance (normal)
LICENSE_MAX_SESSION = 0
LICENSE_SESSIONS_WARNING = 0
Mon Jun 07 09:18:49 2010
Picked latch-free SCN scheme 2
Using LOG_ARCHIVE_DEST_10 parameter default value as USE_DB_RECOVERY_FILE_DEST
Autotune of undo retention is turned on.
IMODE=BR
ILAT =18
LICENSE_MAX_USERS = 0
SYS auditing is disabled
ksdpec: called for event 13740 prior to event group initialization
Starting up ORACLE RDBMS Version: 10.2.0.1.0.
System parameters with non-default values:
  processes                = 150
  __shared_pool_size       = 100663296
  __large_pool_size        = 12582912
  __java_pool_size         = 4194304
  __streams_pool_size      = 0
  spfile                   = D:\ORACLE\PRODUCT\10.2.0\DB_1\DBS\SPFILEORCL.ORA
  nls_language             = SIMPLIFIED CHINESE
  nls_territory            = CHINA
  sga_target               = 452984832
  control_files            = D:\ORACLE\PRODUCT\10.2.0\ORADATA\ORCL\CONTROL01.CTL
  db_block_size            = 8192
  __db_cache_size          = 327155712
  compatible               = 10.2.0.1.0
  db_file_multiblock_read_count= 16
  db_recovery_file_dest    = D:\oracle\product\10.2.0/flash_recovery_area
  db_recovery_file_dest_size= 2147483648
  undo_management          = AUTO
  undo_tablespace          = UNDOTBS1
  remote_login_passwordfile= EXCLUSIVE
  db_domain                =
  dispatchers              = (protocol=TCP)
  shared_servers           = 1
  job_queue_processes      = 10
  audit_file_dest          = D:\ORACLE\PRODUCT\10.2.0\ADMIN\ORCL\ADUMP
  background_dump_dest     = D:\ORACLE\PRODUCT\10.2.0\ADMIN\ORCL\BDUMP
  user_dump_dest           = D:\ORACLE\PRODUCT\10.2.0\ADMIN\ORCL\UDUMP
  core_dump_dest           = D:\ORACLE\PRODUCT\10.2.0\ADMIN\ORCL\CDUMP
  db_name                  = orcl
  open_cursors             = 300
  pga_aggregate_target     = 149946368
PSP0 started with pid=3, OS id=3028
MMAN started with pid=4, OS id=3528
PMON started with pid=2, OS id=2772
DBW0 started with pid=5, OS id=816
CKPT started with pid=7, OS id=3372
SMON started with pid=8, OS id=2584
RECO started with pid=9, OS id=3976
CJQ0 started with pid=10, OS id=1912
MMON started with pid=11, OS id=624
Mon Jun 07 09:19:00 2010
starting up 1 dispatcher(s) for network address '(ADDRESS=(PARTIAL=YES)(PROTOCOL=TCP))'...
MMNL started with pid=12, OS id=2696
Mon Jun 07 09:19:00 2010
starting up 1 shared server(s) ...
LGWR started with pid=6, OS id=3128
Mon Jun 07 09:19:04 2010
alter database mount exclusive
Mon Jun 07 09:19:09 2010
Setting recovery target incarnation to 2
Mon Jun 07 09:19:10 2010
Successful mount of redo thread 1, with mount id 1248834568
Mon Jun 07 09:19:10 2010
Database mounted in Exclusive Mode
Completed: alter database mount exclusive
Mon Jun 07 09:19:10 2010
alter database open
Mon Jun 07 09:19:15 2010
Beginning crash recovery of 1 threads
 parallel recovery started with 2 processes
Mon Jun 07 09:19:18 2010
Started redo scan
Mon Jun 07 09:19:19 2010
Completed redo scan
 13 redo blocks read, 7 data blocks need recovery
Mon Jun 07 09:19:20 2010
Started redo application at
 Thread 1: logseq 552, block 28631
Mon Jun 07 09:19:20 2010
Recovery of Online Redo Log: Thread 1 Group 2 Seq 552 Reading mem 0
  Mem# 0 errs 0: D:\ORACLE\PRODUCT\10.2.0\ORADATA\ORCL\REDO02.LOG
Mon Jun 07 09:19:20 2010
Completed redo application
Mon Jun 07 09:19:20 2010
Completed crash recovery at
 Thread 1: logseq 552, block 28644, scn 12176013920948
 7 data blocks read, 7 data blocks written, 13 redo blocks read
Mon Jun 07 09:19:28 2010
Errors in file d:\oracle\product\10.2.0\admin\orcl\udump\orcl_ora_2688.trc:
ORA-00600: 内部错误代码, 参数: [2252], [2834], [4076604085], [], [], [], [], []

通过这些日志可以看出数据库一直在报ORA-00600[2252],在后来因某种原因数据库异常重启后启动不了.仔细观察可以发现系统显示的时间是2010年6月7日和当前时间相差了整整两年.
At any point in time, the Oracle Database calculates a “not to exceed” limit for the number of SCNs a database can have used, based on the number of seconds elapsed since 1988, multiplied by 16,384. This is known as the database’s current maximum SCN limit. Doing this ensures that Oracle Databases will ration SCNs over time, allowing over 500 years of data processing for any Oracle Database.

错误原因
根据错误提示计算scn(2834为现在系统的SCN WRAP,4076604085就是BASE)=2834*2^32+4076604085=12176013920949
根据数据库日志显示系统时间计算最大scn值:

SQL>select to_number( ((to_date('20100607 09:19:28','yyyymmdd hh24:mi:ss')-
  2 to_date('19880101','yyyymmdd'))*24*3600*16*1024),'999999999999999999') max_scn from dual;
   MAX_SCN
--------------
11598377254912

通过这里的计算可以知道数据库当前的SCN大于系统时间点上允许的最大时间的SCN,从而出现ORA-00600[2252]错误.

解决方法
知道了数据库报该错误的原因,那么解决该问题很简单,修改系统时间到正确的时间点即可

SQL> select to_number(((sysdate-to_date('19880101','yyyymmdd'))*24*3600*16*1024),
   2 '999999999999999999') max_scn from dual;
   MAX_SCN
--------------
12634899464192

该SCN大于数据库当前SCN所有数据库不会报ORA-00600[2252]错误可以正常启动.

记录另一起ORA-00600[13013]处理

发现ORA-00600[13013]错误
During the execution of an UPDATE statement, after several attempts (Arg [a] passcount) we are unable to get a stable set of rows that conform to the WHERE clause.

Fri Jun  1 03:00:33 2012
Errors in file /opt/oracle/admin/oraapp/bdump/oraapp_m000_12104.trc:
ORA-00600: internal error code, arguments: [13013], [5001], [8943], [12596577], [25], [12596578], [17], []
Sat Jun  2 03:01:05 2012
Errors in file /opt/oracle/admin/oraapp/bdump/oraapp_m000_1052.trc:
ORA-00600: internal error code, arguments: [13013], [5001], [8943], [12596577], [25], [12596578], [17], []
Sun Jun  3 15:00:50 2012
Errors in file /opt/oracle/admin/oraapp/bdump/oraapp_m000_13876.trc:
ORA-00600: internal error code, arguments: [13013], [5001], [8943], [12596577], [25], [12603219], [17], []
Mon Jun  4 03:01:05 2012
Errors in file /opt/oracle/admin/oraapp/bdump/oraapp_m000_7704.trc:
ORA-00600: internal error code, arguments: [13013], [5001], [8943], [12596577], [25], [12596578], [17], []
Tue Jun  5 03:00:35 2012
Errors in file /opt/oracle/admin/oraapp/bdump/oraapp_m000_27983.trc:
ORA-00600: internal error code, arguments: [13013], [5001], [8943], [12596577], [25], [12596578], [17], []
Wed Jun  6 03:01:07 2012
Errors in file /opt/oracle/admin/oraapp/bdump/oraapp_m000_19204.trc:
ORA-00600: internal error code, arguments: [13013], [5001], [8943], [12596577], [25], [12596578], [17], []
Thu Jun  7 03:00:37 2012
Errors in file /opt/oracle/admin/oraapp/bdump/oraapp_m000_7273.trc:
ORA-00600: internal error code, arguments: [13013], [5001], [8943], [12596577], [25], [12605556], [17], []

以前处理过一次ORA-600[13013],里面包含了各参数含义,这次也按照常规方法处理,分析如下:
1.通过trace文件找出对应表

*** 2012-06-01 03:00:33.325
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [13013], [5001], [8943], [12596577], [25], [12596578], [17], []
Current SQL statement for this session:
UPDATE WRH$_SQL_BIND_METADATA SET snap_id = :lah_snap_id  WHERE dbid = :dbid    AND (SQL_ID)
IN (SELECT STR1_KEWRATTR FROM X$KEWRATTRSTALE)

2.通过ORA-600[13013]中表示rdba参数找出表

SQL> select  DBMS_UTILITY.data_block_address_file (12596577) "file#",
  2  DBMS_UTILITY.data_block_address_block (12596577) "block#"
  3  from dual;
     file#     block#
---------- ----------
         3      13665
SQL> select * from dba_extents where 13665 between block_id and block_id + blocks and file_id=3;
OWNER      SEGMENT_NAME           SEGMENT_TYPE     EXTENT_ID    FILE_ID   BLOCK_ID    BLOCKS
---------- --------------------  --------------- --------------- ---------- -------  ------
SYS        SYS_LOB0000008933C00  LOBSEGMENT           7          3      13657         8
SYS        WRH$_SQL_BIND_METADA  TABLE                1          3      13665         8

检查对象WRH$_SQL_BIND_METADA是否有坏块或者表和index不一致

SQL> analyze table SYS.WRH$_SQL_BIND_METADATA validate structure cascade online;
Table analyzed.

这里分析WRH$_SQL_BIND_METADA表正常,但是通过上面的查询证明WRH$_SQL_BIND_METADA的第一个extent的第一个数据块上可能出现问题,使得analyze未检查(自己猜猜,未做深入验证).针对这个问题,直接备份WRH$_SQL_BIND_METADATA表,truncate掉该表,然后重新插入数据(注意操作时间避开awr插入数据时间段)

create table SQL_BIND_METADATA_BAK
AS
SELECT * FROM SYS.WRH$_SQL_BIND_METADATA;
TRUNCATE TABLE SYS.WRH$_SQL_BIND_METADATA;
INSERT INTO SYS.WRH$_SQL_BIND_METADATA
SELECT * FROM SQL_BIND_METADATA_BAK;
DROP TABBLE SQL_BIND_METADATA_BAK PURGE;

ORA-600[6749] 发生在 SYSMAN.MGMT_METRICS_RAW表

数据库alert日志长时间出现ORA-00600[6749]错误
日志报错如下

Fri Jun  1 12:01:30 2012
Errors in file /opt/oracle/admin/oraapp/bdump/oraapp_j000_396.trc:
ORA-00600: internal error code, arguments: [6749], [3], [12596882], [49], [], [], [], []
Fri Jun  1 12:01:34 2012
Errors in file /opt/oracle/admin/oraapp/bdump/oraapp_j000_396.trc:
Fri Jun  1 13:01:06 2012
Errors in file /opt/oracle/admin/oraapp/bdump/oraapp_j000_13226.trc:
ORA-00600: internal error code, arguments: [6749], [3], [12596882], [49], [], [], [], []
Fri Jun  1 13:01:10 2012
Errors in file /opt/oracle/admin/oraapp/bdump/oraapp_j000_13226.trc:
Fri Jun  1 14:01:46 2012
Errors in file /opt/oracle/admin/oraapp/bdump/oraapp_j000_26691.trc:
ORA-00600: internal error code, arguments: [6749], [3], [12596882], [49], [], [], [], []
Fri Jun  1 14:01:51 2012
Errors in file /opt/oracle/admin/oraapp/bdump/oraapp_j000_26691.trc:
Fri Jun  1 15:01:21 2012
Errors in file /opt/oracle/admin/oraapp/bdump/oraapp_j000_7119.trc:
ORA-00600: internal error code, arguments: [6749], [3], [12596882], [49], [], [], [], []

查看trace日志

Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - 64bit Production
With the Partitioning, OLAP and Data Mining options
ORACLE_HOME = /opt/oracle/product/10/oraapp
System name:	Linux
Node name:	oracle2
Release:	2.6.18-92.el5
Version:	#1 SMP Tue Apr 29 13:16:15 EDT 2008
Machine:	x86_64
Instance name: oraapp
Redo thread mounted by this instance: 1
Oracle process number: 44
Unix process pid: 26691, image: oracle@oracle2 (J000)
*** ACTION NAME:(target 5) 2012-06-01 14:01:00.298
*** MODULE NAME:(Oracle Enterprise Manager.rollup) 2012-06-01 14:01:00.298
*** SERVICE NAME:(SYS$USERS) 2012-06-01 14:01:00.298
*** SESSION ID:(406.24103) 2012-06-01 14:01:00.298
Dumping current redo log in thread 1
DUMP OF REDO FROM FILE '/opt/oracle/oradata/oraapp/systable/redo03.log'
 Opcodes 11.*
 DBAs (file#, block#):
      (3, 13970)
 RBAs: 0x000000.00000000.0000 thru 0xffffffff.ffffffff.ffff
 SCNs: scn: 0x0000.00000000 thru scn: 0xffff.ffffffff
 Times: creation thru eternity
 FILE HEADER:
	Compatibility Vsn = 169869568=0xa200100
	Db ID=1462349529=0x5729aed9, Db Name='ORAAPP'
	Activation ID=1462334681=0x572974d9
	Control Seq=2614156=0x27e38c, File size=245760=0x3c000
	File Number=3, Blksiz=512, File Type=2 LOG
 descrip:"Thread 0001, Seq# 0000003963, SCN 0x0000129fc9df-0xffffffffffff"

猜测ORA-600[6749]部分参数

SQL> select  DBMS_UTILITY.data_block_address_file (12596882) "file#",
  2  DBMS_UTILITY.data_block_address_block (12596882) "block#"
  3  from dual;
     file#     block#
---------- ----------
         3      13970

ORA-600[6749][a][b]{c}
这里证明c表示rdba

根据dba查询对象

SQL> select * from dba_extents where 13970 between block_id and block_id + blocks and file_id=3;
OWNER      SEGMENT_NAME                     SEGMENT_TYPE
---------- ------------------------------- -------------------
SYSMAN        SYS_IOT_OVER_10448                 TABLE
SQL> select owner,iot_name from dba_tables where table_name = 'SYS_IOT_OVER_10448';
OWNER                          IOT_NAME
------------------------------ ------------------------------
SYSMAN                         MGMT_METRICS_RAW
SQL>  ANALYZE TABLE SYSMAN.MGMT_METRICS_RAW  VALIDATE STRUCTURE CASCADE;
Table analyzed.

按照常理ORA-00600[6749]错误是因为坏块或者表和索引数据不一致导致,通过ANALYZE可以检查出来.这里显示正常,那可能是其他原因导致,查询MOS果然发现是ORA-600 [6749] Occurring on SYSMAN.MGMT_METRICS_RAW [ID 467439.1]

解决方法

The following workaround may resolve the problem temporarily:
1. Ensure you have a good backup before proceeding.
2. Create a copy of the SYSMAN.MGMT_METRICS_RAW table:
SQL> create table SYSMAN.MGMT_METRICS_RAW_COPY
as select * from SYSMAN.MGMT_METRICS_RAW;
3. Truncate the table:
SQL> truncate table SYSMAN.MGMT_METRICS_RAW;
May need  to disable trigger: "sysman.raw_metrics_after_insert" before proceeding.
Re-enable after the insert.
4. Re-insert the rows:
SQL> insert into SYSMAN.MGMT_METRICS_RAW
select * from SYSMAN.MGMT_METRICS_RAW_COPY;
SQL> commit;
5. Drop the copy table:
SQL> drop table SYSMAN.MGMT_METRICS_RAW_COPY;

DEFERRED_SEGMENT_CREATION 参数相关说明

DEFERRED_SEGMENT_CREATION specifies the semantics of deferred segment creation. If set to true, then segments for tables and their dependent objects (LOBs, indexes) will not be created until the first row is inserted into the table.
这句话的意思是 DEFERRED_SEGMENT_CREATION 参数的作用是:创建表的时候延迟创建这个表相关的segment(包括lobs,indexes),直到第一次插入数据的时候才创建segment.补充说明:DEFERRED_SEGMENT_CREATION 参数从11.2.0.1引进,默认值为true;如果要使其恢复老版本功能,设置该参数为false.

DEFERRED_SEGMENT_CREATION默认值

SQL> select * from v$version;
BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - Production
PL/SQL Release 11.2.0.3.0 - Production
CORE    11.2.0.3.0      Production
TNS for Linux: Version 11.2.0.3.0 - Production
NLSRTL Version 11.2.0.3.0 - Production
SQL> select to_char(sysdate,'yyyy-mm-dd hh24:mi:ss') "WWW.XIFENFEI.COM" FROM DUAL;
WWW.XIFENFEI.COM
--------------------------------------
2012-06-01 05:31:03
SQL> show parameter DEFERRED_SEGMENT_CREATION;
NAME                                 TYPE       VALUE
------------------------------------ ---------- --------
deferred_segment_creation            boolean    TRUE

DEFERRED_SEGMENT_CREATION效果验证

SQL> create table t_xifenfei (id number,name varchar2(30));
Table created.
SQL> create index ind_t_xifenfei on t_xifenfei(id);
Index created.
SQL> select segment_name,segment_type from dba_segments where
  2   segment_name in('T_XIFENFEI','IND_T_XIFENFEI') AND OWNER='CHF';
no rows selected
--未创建segment
SQL> INSERT INTO T_XIFENFEI VALUES(1,'WWW.XIFENFEI.COM');
1 row created.
SQL> commit;
Commit complete.
SQL> select segment_name,segment_type from dba_segments where
  2  segment_name in('T_XIFENFEI','IND_T_XIFENFEI') AND OWNER='CHF';
SEGMENT_NAME         SEGMENT_TYPE
-------------------- ------------------------------------
IND_T_XIFENFEI       INDEX
T_XIFENFEI           TABLE
--创建segment
SQL> alter session set deferred_segment_creation=false;
Session altered.
SQL> create table t_xifenfei_2 (id number,name varchar2(30));
Table created.
SQL> select segment_name,segment_type from dba_segments where segment_name='T_XIFENFEI_2';
SEGMENT_NAME         SEGMENT_TYPE
-------------------- ------------------------------------
T_XIFENFEI_2         TABLE
--创建segment

问题1(朋友疑惑为什么它没有给相关表空间分配配额但是创建表成功)

SQL> create user xifenfei identified by xifenfei default tablespace users;
User created.
SQL> grant connect,resource to xifenfei;
Grant succeeded.
SQL> revoke unlimited tablespace from xifenfei;
Revoke succeeded.
SQL> alter user xifenfei quota unlimited on users;
User altered.
SQL> conn xifenfei/xifenfei
Connected.
SQL> create table t_xifenfei (id number,name varchar2(30)) tablespace system;
Table created.
--在system表空间无配额,但是创建表成功
SQL> insert into t_xifenfei values(1,'www.xifenfei.com');
insert into t_xifenfei values(1,'www.xifenfei.com')
            *
ERROR at line 1:
ORA-01950: no privileges on tablespace 'SYSTEM'
--插入数据库失败,因为在system上创建segment失败
SQL> alter session set deferred_segment_creation=false;
Session altered.
SQL> create table t_xifenfei_2 (id number,name varchar2(30)) tablespace system;
create table t_xifenfei_2 (id number,name varchar2(30)) tablespace system
*
ERROR at line 1:
ORA-01950: no privileges on tablespace 'SYSTEM'
--deferred_segment_creation设置为false后,创建表直接失败

问题2(exp未导segment不存在表)
该问题帮朋友解决过.因为暂时无11.2.0.1版本数据库,直接摘录MOS

In 11.2 the deferred storage segment feature is enabled by default.
Conventional export (exp) silently skips tables with deferred segment
creation if no segment has yet been created. ie: If the table does
not yet contain any rows. In some cases "exp" will report EXP-11 for
the table.
eg:
 create table t(c1 int) tablespace sysaux;
 select segment_created from user_tables where table_name='T';
  SEG
  ---
  NO
 Table level export:
   exp scott/tiger file=/tmp/scott.dmp tables=t
   ^
   EXP-11 SCOTT.T does not exist
 Schema level export:
   exp scott/tiger file=/tmp/scott.dmp owner=scott statistics=none
   ^
   Export completes successfully but silently does not export
   table "T".
Rediscovery Notes:
  Tables that may be affected by this can be found thus:
    select owner, table_name from dba_tables
     where segment_created='NO';
  EXP-11 on export for tables with no data.
  Tables missing after exp/imp
Workaround
  Re-create the missing table at the export site from DDL.
  (the table did not contain rows otherwise it would have
   had a segment created for it)
In 11.2 the deferred storage segment feature is enabled by default.
Conventional export (exp) silently skips tables with deferred segment
creation if no segment has yet been created. ie: If the table does
not yet contain any rows. In some cases "exp" will report EXP-11 for
the table.
eg:
 create table t(c1 int) tablespace sysaux;
 select segment_created from user_tables where table_name='T';
  SEG
  ---
  NO
 Table level export:
   exp scott/tiger file=/tmp/scott.dmp tables=t
   ^
   EXP-11 SCOTT.T does not exist
 Schema level export:
   exp scott/tiger file=/tmp/scott.dmp owner=scott statistics=none
   ^
   Export completes successfully but silently does not export
   table "T".
Rediscovery Notes:
  Tables that may be affected by this can be found thus:
    select owner, table_name from dba_tables
     where segment_created='NO';
  EXP-11 on export for tables with no data.
  Tables missing after exp/imp
Workaround
  Re-create the missing table at the export site from DDL.
  (the table did not contain rows otherwise it would have
   had a segment created for it)
This issue is fixed in
•12.1 (Future Release)
•11.2.0.2 (Server Patch Set)

动态修改PGA_AGGREGATE_TARGET 导致ORA-600[723]

在以前分析过ORA-600[729](SGA内存泄露),这次遇到ORA-600[723](PGA内存泄露)
操作系统数据库信息

ORACLE V9.2.0.3.0 - Production vsnsta=0
vsnsql=12 vsnxtr=3
Windows 2000 Version 5.2 Service Pack 2, CPU type 586
Oracle9i Enterprise Edition Release 9.2.0.3.0 - Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.3.0 - Production
Windows 2000 Version 5.2 Service Pack 2, CPU type 586
Instance name: dsdata

alert报错ORA-600[723]

Tue Jun 05 12:16:35 2012
Shutting down instance: further logons disabled
Shutting down instance (immediate)
License high water mark = 274
Tue Jun 05 12:16:40 2012
alter database close normal
Tue Jun 05 12:16:40 2012
SMON: disabling tx recovery
SMON: disabling cache recovery
Tue Jun 05 12:16:40 2012
Shutting down archive processes
Archiving is disabled
Tue Jun 05 12:16:40 2012
ARCH shutting down
Tue Jun 05 12:16:40 2012
ARCH shutting down
ARC1: Archival stopped
Tue Jun 05 12:16:40 2012
ARC0: Archival stopped
Tue Jun 05 12:16:40 2012
Thread 1 closed at log sequence 406
Successful close of redo thread 1.
Tue Jun 05 12:16:41 2012
Completed: alter database close normal
Tue Jun 05 12:16:41 2012
alter database dismount
Completed: alter database dismount
ARCH: Archiving is disabled
Shutting down archive processes
Archiving is disabled
Archive process shutdown avoided: 0 active
ARCH: Archiving is disabled
Shutting down archive processes
Archiving is disabled
Archive process shutdown avoided: 0 active
Tue Jun 05 12:16:43 2012
Errors in file d:\oracle\admin\dsdata\udump\dsdata_ora_504.trc:
ORA-00600: internal error code, arguments: [723], [20664], [20664], [memory leak], [], [], [], []

通过alert日志可以知道,数据库shutdown immediate的时候报ORA-600[723]

分析trace文件

…………
EXTENT 147 addr=062ACCBC
  Chunk  62accc4 sz=     1252    free      "               "
  Chunk  62ad1a8 sz=     2060    freeable  "qesmmaLogInitia"
  Chunk  62ad9b4 sz=     2060    freeable  "qesmmaLogInitia"
  Chunk  62ae1c0 sz=     2060    freeable  "qesmmaLogInitia"
  Chunk  62ae9cc sz=     2060    freeable  "qesmmaLogInitia"
…………
EXTENT 153 addr=04232414
  Chunk  423241c sz=     4476    perm      "perm           "  alo=2868
  Chunk  4233598 sz=    18516    free      "               "
  Chunk  4237dec sz=     2060    freeable  "qesmmaLogInitia"
  Chunk  42385f8 sz=     2060    freeable  "qesmmaLogInitia"
  Chunk  4238e04 sz=     2060    freeable  "qesmmaLogInitia"
  Chunk  4239610 sz=     2060    freeable  "qesmmaLogInitia"
  Chunk  4239e1c sz=     2060    freeable  "qesmmaLogInitia"
…………
--查询发现没有释放的内容都是在qesmmaLogInitia部分
*** 2012-06-05 12:16:43.000
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [723], [20664], [20664], [memory leak], [], [], [], []
Current SQL information unavailable - no SGA.
----- Call Stack Trace -----
calling              call     entry                argument values in hex
location             type     point                (? means dubious value)
-------------------- -------- -------------------- ----------------------------
_ksedmp+147          CALLrel  _ksedst+0
_ksfdmp.108+e        CALLrel  _ksedmp+0            3
_kgeriv+89           CALLreg  00000000             217190 3
_kgesiv+4e           CALLrel  _kgeriv+0            217190 0 2D3 3 418FC2C
_ksesic3+3b          CALLrel  _kgesiv+0            217190 0 2D3 3 418FC2C 2D3 3
                                                   418FC2C
__VInfreq__ksmdpg+e  CALLrel  _ksesic3+0           2D3 0 50B8 0 50B8 1 B 26A3F28
f
_opidcl+1db          CALLrel  _ksmdpg+0
_opidrv+3bf          CALLrel  _opidcl+0            21D328 0
_sou2o+19            CALLrel  _opidrv+0
_opimai+150          CALLrel  _sou2o+0             418FE20 32 0 0
_BackgroundThreadSt  CALLrel  _opimai+0
art@4+164
77E6482C             CALLreg  00000000
--------------------- Binary Stack Dump ---------------------

通过查询MOS发现[ID 242260.1]上的Stack Trace比较匹配.上面说到通过sql来直接修改pga_aggregate_target导致,查找alert日志,果然发现:

Mon May 21 15:18:33 2012
ALTER SYSTEM SET pga_aggregate_target='1048576000' SCOPE=MEMORY;
Mon May 21 15:18:33 2012
ALTER SYSTEM SET pga_aggregate_target='1048576000' SCOPE=SPFILE;

现在基本上可以确定引起整个ORA-600[723]的原因是:用户直接修改pga_aggregate_target参数,然后关闭数据库引起Bug:2975617导致

处理建议
Don’t alter the pga_aggregate_target dynamically Change it in init.ora file
针对本库,再次开启数据库应该处于正常状态(spfile已经修改),无需继续关注该问题.

找出dbv相关dba值在数据文件中对应位置

一个朋友数据库因异常断电,数据库不能正常启动,使用dbv检测错误如下:

C:\Users\XIFENFEI\Downloads>dbv file=users01.dbf end=5
DBVERIFY: Release 11.2.0.3.0 - Production on 星期二 6月 5 18:17:27 2012
Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
DBVERIFY - 开始验证: FILE = C:\USERS\XIFENFEI\DOWNLOADS\USERS01.DBF
页 1 标记为损坏
Corrupt block relative dba: 0x00000001 (file 0, block 1)
Bad header found during dbv:
Data in bad block:
 type: 11 format: 2 rdba: 0x00000000
 last change scn: 0x0000.00000000 seq: 0x1 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x00000b01
 check value in block header: 0x2f1f
 computed block checksum: 0x0
页 2 标记为损坏
Corrupt block relative dba: 0x00000002 (file 0, block 2)
Bad header found during dbv:
Data in bad block:
 type: 29 format: 2 rdba: 0x01000002
 last change scn: 0x0000.0018c7fa seq: 0x2 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0xc7fa1d02
 check value in block header: 0x82ca
 computed block checksum: 0x0
页 3 标记为损坏
Corrupt block relative dba: 0x00000003 (file 0, block 3)
Bad header found during dbv:
Data in bad block:
 type: 30 format: 2 rdba: 0x01000003
 last change scn: 0x0000.0018c7fa seq: 0x1 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0xc7fa1e01
 check value in block header: 0x32c9
 computed block checksum: 0x0
页 4 标记为损坏
Corrupt block relative dba: 0x00000004 (file 0, block 4)
Bad header found during dbv:
Data in bad block:
 type: 30 format: 2 rdba: 0x01000004
 last change scn: 0x0000.00004adc seq: 0x1 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x4adc1e01
 check value in block header: 0x8199
 computed block checksum: 0x0
页 5 标记为损坏
Corrupt block relative dba: 0x00000005 (file 0, block 5)
Bad header found during dbv:
Data in bad block:
 type: 30 format: 2 rdba: 0x01000005
 last change scn: 0x0000.00004ade seq: 0x1 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x4ade1e01
 check value in block header: 0xc190
 computed block checksum: 0x0
DBVERIFY - 验证完成
检查的页总数: 5
处理的页总数 (数据): 0
失败的页总数 (数据): 0
处理的页总数 (索引): 0
失败的页总数 (索引): 0
处理的页总数 (其他): 0
处理的总页数 (段)  : 0
失败的总页数 (段)  : 0
空的页总数: 0
标记为损坏的总页数: 5
流入的页总数: 0
加密的总页数        : 0
最高块 SCN            : 0 (0.0)

对于这样的错误,特别是Corrupt block relative dba出现奇怪的提示(file 0),我第一反应就是数据文件header出现了问题.在eygle的耐心帮忙和提示下,使用bbed重现了该错误,并且找出了dbv中两个dba(Corrupt block relative dba和rdba)和bbed中相对应的值.通过实验重现相关结果.

dbv检查无坏块

[oracle@xifenfei tmp]$ dbv file=/u01/oracle/oradata/ora11g/xifenfei02.dbf blocksize=8192
DBVERIFY: Release 11.2.0.3.0 - Production on Fri Jun 1 03:34:46 2012
Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
DBVERIFY - Verification starting : FILE = /u01/oracle/oradata/ora11g/xifenfei02.dbf
Block Checking: DBA = 25179275, Block Type = KTB-managed data block
data header at 0xb526707c
kdbchk: row locked by non-existent transaction
        table=0   slot=0
        lockid=2   ktbbhitc=3
Page 13451 failed with check code 6101
Block Checking: DBA = 25179287, Block Type = KTB-managed data block
data header at 0xb527f064
kdbchk: row locked by non-existent transaction
        table=0   slot=0
        lockid=2   ktbbhitc=2
Page 13463 failed with check code 6101
DBVERIFY - Verification complete
Total Pages Examined         : 15360
Total Pages Processed (Data) : 12932
Total Pages Failing   (Data) : 2
Total Pages Processed (Index): 0
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 291
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 2137
Total Pages Marked Corrupt   : 0
Total Pages Influx           : 0
Total Pages Encrypted        : 0
Highest block SCN            : 1551567 (0.1551567)

修改数据块rdba_kcbh

BBED> set block 150
        BLOCK#          150
BBED> map
 File: /u01/oracle/oradata/ora11g/xifenfei02.dbf (1)
 Block: 150                                   Dba:0x00400096
------------------------------------------------------------
 KTB Data Block (Table/Cluster)
 struct kcbh, 20 bytes                      @0
 struct ktbbh, 96 bytes                     @20
 struct kdbh, 14 bytes                      @124
 struct kdbt[1], 4 bytes                    @138
 sb2 kdbr[71]                               @142
 ub1 freespace[910]                         @284
 ub1 rowdata[6994]                          @1194
 ub4 tailchk                                @8188
BBED> p kcbh
struct kcbh, 20 bytes                       @0
   ub1 type_kcbh                            @0        0x06
   ub1 frmt_kcbh                            @1        0xa2
   ub1 spare1_kcbh                          @2        0x00
   ub1 spare2_kcbh                          @3        0x00
   ub4 rdba_kcbh                            @4        0x01800096
   ub4 bas_kcbh                             @8        0x00131e6f
   ub2 wrp_kcbh                             @12       0x0000
   ub1 seq_kcbh                             @14       0x01
   ub1 flg_kcbh                             @15       0x04 (KCBHFCKV)
   ub2 chkval_kcbh                          @16       0x4aae
   ub2 spare3_kcbh                          @18       0x0000
BBED> m /x 00000000 offset 4
 File: /u01/oracle/oradata/ora11g/xifenfei02.dbf (1)
 Block: 150              Offsets:    4 to  515           Dba:0x00400096
------------------------------------------------------------------------
 00000000 6f1e1300 00000104 ae4a0000 01000000 01280100 6f1e1300 00000000
 03003200 90008001 ffff0000 00000000 00000000 00000000 00800000 6c781200
 09001d00 05030000 8c02c000 b4000500 00800000 b0191300 02000900 fc020000
 b011c000 0b011400 00800000 6c1e1300 00000000 00000000 00014700 ffffa000
 2e048e03 8e030000 4700e90b 540cb30c 120d710d d00d2f0e 8e0eed0e 4c0fab0f
 0a106910 c8102711 8611e711 4812aa12 0c136713 c2131f14 7c14df14 4115a015
 ff156416 c9162517 8117de17 3b18a118 07196119 bb19191a 771ad61a 351b941b
 f31b521c b11c101d 6f1dd71d 3f1eaa1e 151f2e04 99040405 6f05da05 4506b006
 1b078607 f1075c08 c7082c09 9109f609 5b0ac00a 250b870b 48004900 4a004b00
 4c004d00 4e004f00 5000ffff 00000000 00000000 00000000 00000000 00000000
 00000000 00000000 00000000 00000000 2c000e03 53595308 565f245f 4c4f434b
 ff03c20f 5eff0456 49455707 786f0912 12230407 786f0912 12230413 32303131
 2d30392d 31383a31 373a3334 3a303305 56414c49 44014e01 4e014e02 c1022c00
 0e065055 424c4943 0756245f 4c4f434b ff03c20f 5fff0753 594e4f4e 594d0778
 6f091212 23040778 6f091212 23041332 3031312d 30392d31 383a3137 3a33343a
 30330556 414c4944 014e014e 014e02c1 022c000e 03535953 07565f24 4c4f434b
 <32 bytes per line>
BBED> sum apply
Check value for File 1, Block 150:
current = 0x4bb8, required = 0x4bb8

dbv检测数据文件
发现提示坏块文件的rdba就是我们刚刚修改的rdba_kcbh值

[oracle@xifenfei tmp]$ dbv file=/u01/oracle/oradata/ora11g/xifenfei02.dbf blocksize=8192
DBVERIFY: Release 11.2.0.3.0 - Production on Fri Jun 1 03:40:44 2012
Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
DBVERIFY - Verification starting : FILE = /u01/oracle/oradata/ora11g/xifenfei02.dbf
Page 150 is marked corrupt
Corrupt block relative dba: 0x01800096 (file 6, block 150)
Bad header found during dbv:
Data in bad block:
 type: 6 format: 2 rdba: 0x00000000
 last change scn: 0x0000.00131e6f seq: 0x1 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x1e6f0601
 check value in block header: 0x4bb8
 computed block checksum: 0x0
Block Checking: DBA = 25179275, Block Type = KTB-managed data block
data header at 0xb52a807c
kdbchk: row locked by non-existent transaction
        table=0   slot=0
        lockid=2   ktbbhitc=3
Page 13451 failed with check code 6101
Block Checking: DBA = 25179287, Block Type = KTB-managed data block
data header at 0xb52c0064
kdbchk: row locked by non-existent transaction
        table=0   slot=0
        lockid=2   ktbbhitc=2
Page 13463 failed with check code 6101
DBVERIFY - Verification complete
Total Pages Examined         : 15360
Total Pages Processed (Data) : 12931
Total Pages Failing   (Data) : 2
Total Pages Processed (Index): 0
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 291
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 2137
Total Pages Marked Corrupt   : 1
Total Pages Influx           : 0
Total Pages Encrypted        : 0
Highest block SCN            : 1551567 (0.1551567)

修改datafile header的kcvfhrfn值

BBED> set block 1
        BLOCK#          1
BBED> map
 File: /u01/oracle/oradata/ora11g/xifenfei02.dbf (1)
 Block: 1                                     Dba:0x00400001
------------------------------------------------------------
 Data File Header
 struct kcvfh, 860 bytes                    @0
 ub4 tailchk                                @8188
BBED> p kcvfhrfn
ub4 kcvfhrfn                                @368      0x00000006
BBED> m /x 00000000
Warning: contents of previous BIFILE will be lost. Proceed? (Y/N) y
 File: /u01/oracle/oradata/ora11g/xifenfei02.dbf (1)
 Block: 1                Offsets:  368 to  879           Dba:0x00400001
------------------------------------------------------------------------
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
 00000000 00000000 00000000 00000000 e3b38c2e 04a91100 00000000 00000000
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
 00000000 00000000 00000000 00000000 00000000 797c1900 00000000 3e3bba2e
 01000000 11000000 b40e0000 1000ba8a 02000000 00000000 00000000 00000000
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
 00000000 00000000 00000000 00000000 00000000 0a000a00 0a000100 00000000
 00000000 00000000 02008001 bb050c00 00000000 00000000 00000000 00000000
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
 <32 bytes per line>
BBED> sum apply
Check value for File 1, Block 1:
current = 0x1daf, required = 0x1daf

dbv验证数据文件
这里如果验证所有数据文件快,会发现所有类此Corrupt block relative dba: 0x00000001 (file 0, block 1)提示.这里证明datafile header 的kcvfhrfn 影响dbv检查数据文件坏块的一个标准之一

[oracle@xifenfei tmp]$ dbv file=/u01/oracle/oradata/ora11g/xifenfei02.dbf blocksize=8192 start=1 end=4
DBVERIFY: Release 11.2.0.3.0 - Production on Fri Jun 1 03:43:27 2012
Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
DBVERIFY - Verification starting : FILE = /u01/oracle/oradata/ora11g/xifenfei02.dbf
Page 1 is marked corrupt
Corrupt block relative dba: 0x00000001 (file 0, block 1)
Bad header found during dbv:
Data in bad block:
 type: 11 format: 2 rdba: 0x01800001
 last change scn: 0x0000.00000000 seq: 0x1 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x00000b01
 check value in block header: 0x1daf
 computed block checksum: 0x0
Page 2 is marked corrupt
Corrupt block relative dba: 0x00000002 (file 0, block 2)
Bad header found during dbv:
Data in bad block:
 type: 29 format: 2 rdba: 0x01800002
 last change scn: 0x0000.0017accf seq: 0x1 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0xaccf1d01
 check value in block header: 0x2626
 computed block checksum: 0x0
Page 3 is marked corrupt
Corrupt block relative dba: 0x00000003 (file 0, block 3)
Bad header found during dbv:
Data in bad block:
 type: 30 format: 2 rdba: 0x01800003
 last change scn: 0x0000.0017accf seq: 0x1 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0xaccf1e01
 check value in block header: 0x4ef4
 computed block checksum: 0x0
Page 4 is marked corrupt
Corrupt block relative dba: 0x00000004 (file 0, block 4)
Bad header found during dbv:
Data in bad block:
 type: 30 format: 2 rdba: 0x01800004
 last change scn: 0x0000.00119bf4 seq: 0x1 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x9bf41e01
 check value in block header: 0x810a
 computed block checksum: 0x0
DBVERIFY - Verification complete
Total Pages Examined         : 4
Total Pages Processed (Data) : 0
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 0
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 0
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 0
Total Pages Marked Corrupt   : 4
Total Pages Influx           : 0
Total Pages Encrypted        : 0
Highest block SCN            : 0 (0.0)

总结说明
1.dbv检测数据文件坏块的时候会读取数据文件头的kcvfhrfn值,如果这个值出现问题,可能导致数据文件中的所有数据块都异常,具体表现就是Corrupt block relative dba项异常
2.dbv检查数据文件坏块中显示的rdba对应于数据块的rdba_kcbh值

忘记执行end backup命令数据库恢复

遇到两次begin backup忘记end backup导致的悲剧.虽然不是自己亲身经历,但是感触很深,这里做了一个小实验,说明在begin backup后忘记end backup,而又丢失了备份归档日志,且数据库异常重启的事故恢复(这里为了加大实验难道,并且使用begin backup命令后的热备文件恢复)
模拟begin end

SQL> archive log list;
Database log mode              Archive Mode
Automatic archival             Enabled
Archive destination            /u01/oracle/oradata/xifenfei/archive
Oldest online log sequence     37
Next log sequence to archive   39
Current log sequence           39
SQL> alter tablespace bbed begin backup;
Tablespace altered.
SQL> archive log list;
Database log mode              Archive Mode
Automatic archival             Enabled
Archive destination            /u01/oracle/oradata/xifenfei/archive
Oldest online log sequence     37
Next log sequence to archive   39
Current log sequence           39
SQL> drop table chf.t_xff;
Table dropped.
SQL> create table chf.t_xff
  2    as
  3  select * from dba_objects;
Table created.
SQL> alter system switch logfile;
System altered.
SQL>   delete from chf.t_XFF;
30811 rows deleted.
SQL>   commit;
Commit complete.
SQL>   alter system  switch logfile;
System altered.
SQL> /
System altered.
SQL> archive log list;
Database log mode              Archive Mode
Automatic archival             Enabled
Archive destination            /u01/oracle/oradata/xifenfei/archive
Oldest online log sequence     40
Next log sequence to archive   42
Current log sequence           42

cp备份文件

[oracle@xifenfei xifenfei]$ cp bbed01.dbf bbed01.dbf_05
[oracle@xifenfei xifenfei]$ cp bbed02.dbf bbed02.dbf_05

继续操作数据库

SQL> alter system switch logfile;
System altered.
SQL>   insert into chf.t_xff
  2    select * from dba_objects;
30811 rows created.
SQL> commit;
Commit complete.
SQL>  archive log list;
Database log mode              Archive Mode
Automatic archival             Enabled
Archive destination            /u01/oracle/oradata/xifenfei/archive
Oldest online log sequence     41
Next log sequence to archive   43
Current log sequence           43
SQL> alter system switch logfile;
System altered.

模拟异常关闭数据库

SQL> shutdown immediate;
ORA-01149: cannot shutdown - file 11 has online backup set
ORA-01110: data file 11: '/u01/oracle/oradata/xifenfei/bbed01.dbf'
SQL> shutdown abort;
ORACLE instance shut down.

删除部分归档日志(模拟归档日志丢失)

[oracle@xifenfei archive]$ mv 1_39.dbf 1_39.dbf_bak
[oracle@xifenfei archive]$ mv 1_40.dbf 1_40.dbf_bak

启动数据库

[oracle@xifenfei xifenfei]$ sqlplus "/ as sysdba"
SQL*Plus: Release 9.2.0.4.0 - Production on Tue Jun 5 03:02:56 2012
Copyright (c) 1982, 2002, Oracle Corporation.  All rights reserved.
Connected to an idle instance.
SQL> startup
ORACLE instance started.
Total System Global Area  353441008 bytes
Fixed Size                   451824 bytes
Variable Size             184549376 bytes
Database Buffers          167772160 bytes
Redo Buffers                 667648 bytes
Database mounted.
ORA-01113: file 11 needs media recovery
ORA-01110: data file 11: '/u01/oracle/oradata/xifenfei/bbed01.dbf'

分析相关SCN

SQL> select file#,online_status "STATUS",to_char(change#,'9999999999999999') "SCN",
  2  To_char(time,'yyyy-mm-dd hh24:mi:ss')"TIME" from v$recover_file;
     FILE# STATUS  SCN               TIME
---------- ------- ----------------- -------------------
        11 ONLINE     12286828683164 2012-06-05 02:55:43
        12 ONLINE     12286828683164 2012-06-05 02:55:43
SQL> select file#,to_char(checkpoint_change#,'999999999999999') "SCN",
  2  to_char(last_change#,'999999999999999')"STOP_SCN" from v$datafile;
     FILE# SCN              STOP_SCN
---------- ---------------- ----------------
         1   12286828684636
         2   12286828684636
         3   12286828684636
         4   12286828684636
         5   12286828684636
         6   12286828684636
         7   12286828684636
         8   12286828684636
         9   12286828684636
        10   12286828684636
        11   12286828683164
        12   12286828683164
12 rows selected.
SQL> select file#,to_char(checkpoint_change#,'9999999999999999') "SCN",
  2  to_char(RESETLOGS_CHANGE#,'9999999999999999') "RESETLOGS SCN"
  3  from v$datafile_header;
     FILE# SCN               RESETLOGS SCN
---------- ----------------- -----------------
         1    12286828684636            174968
         2    12286828684636            174968
         3    12286828684636            174968
         4    12286828684636            174968
         5    12286828684636            174968
         6    12286828684636            174968
         7    12286828684636            174968
         8    12286828684636            174968
         9    12286828684636            174968
        10    12286828684636            174968
        11    12286828683164            174968
        12    12286828683164            174968
12 rows selected.
SQL> select file#,to_char(CHANGE#,'9999999999999999') "SCN",
  2  to_char(TIME,'yyyy-mm-dd hh24:mi:ss') "TIME" from v$backup;
     FILE# SCN               TIME
---------- ----------------- -------------------
         1                 0
         2                 0
         3                 0
         4                 0
         5                 0
         6                 0
         7                 0
         8                 0
         9                 0
        10                 0
        11    12286828683164 2012-06-05 02:55:43
        12    12286828683164 2012-06-05 02:55:43
12 rows selected.

发现数据库未end backup

Tue Jun  5 02:55:43 2012
alter tablespace bbed begin backup
Tue Jun  5 02:55:43 2012
Completed: alter tablespace bbed begin backup

尝试end backup
出现这个错误是正常的,因为我替换回来的bbed表空间数据文件的版本信息可能和控制文件的不一致,解决方法是重建控制文件

SQL> alter tablespace bbed end backup;
alter tablespace bbed end backup
*
ERROR at line 1:
ORA-01235: END BACKUP failed for 2 file(s) and succeeded for 0
ORA-01122: database file 12 failed verification check
ORA-01110: data file 12: '/u01/oracle/oradata/xifenfei/bbed02.dbf'
ORA-01208: data file is an old version - not accessing current version
ORA-01122: database file 11 failed verification check
ORA-01110: data file 11: '/u01/oracle/oradata/xifenfei/bbed01.dbf'
ORA-01208: data file is an old version - not accessing current version

重建控制文件

SQL> shutdown abort;
ORACLE instance shut down.
SQL> STARTUP NOMOUNT
Total System Global Area  353441008 bytes
Fixed Size                   451824 bytes
Variable Size             184549376 bytes
Database Buffers          167772160 bytes
Redo Buffers                 667648 bytes
SQL>@ctl.sql
Control file created.

尝试恢复数据库

SQL> recover database;
ORA-00279: change 12286828683164 generated at 06/05/2012 02:55:43 needed for
thread 1
ORA-00289: suggestion : /u01/oracle/oradata/xifenfei/archive/1_39.dbf
ORA-00280: change 12286828683164 for thread 1 is in sequence #39
Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
auto
ORA-00308: cannot open archived log
'/u01/oracle/oradata/xifenfei/archive/1_39.dbf'
ORA-27037: unable to obtain file status
Linux Error: 2: No such file or directory
Additional information: 3
ORA-00308: cannot open archived log
'/u01/oracle/oradata/xifenfei/archive/1_39.dbf'
ORA-27037: unable to obtain file status
Linux Error: 2: No such file or directory
Additional information: 3

执行end backup

SQL> alter tablespace bbed end backup;
Tablespace altered.

再次查看相关SCN
可以发现end backup之后,datafile header 的scn发生了改变,说明begin backup主要是冻住了datafile header scn

SQL> select file#,online_status "STATUS",to_char(change#,'9999999999999999') "SCN",
  2  To_char(time,'yyyy-mm-dd hh24:mi:ss')"TIME" from v$recover_file;
     FILE# STATUS  SCN               TIME
---------- ------- ----------------- -------------------
         1 ONLINE     12286828684636 2012-06-05 03:00:46
         2 ONLINE     12286828684636 2012-06-05 03:00:46
         3 ONLINE     12286828684636 2012-06-05 03:00:46
         4 ONLINE     12286828684636 2012-06-05 03:00:46
         5 ONLINE     12286828684636 2012-06-05 03:00:46
         6 ONLINE     12286828684636 2012-06-05 03:00:46
         7 ONLINE     12286828684636 2012-06-05 03:00:46
         8 ONLINE     12286828684636 2012-06-05 03:00:46
         9 ONLINE     12286828684636 2012-06-05 03:00:46
        10 ONLINE     12286828684636 2012-06-05 03:00:46
        11 ONLINE     12286828683821 2012-06-05 02:56:26
        12 ONLINE     12286828683821 2012-06-05 02:56:26
12 rows selected.
SQL> select file#,to_char(checkpoint_change#,'999999999999999') "SCN",
  2  to_char(last_change#,'999999999999999')"STOP_SCN" from v$datafile;
     FILE# SCN              STOP_SCN
---------- ---------------- ----------------
         1   12286828684636
         2   12286828684636
         3   12286828684636
         4   12286828684636
         5   12286828684636
         6   12286828684636
         7   12286828684636
         8   12286828684636
         9   12286828684636
        10   12286828684636
        11   12286828684636
        12   12286828684636
12 rows selected.
SQL> select file#,to_char(checkpoint_change#,'9999999999999999') "SCN",
  2  to_char(RESETLOGS_CHANGE#,'9999999999999999') "RESETLOGS SCN"
  3  from v$datafile_header;
     FILE# SCN               RESETLOGS SCN
---------- ----------------- -----------------
         1    12286828684636            174968
         2    12286828684636            174968
         3    12286828684636            174968
         4    12286828684636            174968
         5    12286828684636            174968
         6    12286828684636            174968
         7    12286828684636            174968
         8    12286828684636            174968
         9    12286828684636            174968
        10    12286828684636            174968
        11    12286828683821            174968
        12    12286828683821            174968
12 rows selected.

再次尝试恢复数据库

SQL> recover database;
ORA-00279: change 12286828683821 generated at 06/05/2012 02:56:26 needed for
thread 1
ORA-00289: suggestion : /u01/oracle/oradata/xifenfei/archive/1_41.dbf
ORA-00280: change 12286828683821 for thread 1 is in sequence #41
Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
auto
Log applied.
Media recovery complete.
SQL> alter database open;
Database altered.

总结说明
在数据库忘记end backup,而又被异常重启数据库时候,会提示你需要恢复.这个时候如果你有所有的归档日志,那没有任何问题,直接recover就可以了.如果因为begin backup命令执行比较久,部分归档日志丢失,这个时候不能直接recover,可以先尝试end backup,然后在recover.如果在这个时候还发现有部分日志不存在,那只能考虑bbed修改datafile header的scn.
温馨提醒:各位dba在执行begin backup之后一定要记得end backup

ORACLE最大可以存储多少数据量

今天有朋友在群里面讨论oracle数据库最大可以存储的数据大小,下面根据官方文档提供的相关限制,大概估算出来oracle数据库最多可以存储的数据量

Physical Database Limits(11.2)

Item

Type of Limit

Limit Value

Database Block Size

Minimum

2048 bytes; must be a multiple of operating system physical block size

Database Block Size

Maximum

Operating system dependent; never more than 32 KB

Database Blocks

Minimum in initial extent of a segment

2 blocks

Database Blocks

Maximum per datafile

Platform dependent; typically 222 – 1 blocks

Controlfiles

Number of control files

1 minimum; 2 or more (on separate devices) strongly recommended

Controlfiles

Size of a control file

Dependent on operating system and database creation options; maximum of25,000 x (database block size)

Database files

Maximum per tablespace

Operating system dependent; usually 1022

Database files

Maximum per database

65533

May be less on some operating systems

Limited also by size of database blocks and by the DB_FILES initialization parameter for a particular instance

Database extents

Maximum per dictionary managed tablespace

4 GB * physical block size (with K/M modifier); 4 GB (without K/M modifier)

Database extents

Maximum per locally managed (uniform) tablespace

2 GB * physical block size (with K/M modifier); 2 GB (without K/M modifier)

Database file size

Maximum

Operating system dependent. Limited by maximum operating system file size; typically 222 or 4 MB blocks

MAXEXTENTS

Default value

Derived from tablespace default storage or DB_BLOCK_SIZE initialization parameter

MAXEXTENTS

Maximum

Unlimited

Redo Log Files

Maximum number of logfiles

Limited by value of MAXLOGFILES parameter in the CREATE DATABASE statement

Control file can be resized to allow more entries; ultimately an operating system limit

Redo Log Files

Maximum number of logfiles per group

Unlimited

Redo Log File Size

Minimum size

4 MB

Redo Log File Size

Maximum Size

Operating system limit; typically 2 GB

Tablespaces

Maximum number per database

64 K

Number of tablespaces cannot exceed the number of database files because each tablespace must include at least one file

Bigfile Tablespaces

Number of blocks

A bigfile tablespace contains only one datafile or tempfile, which can contain up to approximately 4 billion ( 232 ) blocks. The maximum size of the single datafile or tempfile is 128 terabytes (TB) for a tablespace with 32 K blocks and 32 TB for a tablespace with 8 K blocks.

Smallfile (traditional) Tablespaces

Number of blocks

A smallfile tablespace is a traditional Oracle tablespace, which can contain 1022 datafiles or tempfiles, each of which can contain up to approximately 4 million (222) blocks.

External Tables file

Maximum size

Dependent on the operating system.

An external table can be composed of multiple files.

通过这里的相关限制可以大概的技术出来oracle数据库在传统数据文件和大数据文件情况下最大大小分别是:

传统数据文件(Smallfile)

32*1024(数据块大小)* (222–1)(一个数据文件的数据块数)*65533

(数据库中最多数据文件个数)= 9006784790495232(byte)/1024/1024/1024/1024=8191.6P
如果按照我们常用的block_size=8k,那么我们的数据库可以存储大小为2047.9P
大数据文件(Bigfile)

32*1024(数据块大小)* (232– 1)(一个数据文件的数据块数)*65533

(数据库中最多数据文件个数) = 9222949822242324480 (byte)/1024/1024/1024/1024 =8589541374P
如果按照我们常用的block_size=8k,那么我们的数据库可以存储大小为2147385343.5P
参考:http://docs.oracle.com/cd/E11882_01/server.112/e25513/limits002.htm