时间不同步导致ogg部署异常

一、错误检查

[oracle@srtcreen ~]$ ggsci
Oracle GoldenGate Command Interpreter for Oracle
Version 11.1.1.1.1 OGGCORE_11.1.1.1.1_PLATFORMS_110729.1700
Linux, x64, 64bit (optimized), Oracle 10g on Jul 29 2011 19:43:29
Copyright (C) 1995, 2011, Oracle and/or its affiliates. All rights reserved.
GGSCI (srtcreen) 1> info all
Program     Status      Group       Lag           Time Since Chkpt
MANAGER     RUNNING
REPLICAT    RUNNING     R_1         00:00:00      00:00:07
REPLICAT    RUNNING     R_2         00:00:00      00:00:05
REPLICAT    ABENDED     R_3         19:19:34      00:17:33
REPLICAT    STOPPED     R_4         00:00:00      19:44:24
REPLICAT    STOPPED     R_5         00:00:00      19:44:13
GGSCI (srtcreen) 2> view report r_3
***********************************************************************
                 Oracle GoldenGate Delivery for Oracle
     Version 11.1.1.1.1 OGGCORE_11.1.1.1.1_PLATFORMS_110729.1700
   Linux, x64, 64bit (optimized), Oracle 10g on Sep 13 2011 21:33:03
Copyright (C) 1995, 2011, Oracle and/or its affiliates. All rights reserved.
                    Starting at 2011-12-02 16:36:58
***********************************************************************
Operating System Version:
Linux
Version #1 SMP Fri Apr 2 14:58:14 EDT 2010, Release 2.6.18-194.el5
Node: srtcreen
Machine: x86_64
                         soft limit   hard limit
Address Space Size   :    unlimited    unlimited
Heap Size            :    unlimited    unlimited
File Size            :    unlimited    unlimited
CPU Time             :    unlimited    unlimited
Process id: 13398
Description:
***********************************************************************
**            Running with the following parameters                  **
***********************************************************************
replicat r_3
ASSUMETARGETDEFS
HANDLECOLLISIONS
SETENV (NLS_LANG =AMERICAN_AMERICA.ZHS16GBK)
Set environment variable (NLS_LANG=AMERICAN_AMERICA.ZHS16GBK)
userid it1ogg, password ********
--file for dicarded transaction --
discardfile /opt/OGG/discard/R_3.txt, append, megabytes 100
DDL
MAP CSCNEW.TAB_CS_CALL_PICKUP, TARGET SRT_CREEN.TAB_CS_CALL_PICKUP, KEYCOLS (CALL_ID);
CACHEMGR virtual memory values (may have been adjusted)
CACHEBUFFERSIZE:                         64K
CACHESIZE:                              512M
CACHEBUFFERSIZE (soft max):               4M
CACHEPAGEOUTSIZE (normal):                4M
PROCESS VM AVAIL FROM OS (min):           1G
CACHESIZEMAX (strict force to disk):    881M
Database Version:
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bi
PL/SQL Release 10.2.0.4.0 - Production
CORE    10.2.0.4.0      Production
TNS for Linux: Version 10.2.0.4.0 - Production
NLSRTL Version 10.2.0.4.0 - Production
Database Language and Character Set:
NLS_LANG = "AMERICAN_AMERICA.ZHS16GBK"
NLS_LANGUAGE     = "AMERICAN"
NLS_TERRITORY    = "AMERICA"
NLS_CHARACTERSET = "ZHS16GBK"
For further information on character set settings, please refer to user manual.
***********************************************************************
**                     Run Time Messages                             **
***********************************************************************
Opened trail file /opt/OGG/dirdat/U3000000 at 2011-12-02 16:36:58
MAP resolved (entry CSCNEW.TAB_CS_CALL_PICKUP):
  MAP CSCNEW.TAB_CS_CALL_PICKUP, TARGET SRT_CREEN.TAB_CS_CALL_PICKUP, KEYCOLS (CALL_ID);
Using following columns in default map by name:
  PICKUP_ID, CALL_ID, CALL_SERIAL, USER_ID, PICKUP_TIME, CALL_RESULT,
  FAIL_REASON, CALL_TIME, CALL_DURA, END_TIME
Using the following key columns for target table SRT_CREEN.TAB_CS_CALL_PICKUP: CALL_ID.
2011-12-02 16:36:58  WARNING OGG-00869  OCI Error ORA-01407: cannot update ("SRT_CREEN"."TAB_CS_CALL_PICKUP"."PICKUP_ID") to NULL (status = 1407), SQL <UPDAT
E "SRT_CREEN"."TAB_CS_CALL_PICKUP" SET "PICKUP_ID" = :a0,"CALL_SERIAL" = :a2,"USER_ID" = :a3,"PICKUP_TIME" = :a4,"CALL_RESULT" = :a5,"FAIL_REASON" = :a6,"CAL
L_TIME" = :a7,"CALL_DURA" = :a8,"END_T>.
2011-12-02 16:36:58  WARNING OGG-01004  Aborted grouped transaction on 'SRT_CREEN.TAB_CS_CALL_PICKUP', Database error 1407 (OCI Error ORA-01407: cannot updat
e ("SRT_CREEN"."TAB_CS_CALL_PICKUP"."PICKUP_ID") to NULL (status = 1407), SQL <UPDATE "SRT_CREEN"."TAB_CS_CALL_PICKUP" SET "PICKUP_ID" = :a0,"CALL_SERIAL" =
:a2,"USER_ID" = :a3,"PICKUP_TIME" = :a4,"CALL_RESULT" = :a5,"FAIL_REASON" = :a6,"CALL_TIME" = :a7,"CALL_DURA" = :a8,"END_T>).
2011-12-02 16:36:58  WARNING OGG-01003  Repositioning to rba 924 in seqno 0.
2011-12-02 16:36:59  WARNING OGG-01154  SQL error 1407 mapping CSCNEW.TAB_CS_CALL_PICKUP to SRT_CREEN.TAB_CS_CALL_PICKUP OCI Error ORA-01407: cannot update (
"SRT_CREEN"."TAB_CS_CALL_PICKUP"."PICKUP_ID") to NULL (status = 1407), SQL <UPDATE "SRT_CREEN"."TAB_CS_CALL_PICKUP" SET "PICKUP_ID" = :a0,"CALL_SERIAL" = :a2
,"USER_ID" = :a3,"PICKUP_TIME" = :a4,"CALL_RESULT" = :a5,"FAIL_REASON" = :a6,"CALL_TIME" = :a7,"CALL_DURA" = :a8,"END_T>.
2011-12-02 16:36:59  WARNING OGG-01003  Repositioning to rba 924 in seqno 0.
***********************************************************************
*                   ** Run Time Statistics **                         *
***********************************************************************
Last record for the last committed transaction is the following:
___________________________________________________________________
Trail name :  /opt/OGG/dirdat/U3000000
Hdr-Ind    :     E  (x45)     Partition  :     .  (x04)
UndoFlag   :     .  (x00)     BeforeAfter:     A  (x41)
RecLength  :   157 (x009d)    IO Time    : 2011-12-01 21:17:24.084108
IOType     :    15  (x0f)     OrigNode   :   255  (xff)
TransInd   :     .  (x03)     FormatType :     R  (x52)
SyskeyLen  :     0  (x00)     Incomplete :     .  (x00)
AuditRBA   :       3225       AuditPos   : 42227728
Continued  :     N  (x00)     RecCount   :     1  (x01)
2011-12-01 21:17:24.084108 FieldComp          Len   157 RBA 924
Name: CSCNEW.TAB_CS_CALL_PICKUP
___________________________________________________________________
Reading /opt/OGG/dirdat/U3000000, current RBA 924, 0 records
Report at 2011-12-02 16:36:59 (activity since 2011-12-02 16:36:58)
From Table CSCNEW.TAB_CS_CALL_PICKUP to SRT_CREEN.TAB_CS_CALL_PICKUP:
       #                   inserts:         0
       #                   updates:         0
       #                   deletes:         0
       #                  discards:         1
DDL replication statistics:
                    Operations:         0
             Mapped operations:         0
           Unmapped operations:         0
              Other operations:         0
           Excluded operations:         0
                        Errors:         0
                Retried errors:         0
              Discarded errors:         0
                Ignored errors:         0
Last log location read:
     FILE:      /opt/OGG/dirdat/U3000000
     SEQNO:     0
     RBA:       924
     TIMESTAMP: 2011-12-01 21:17:24.084108
     EOF:       NO
     READERR:   0
2011-12-02 16:36:59  ERROR   OGG-01668  PROCESS ABENDING.
--发现奇怪现象,我2011-12-02早上过来检查这个,发现时间竟然显示2011-12-02 16:36:59,第一反应系统时间错误,继续检查
GGSCI (srtcreen) 2> info r_2
REPLICAT   R_2       Last Started 2011-12-01 21:09   Status RUNNING
Checkpoint Lag       00:00:00 (updated 00:00:05 ago)
Log Read Checkpoint  File /opt/OGG/dirdat/U2000000
                     2011-12-02 17:01:29.927591  RBA 6234
GGSCI (srtcreen) 3> exit
[oracle@srtcreen OGG]$ cd dirdat
[oracle@srtcreen dirdat]$ ll
total 396
-rw-rw-rw- 1 oracle oinstall      0 Dec  1 21:10 U1000000
-rw-rw-rw- 1 oracle oinstall   5984 Dec  2 05:36 U2000000
-rw-rw-rw- 1 oracle oinstall 392258 Dec  2 16:52 U3000000
[oracle@srtcreen dirdat]$ stat U3000000
  File: `U3000000'
  Size: 392258          Blocks: 776        IO Block: 4096   regular file
Device: fd00h/64768d    Inode: 213844220   Links: 1
Access: (0666/-rw-rw-rw-)  Uid: (  501/  oracle)   Gid: (  501/oinstall)
Access: 2011-12-02 16:36:59.000000000 +0800
Modify: 2011-12-02 16:52:55.000000000 +0800
Change: 2011-12-02 16:52:55.000000000 +0800
###############查看源端数据库服务器时间#####################
[oracle@tykf ~]$ date
Fri Dec  2 08:50:10 CST 2011
[oracle@tykf ~]$ ggsci
Oracle GoldenGate Command Interpreter for Oracle
Version 11.1.1.0.0 Build 078
Linux, x64, 64bit (optimized), Oracle 10 on Jul 28 2010 13:21:11
Copyright (C) 1995, 2010, Oracle and/or its affiliates. All rights reserved.
GGSCI (tykf) 1> info all
Program     Status      Group       Lag           Time Since Chkpt
MANAGER     RUNNING
EXTRACT     RUNNING     EXT_1       00:00:00      00:00:05
EXTRACT     RUNNING     EXT_2       00:00:00      00:00:01
EXTRACT     RUNNING     EXT_3       00:00:00      00:00:05
EXTRACT     STOPPED     EXT_4       00:00:00      21:59:56
EXTRACT     STOPPED     EXT_5       00:00:00      21:59:28
EXTRACT     RUNNING     P_1         00:00:00      00:00:04
EXTRACT     RUNNING     P_2         00:00:00      00:00:04
EXTRACT     RUNNING     P_3         00:00:00      00:00:05
EXTRACT     STOPPED     P_4         00:00:00      21:56:42
EXTRACT     STOPPED     P_5         00:00:00      21:56:11
GGSCI (tykf) 2> info ext_3
EXTRACT    EXT_3     Last Started 2011-12-01 13:11   Status RUNNING
Checkpoint Lag       00:00:00 (updated 00:00:03 ago)
Log Read Checkpoint  Oracle Redo Logs
                     2011-12-02 08:50:06  Seqno 3233, RBA 32267264
GGSCI (tykf) 3> info p_3
EXTRACT    P_3       Last Started 2011-12-01 13:11   Status RUNNING
Checkpoint Lag       00:00:00 (updated 00:00:00 ago)
Log Read Checkpoint  File /opt/OGG/dirdat/extract/A3000000
                     2011-12-02 08:50:15.000000  RBA 393705
--ext_3和p_3是目标端r_3的对应进程
#######################################################################

果然是系统时间错误,源端和目标端相差了近八个小时

二、更正目标端时间

[oracle@srtcreen dirdat]$ date
Fri Dec  2 16:55:55 CST 2011
[oracle@srtcreen OGG]$ su - root
Password:
[root@srtcreen ~]# date -s 08:58:20
Fri Dec  2  08:58:20 CST
[root@srtcreen ~]# clock -w
[root@srtcreen ~]# date
Fri Dec  2  08:58:28 CST
[root@srtcreen ~]# su - oracle
[oracle@srtcreen dirdat]$ ggsci
Oracle GoldenGate Command Interpreter for Oracle
Version 11.1.1.1.1 OGGCORE_11.1.1.1.1_PLATFORMS_110729.1700
Linux, x64, 64bit (optimized), Oracle 10g on Jul 29 2011 19:43:29
Copyright (C) 1995, 2011, Oracle and/or its affiliates. All rights reserved.
GGSCI (srtcreen) 1> info all
Program     Status      Group       Lag           Time Since Chkpt
MANAGER     RUNNING
REPLICAT    RUNNING     R_1         00:00:00      unknown
REPLICAT    RUNNING     R_2         00:00:00      unknown
REPLICAT    ABENDED     R_3         19:19:34      unknown
REPLICAT    STOPPED     R_4         00:00:00      11:54:31
REPLICAT    STOPPED     R_5         00:00:00      11:54:20
[oracle@srtcreen OGG]$ cd dirdat
[oracle@srtcreen dirdat]$ ll
total 412
-rw-rw-rw- 1 oracle oinstall      0 Dec  1 21:10 U1000000
-rw-rw-rw- 1 oracle oinstall   6485 Dec  2  2011 U2000000
-rw-rw-rw- 1 oracle oinstall 407519 Dec  2 09:04 U3000000

最新写到目标端的数据已经是当前修改时间(这个是系统时间,肯定会修改过来)
出现Time Since Chkpt unknown,处理方法见ogg中Time Since Chkpt显示unknown解决

三、重设r_3时间点

GGSCI (srtcreen) 7> alter r_3,begin 2011-12-02 09:00:00
REPLICAT altered.
GGSCI (srtcreen) 9> start r_3
Sending START request to MANAGER ...
REPLICAT R_3 starting
GGSCI (srtcreen) 10> info all
Program     Status      Group       Lag           Time Since Chkpt
MANAGER     RUNNING
REPLICAT    RUNNING     R_1         00:00:00      00:00:05
REPLICAT    RUNNING     R_2         00:00:00      00:00:02
REPLICAT    ABENDED     R_3         unknown       00:00:00
REPLICAT    STOPPED     R_4         00:00:00      11:55:40
REPLICAT    STOPPED     R_5         00:00:00      11:55:29
GGSCI (srtcreen) 11> view report r_3
***********************************************************************
                 Oracle GoldenGate Delivery for Oracle
     Version 11.1.1.1.1 OGGCORE_11.1.1.1.1_PLATFORMS_110729.1700
   Linux, x64, 64bit (optimized), Oracle 10g on Sep 13 2011 21:33:03
Copyright (C) 1995, 2011, Oracle and/or its affiliates. All rights reserved.
                    Starting at 2011-12-02 09:09:31
***********************************************************************
Operating System Version:
Linux
Version #1 SMP Fri Apr 2 14:58:14 EDT 2010, Release 2.6.18-194.el5
Node: srtcreen
Machine: x86_64
                         soft limit   hard limit
Address Space Size   :    unlimited    unlimited
Heap Size            :    unlimited    unlimited
File Size            :    unlimited    unlimited
CPU Time             :    unlimited    unlimited
Process id: 13629
Description:
***********************************************************************
**            Running with the following parameters                  **
***********************************************************************
replicat r_3
ASSUMETARGETDEFS
HANDLECOLLISIONS
SETENV (NLS_LANG =AMERICAN_AMERICA.ZHS16GBK)
Set environment variable (NLS_LANG=AMERICAN_AMERICA.ZHS16GBK)
userid it1ogg, password ********
--file for dicarded transaction --
discardfile /opt/OGG/discard/R_3.txt, append, megabytes 100
DDL
MAP CSCNEW.TAB_CS_CALL_PICKUP, TARGET SRT_CREEN.TAB_CS_CALL_PICKUP, KEYCOLS (CALL_ID);
CACHEMGR virtual memory values (may have been adjusted)
CACHEBUFFERSIZE:                         64K
CACHESIZE:                              512M
CACHEBUFFERSIZE (soft max):               4M
CACHEPAGEOUTSIZE (normal):                4M
PROCESS VM AVAIL FROM OS (min):           1G
CACHESIZEMAX (strict force to disk):    881M
Database Version:
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bi
PL/SQL Release 10.2.0.4.0 - Production
CORE    10.2.0.4.0      Production
TNS for Linux: Version 10.2.0.4.0 - Production
NLSRTL Version 10.2.0.4.0 - Production
Database Language and Character Set:
NLS_LANG = "AMERICAN_AMERICA.ZHS16GBK"
NLS_LANGUAGE     = "AMERICAN"
NLS_TERRITORY    = "AMERICA"
NLS_CHARACTERSET = "ZHS16GBK"
For further information on character set settings, please refer to user manual.
***********************************************************************
**                     Run Time Messages                             **
***********************************************************************
Opened trail file /opt/OGG/dirdat/U3000000 at 2011-12-02 09:09:32
MAP resolved (entry CSCNEW.TAB_CS_CALL_PICKUP):
  MAP CSCNEW.TAB_CS_CALL_PICKUP, TARGET SRT_CREEN.TAB_CS_CALL_PICKUP, KEYCOLS (CALL_ID);
Using following columns in default map by name:
  PICKUP_ID, CALL_ID, CALL_SERIAL, USER_ID, PICKUP_TIME, CALL_RESULT,
  FAIL_REASON, CALL_TIME, CALL_DURA, END_TIME
Using the following key columns for target table SRT_CREEN.TAB_CS_CALL_PICKUP: CALL_ID.
2011-12-02 09:09:32  WARNING OGG-00869  OCI Error ORA-01407: cannot update ("SRT_CREEN"."TAB_CS_CALL_PICKUP"."PICKUP_ID") to NULL (status = 1407), SQL <UPDAT
E "SRT_CREEN"."TAB_CS_CALL_PICKUP" SET "PICKUP_ID" = :a0,"CALL_SERIAL" = :a2,"USER_ID" = :a3,"PICKUP_TIME" = :a4,"CALL_RESULT" = :a5,"FAIL_REASON" = :a6,"CAL
L_TIME" = :a7,"CALL_DURA" = :a8,"END_T>.
2011-12-02 09:09:32  WARNING OGG-01004  Aborted grouped transaction on 'SRT_CREEN.TAB_CS_CALL_PICKUP', Database error 1407 (OCI Error ORA-01407: cannot updat
e ("SRT_CREEN"."TAB_CS_CALL_PICKUP"."PICKUP_ID") to NULL (status = 1407), SQL <UPDATE "SRT_CREEN"."TAB_CS_CALL_PICKUP" SET "PICKUP_ID" = :a0,"CALL_SERIAL" =
:a2,"USER_ID" = :a3,"PICKUP_TIME" = :a4,"CALL_RESULT" = :a5,"FAIL_REASON" = :a6,"CALL_TIME" = :a7,"CALL_DURA" = :a8,"END_T>).
2011-12-02 09:09:32  WARNING OGG-01003  Repositioning to rba 375833 in seqno 0.
2011-12-02 09:09:32  WARNING OGG-01154  SQL error 1407 mapping CSCNEW.TAB_CS_CALL_PICKUP to SRT_CREEN.TAB_CS_CALL_PICKUP OCI Error ORA-01407: cannot update (
"SRT_CREEN"."TAB_CS_CALL_PICKUP"."PICKUP_ID") to NULL (status = 1407), SQL <UPDATE "SRT_CREEN"."TAB_CS_CALL_PICKUP" SET "PICKUP_ID" = :a0,"CALL_SERIAL" = :a2
,"USER_ID" = :a3,"PICKUP_TIME" = :a4,"CALL_RESULT" = :a5,"FAIL_REASON" = :a6,"CALL_TIME" = :a7,"CALL_DURA" = :a8,"END_T>.
2011-12-02 09:09:32  WARNING OGG-01003  Repositioning to rba 375833 in seqno 0.
Source Context :
  SourceModule            : [er.main]
  SourceID                : [/scratch/angorant/view_storage/angorant_ogg_12978807_x64/oggcore/OpenSys/src/app/er/rep.c]
  SourceFunction          : [take_rep_err_action]
  SourceLine              : [16134]
  ThreadBacktrace         : [8] elements
                          : [/opt/OGG/replicat(CMessageContext::AddThreadContext()+0x26) [0x5ef8b6]]
                          : [/opt/OGG/replicat(CMessageFactory::CreateMessage(CSourceContext*, unsigned int, ...)+0x7b2) [0x5e6382]]
                          : [/opt/OGG/replicat(_MSG_ERR_MAP_TO_TANDEM_FAILED(CSourceContext*, DBString<777> const&, DBString<777> const&, CMessageFactory::Me
ssageDisposition)+0x9b) [0x5c4bcb]]
                          : [/opt/OGG/replicat [0x81ac2f]]
                          : [/opt/OGG/replicat [0x8f73e2]]
                          : [/opt/OGG/replicat(main+0x84b) [0x50764b]]
                          : [/lib64/libc.so.6(__libc_start_main+0xf4) [0x3e2f41d994]]
                          : [/opt/OGG/replicat(__gxx_personality_v0+0x1da) [0x4e3c2a]]
2011-12-02 09:09:32  ERROR   OGG-01296  Error mapping from CSCNEW.TAB_CS_CALL_PICKUP to SRT_CREEN.TAB_CS_CALL_PICKUP.
***********************************************************************
*                   ** Run Time Statistics **                         *
***********************************************************************
Last record for the last committed transaction is the following:
___________________________________________________________________
Trail name :  /opt/OGG/dirdat/U3000000
Hdr-Ind    :     E  (x45)     Partition  :     .  (x04)
UndoFlag   :     .  (x00)     BeforeAfter:     A  (x41)
RecLength  :   157 (x009d)    IO Time    : 2011-12-02 11:52:49.559112
IOType     :    15  (x0f)     OrigNode   :   255  (xff)
TransInd   :     .  (x03)     FormatType :     R  (x52)
SyskeyLen  :     0  (x00)     Incomplete :     .  (x00)
AuditRBA   :       3233       AuditPos   : 8194064
Continued  :     N  (x00)     RecCount   :     1  (x01)
2011-12-02 11:52:49.559112 FieldComp          Len   157 RBA 375833
Name: CSCNEW.TAB_CS_CALL_PICKUP
___________________________________________________________________
Reading /opt/OGG/dirdat/U3000000, current RBA 375833, 0 records
Report at 2011-12-02 09:09:32 (activity since 2011-12-02 09:09:32)
From Table CSCNEW.TAB_CS_CALL_PICKUP to SRT_CREEN.TAB_CS_CALL_PICKUP:
       #                   inserts:         0
       #                   updates:         0
       #                   deletes:         0
       #                  discards:         1
DDL replication statistics:
                    Operations:         0
             Mapped operations:         0
           Unmapped operations:         0
              Other operations:         0
           Excluded operations:         0
                        Errors:         0
                Retried errors:         0
              Discarded errors:         0
                Ignored errors:         0
Last log location read:
     FILE:      /opt/OGG/dirdat/U3000000
     SEQNO:     0
     RBA:       375833
     TIMESTAMP: 2011-12-02 11:52:49.559112
     EOF:       NO
     READERR:   0
2011-12-02 09:09:32  ERROR   OGG-01668  PROCESS ABENDING.
--这里可以看出我重设的当前的时间点对应的io time为2011-12-02 11:52:49.559112,而现在尚未到这个时间,进程启动失败
GGSCI (srtcreen) 2> alter r_3,begin 2011-12-02 18:00:00
REPLICAT altered.
--重设时间到今天早上我重设时间之前的错误时间之后
GGSCI (srtcreen) 3> start r_3
Sending START request to MANAGER ...
REPLICAT R_3 starting
GGSCI (srtcreen) 4> info all
Program     Status      Group       Lag           Time Since Chkpt
MANAGER     RUNNING
REPLICAT    RUNNING     R_1         00:00:00      00:00:06
REPLICAT    RUNNING     R_2         00:00:00      00:00:01
REPLICAT    RUNNING     R_3         unknown       00:00:08
REPLICAT    RUNNING     R_4         00:00:00      00:00:09
REPLICAT    RUNNING     R_5         00:00:00      00:00:09
GGSCI (srtcreen) 5> info all
Program     Status      Group       Lag           Time Since Chkpt
MANAGER     RUNNING
REPLICAT    RUNNING     R_1         00:00:00      00:00:09
REPLICAT    RUNNING     R_2         00:00:00      00:00:03
REPLICAT    RUNNING     R_3         00:00:00      00:00:00
REPLICAT    RUNNING     R_4         00:00:00      00:00:02
REPLICAT    RUNNING     R_5         00:00:00      00:00:01
GGSCI (srtcreen) 6> stats r_3
Sending STATS request to REPLICAT R_3 ...
No active replication maps
DDL replication statistics:
*** Total statistics since replicat started     ***
        Operations                                   0.00
        Mapped operations                            0.00
        Unmapped operations                          0.00
        Other operations                             0.00
        Excluded operations                          0.00
        Errors                                       0.00
        Retried errors                               0.00
        Discarded errors                             0.00
        Ignored errors                               0.00
--进程启动,工作正常

三、补充说明
1、部署ogg前需要核对两边时间是否相同,为了避免不必要的麻烦,建议配置ntp同步时间
2、这里因为我们这要求该同步上线无严格时间要求,所以通过重设r_3时间点实现工作正常,如果事情比较紧急,建议删除trail文件,重建pump和replicat进程
3、这里说明ogg同步也还不是和时间一点关系都没有

Mysql查询视图:ERROR 1449 (HY000)

1、问题重现
前几天因为有人删除了数据库中的记录,今天关闭了数据库的远程访问功能,今天接到开发报告,说出现 The user specified as a definer (‘air’@’%’) does not exist错误,他们定位是一张视图不能访问。利用实验重现了他们的情况

[root@ECP-UC-DB1 ~]# mysql -uxff -pxifenfei
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 8846
Server version: 5.5.14-log MySQL Community Server (GPL)
Copyright (c) 2000, 2010, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> select user,host from mysql.user;
+------+---------------+
| user | host          |
+------+---------------+
| xff  | %             |
| root | 127.0.0.1     |
| repl | 192.168.11.10 |
| root | ::1           |
|      | ECP-UC-DB1    |
| root | ECP-UC-DB1    |
| root | localhost     |
+------+---------------+
7 rows in set (0.08 sec)
mysql> use xifenfei;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
mysql> create view v_users as select * from wp_users;
Query OK, 0 rows affected (0.14 sec)
mysql> select count(*) from xifenfei.v_users;
+----------+
| count(*) |
+----------+
|        2 |
+----------+
1 row in set (0.03 sec)
mysql> update mysql.user set host='localhost' where user='xff' and host='%';
Query OK, 1 row affected (0.05 sec)
Rows matched: 1  Changed: 1  Warnings: 0
mysql> FLUSH PRIVILEGES;
Query OK, 0 rows affected (0.12 sec)
mysql> exit
Bye
[root@ECP-UC-DB1 ~]# mysql -uxff -pxifenfei
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 8847
Server version: 5.5.14-log MySQL Community Server (GPL)
Copyright (c) 2000, 2010, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> use xff;
ERROR 1049 (42000): Unknown database 'xff'
mysql> use xifenfei;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
mysql> select * from v_users ;
ERROR 1449 (HY000): The user specified as a definer ('xff'@'%') does not exist

2、解决方法

mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| test               |
| xifenfei           |
+--------------------+
5 rows in set (0.00 sec)
mysql> use information_schema;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
mysql> desc VIEWS;
+----------------------+--------------+------+-----+---------+-------+
| Field                | Type         | Null | Key | Default | Extra |
+----------------------+--------------+------+-----+---------+-------+
| TABLE_CATALOG        | varchar(512) | NO   |     |         |       |
| TABLE_SCHEMA         | varchar(64)  | NO   |     |         |       |
| TABLE_NAME           | varchar(64)  | NO   |     |         |       |
| VIEW_DEFINITION      | longtext     | NO   |     | NULL    |       |
| CHECK_OPTION         | varchar(8)   | NO   |     |         |       |
| IS_UPDATABLE         | varchar(3)   | NO   |     |         |       |
| DEFINER              | varchar(77)  | NO   |     |         |       |
| SECURITY_TYPE        | varchar(7)   | NO   |     |         |       |
| CHARACTER_SET_CLIENT | varchar(32)  | NO   |     |         |       |
| COLLATION_CONNECTION | varchar(32)  | NO   |     |         |       |
+----------------------+--------------+------+-----+---------+-------+
10 rows in set (0.02 sec)
mysql> select TABLE_SCHEMA,TABLE_NAME,DEFINER from views;
+--------------+------------+---------+
| TABLE_SCHEMA | TABLE_NAME | DEFINER |
+--------------+------------+---------+
| xifenfei     | v_users    | xff@%   |
+--------------+------------+---------+
1 row in set (0.16 sec)
mysql> create or replace view v_users as select * from wp_users;
ERROR 1044 (42000): Access denied for user 'xff'@'localhost' to database 'information_schema'
mysql> create or replace view xifenfei.v_users as select * from xifenfei.wp_users;
Query OK, 0 rows affected (0.02 sec)
mysql> select TABLE_SCHEMA,TABLE_NAME,DEFINER from views;
+--------------+------------+---------------+
| TABLE_SCHEMA | TABLE_NAME | DEFINER       |
+--------------+------------+---------------+
| xifenfei     | v_users    | xff@localhost |
+--------------+------------+---------------+
1 row in set (0.01 sec)
mysql> select count(*) from xifenfei.v_users;
+----------+
| count(*) |
+----------+
|        2 |
+----------+
1 row in set (0.03 sec)

3、原因分析
因为创建视图使用的是xff@%用户(目前已经不存在),然后登录用户使用的是xff@localhost用户,导致mysql认为现在的用户无权限访问该视图,解决方法就是在当前用户下重建该视图

Data pump 中network_link参数的使用续(登录用户和源端用户不一致处理)

1、准备工作

[oracle@node1 ~]$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.3.0 Production on Wed Nov 30 09:08:04 2011
Copyright (c) 1982, 2011, Oracle.  All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, Oracle Label Security, OLAP, Data Mining,
Oracle Database Vault and Real Application Testing options
SQL> create user test identified by xifenfei;
User created.
SQL> grant connect,resource to test;
Grant succeeded.
SQL> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, Oracle Label Security, OLAP, Data Mining,
Oracle Database Vault and Real Application Testing options

2、远程导出数据

[oracle@node1 ~]$  expdp test/xifenfei directory=test_dir dumpfile=t1.dmp network_link=dblink_test tables=chf.T2_1
Export: Release 11.2.0.3.0 - Production on Wed Nov 30 09:09:35 2011
Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, Oracle Label Security, OLAP, Data Mining,
Oracle Database Vault and Real Application Testing options
ORA-39002: invalid operation
ORA-39070: Unable to open the log file.
ORA-39087: directory name TEST_DIR is invalid
--对目录需要相关权限
[oracle@node1 ~]$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.3.0 Production on Wed Nov 30 09:09:59 2011
Copyright (c) 1982, 2011, Oracle.  All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, Oracle Label Security, OLAP, Data Mining,
Oracle Database Vault and Real Application Testing options
SQL> grant read,write on directory test_dir to test;
Grant succeeded.
SQL>  expdp test/xifenfei directory=test_dir dumpfile=t1.dmp network_link=dblink_test tables=chf.T2_1
SP2-0734: unknown command beginning "expdp test..." - rest of line ignored.
SQL> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, Oracle Label Security, OLAP, Data Mining,
Oracle Database Vault and Real Application Testing options
[oracle@node1 ~]$  expdp test/xifenfei directory=test_dir dumpfile=t1.dmp network_link=dblink_test tables=chf.T2_1
Export: Release 11.2.0.3.0 - Production on Wed Nov 30 09:10:23 2011
Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, Oracle Label Security, OLAP, Data Mining,
Oracle Database Vault and Real Application Testing options
ORA-31631: privileges are required
ORA-39109: Unprivileged users may not operate upon other users' schemas
--执行导出用户需要相关权限exp_full_database
[oracle@node1 ~]$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.3.0 Production on Wed Nov 30 09:11:54 2011
Copyright (c) 1982, 2011, Oracle.  All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, Oracle Label Security, OLAP, Data Mining,
Oracle Database Vault and Real Application Testing options
SQL> grant exp_full_database  to test;
Grant succeeded.
SQL> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, Oracle Label Security, OLAP, Data Mining,
Oracle Database Vault and Real Application Testing options
[oracle@node1 ~]$  expdp test/xifenfei directory=test_dir dumpfile=t1.dmp network_link=dblink_test tables=chf.T2_1
Export: Release 11.2.0.3.0 - Production on Wed Nov 30 09:14:29 2011
Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, Oracle Label Security, OLAP, Data Mining,
Oracle Database Vault and Real Application Testing options
Starting "TEST"."SYS_EXPORT_TABLE_01":  test/******** directory=test_dir dumpfile=t1.dmp network_link=dblink_test tables=chf.T2_1
Estimate in progress using BLOCKS method...
Processing object type TABLE_EXPORT/TABLE/TABLE_DATA
Total estimation using BLOCKS method: 22 MB
Processing object type TABLE_EXPORT/TABLE/TABLE
Processing object type TABLE_EXPORT/TABLE/STATISTICS/TABLE_STATISTICS
. . exported "CHF"."T2_1"                                9.326 MB  100000 rows
Master table "TEST"."SYS_EXPORT_TABLE_01" successfully loaded/unloaded
******************************************************************************
Dump file set for TEST.SYS_EXPORT_TABLE_01 is:
  /tmp/t1.dmp
Job "TEST"."SYS_EXPORT_TABLE_01" successfully completed at 09:14:35
--实现用test登录,导出源端chf用户下面的表
[oracle@node1 ~]$ rm /tmp/t1.dmp
[oracle@node1 ~]$  expdp test/xifenfei directory=test_dir dumpfile=t1.dmp network_link=dblink_test tables=T2_1
Export: Release 11.2.0.3.0 - Production on Wed Nov 30 09:14:57 2011
Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, Oracle Label Security, OLAP, Data Mining,
Oracle Database Vault and Real Application Testing options
ORA-39001: invalid argument value
ORA-39195: At least one schema in the TABLE_FILTER does not exist.
--如果当前登录用户和表所属用户不同,需要指定schema的名称

3、直接导入数据到目标端

[oracle@node1 ~]$ impdp test/xifenfei directory=test_dir network_link=dblink_test REMAP_SCHEMA=chf:xff REMAP_TABLESPACE=odu:users tables=chf.t2_1
Import: Release 11.2.0.3.0 - Production on Wed Nov 30 09:45:38 2011
Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, Oracle Label Security, OLAP, Data Mining,
Oracle Database Vault and Real Application Testing options
ORA-31631: privileges are required
ORA-39109: Unprivileged users may not operate upon other users' schemas
--登录用户需要imp_full_database权限
[oracle@node1 ~]$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.3.0 Production on Wed Nov 30 09:46:41 2011
Copyright (c) 1982, 2011, Oracle.  All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, Oracle Label Security, OLAP, Data Mining,
Oracle Database Vault and Real Application Testing options
SQL> grant imp_full_database to test;
Grant succeeded.
SQL> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, Oracle Label Security, OLAP, Data Mining,
Oracle Database Vault and Real Application Testing options
[oracle@node1 ~]$ impdp test/xifenfei directory=test_dir network_link=dblink_test REMAP_SCHEMA=chf:xff REMAP_TABLESPACE=odu:users tables=chf.t2_1
Import: Release 11.2.0.3.0 - Production on Wed Nov 30 09:46:59 2011
Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, Oracle Label Security, OLAP, Data Mining,
Oracle Database Vault and Real Application Testing options
Starting "TEST"."SYS_IMPORT_TABLE_01":  test/******** directory=test_dir network_link=dblink_test REMAP_SCHEMA=chf:xff REMAP_TABLESPACE=odu:users tables=chf.t2_1
Estimate in progress using BLOCKS method...
Processing object type TABLE_EXPORT/TABLE/TABLE_DATA
Total estimation using BLOCKS method: 22 MB
Processing object type TABLE_EXPORT/TABLE/TABLE
. . imported "XFF"."T2_1"                                100000 rows
Processing object type TABLE_EXPORT/TABLE/STATISTICS/TABLE_STATISTICS
Job "TEST"."SYS_IMPORT_TABLE_01" successfully completed at 09:47:07
--使用test用户登录,目标端用户为xff,源端用户为chf
[oracle@node1 ~]$ impdp test/xifenfei directory=test_dir network_link=dblink_test REMAP_SCHEMA=chf:xff REMAP_TABLESPACE=odu:users tables=t2_1
Import: Release 11.2.0.3.0 - Production on Wed Nov 30 09:52:54 2011
Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, Oracle Label Security, OLAP, Data Mining,
Oracle Database Vault and Real Application Testing options
ORA-39001: invalid argument value
ORA-39195: At least one schema in the TABLE_FILTER does not exist.
--不同用户时,需要指定源端表所属用户

Data pump 中network_link参数的使用

一、准备工作
1、源端

[oracle@ECP-UC-DB1 ~]$ sqlplus  chf/xifenfei
SQL*Plus: Release 10.2.0.4.0 - Production on Tue Nov 29 15:07:35 2011
Copyright (c) 1982, 2007, Oracle.  All Rights Reserved.
Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
SQL>  select count(*) from T2_1 ;
  COUNT(*)
----------
    100000

2、目标端

--tns配置
test =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.11.12)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = test)
    )
  )
[oracle@node1 ~]$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.3.0 Production on Tue Nov 29 14:57:08 2011
Copyright (c) 1982, 2011, Oracle.  All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, Oracle Label Security, OLAP, Data Mining,
Oracle Database Vault and Real Application Testing options
SQL> create public database link dblink_test
  2  connect to chf identified by xifenfei using 'test';
Database link created.
SQL> grant read,write on directory test_dir to xff;
Grant succeeded.

二、导入方式
1、expdp导出,impdp导入

[oracle@node1 ~]$ expdp chf/xff directory=test_dir dumpfile=t1_2.dmp network_link=dblink_test tables=T2_1
Export: Release 11.2.0.3.0 - Production on Tue Nov 29 15:05:36 2011
Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, Oracle Label Security, OLAP, Data Mining,
Oracle Database Vault and Real Application Testing options
Starting "CHF"."SYS_EXPORT_TABLE_01":  chf/******** directory=test_dir dumpfile=t1_2.dmp network_link=dblink_test tables=T2_1
Estimate in progress using BLOCKS method...
Processing object type TABLE_EXPORT/TABLE/TABLE_DATA
Total estimation using BLOCKS method: 22 MB
Processing object type TABLE_EXPORT/TABLE/TABLE
Processing object type TABLE_EXPORT/TABLE/STATISTICS/TABLE_STATISTICS
. . exported "CHF"."T2_1"                                9.326 MB  100000 rows
Master table "CHF"."SYS_EXPORT_TABLE_01" successfully loaded/unloaded
******************************************************************************
Dump file set for CHF.SYS_EXPORT_TABLE_01 is:
  /tmp/t1_2.dmp
Job "CHF"."SYS_EXPORT_TABLE_01" successfully completed at 15:06:52
--1、userid使用的是目标端(expdp端)登录
--2、tables对应的表所有者需要和userid相同
[oracle@node1 tmp]$ ll /tmp/t1_2.dmp
-rw-r----- 1 oracle oinstall 9859072 11-29 15:06 /tmp/t1_2.dmp
[oracle@node1 tmp]$ impdp xff/xifenfei directory=test_dir dumpfile=t1_2.dmp REMAP_SCHEMA=chf:xff REMAP_TABLESPACE=odu:users
Import: Release 11.2.0.3.0 - Production on Tue Nov 29 15:30:15 2011
Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, Oracle Label Security, OLAP, Data Mining,
Oracle Database Vault and Real Application Testing options
Master table "XFF"."SYS_IMPORT_FULL_01" successfully loaded/unloaded
Starting "XFF"."SYS_IMPORT_FULL_01":  xff/******** directory=test_dir dumpfile=t1_2.dmp REMAP_SCHEMA=chf:xff REMAP_TABLESPACE=odu:users
Processing object type TABLE_EXPORT/TABLE/TABLE
Processing object type TABLE_EXPORT/TABLE/TABLE_DATA
. . imported "XFF"."T2_1"                                9.326 MB  100000 rows
Processing object type TABLE_EXPORT/TABLE/STATISTICS/TABLE_STATISTICS
Job "XFF"."SYS_IMPORT_FULL_01" successfully completed at 15:30:18
[oracle@node1 tmp]$ sqlplus xff/xifenfei
SQL*Plus: Release 11.2.0.3.0 Production on Tue Nov 29 15:30:43 2011
Copyright (c) 1982, 2011, Oracle.  All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, Oracle Label Security, OLAP, Data Mining,
Oracle Database Vault and Real Application Testing options
SQL> select count(*) from t2_1;
  COUNT(*)
----------
    100000

2、使用impdp直接导入

[oracle@node1 tmp]$ impdp chf/xff directory=test_dir network_link=dblink_test REMAP_SCHEMA=chf:xff REMAP_TABLESPACE=odu:users tables=t2_1
Import: Release 11.2.0.3.0 - Production on Tue Nov 29 15:48:49 2011
Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, Oracle Label Security, OLAP, Data Mining,
Oracle Database Vault and Real Application Testing options
Starting "CHF"."SYS_IMPORT_TABLE_01":  chf/******** directory=test_dir network_link=dblink_test REMAP_SCHEMA=chf:xff REMAP_TABLESPACE=odu:users tables=t2_1
Estimate in progress using BLOCKS method...
Processing object type TABLE_EXPORT/TABLE/TABLE_DATA
Total estimation using BLOCKS method: 22 MB
Processing object type TABLE_EXPORT/TABLE/TABLE
. . imported "XFF"."T2_1"                                100000 rows
Processing object type TABLE_EXPORT/TABLE/STATISTICS/TABLE_STATISTICS
Job "CHF"."SYS_IMPORT_TABLE_01" successfully completed at 15:49:00
--1、userid使用的是impdp端登录
--2、tables对应的表所有者需要和userid相同(tables的表所有者是源端,使用REMAP_SCHEMA映射owner)
[oracle@node1 tmp]$ sqlplus xff/xifenfei
SQL*Plus: Release 11.2.0.3.0 Production on Tue Nov 29 15:51:18 2011
Copyright (c) 1982, 2011, Oracle.  All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, Oracle Label Security, OLAP, Data Mining,
Oracle Database Vault and Real Application Testing options
SQL> select count(*) from t2_1;
  COUNT(*)
----------
    100000

如果登录用户和需要导入的表用户名不同,暂不清楚怎么处理

Mysql误删除数据及其bug分析

一、现状描述
11月25日晚上8点40分接到现场电话,说我们公司所有员工的im不能正常登陆,im数据库服务器(mysql数据库)的公司表中对应的我们公司名称被删除,需要处理。接到这个异常后,第一想到的是下午下班前,收到一封ogg的警告邮件说ogg进程终止,然后我登陆数据库查看的时候,发现ogg已经工作正常,所以也就没有太多关注。既然已经出现了这个问题,那么先解决问题,再分析原因。因为这些都是ogg从oracle端同步过来的一些数据,所以直接从oracle那边初始化一份过来,然后重设同步程序就可以了。
二、错误分析
1、出现这个问题,第一想到的就是binlog,因为公司表的数据是从我们oracle那边同步过去的,而且oracle那边没有任何关于这个表的删除操作,所以我定位这个表的delete操作

[mysql@ezgclient mysqllog]$ mysqlbinlog mysqlbin.000150>/tmp/11_25.txt
[mysql@ezgclient mysqllog]$ vi /tmp/11_25.txt
……
# at 1396789
# at 1396875
#111125 16:15:31 server id 2  end_log_pos 1396875       Table_map: `a`.`abc` mapped to number 5304
#111125 16:15:31 server id 2  end_log_pos 1397000       Delete_rows: table id 5304 flags: STMT_END_F
BINLOG '
o07PThMCAAAAVgAAAItQFQAAALgUAAAAAAAABnNydGFpcgAPdGFiX3V1bV9jb21wYW55AA4DD/b2
CAwMCA8PDw8PCBAAARQAFgCAAIAAQABAAEAA+D8=
o07PThkCAAAAfQAAAAhRFQAQALgUAAAAAAEADv//AP7iAAAAFAC6vNbdysC1vL/GvLzT0M/euavL
voAAAAAAOLkeQYAAAAAAAAAAAAEpAAAAAAAAANkqumA8EgAAO8kaakoSAAABAAAAAAAAAA9CVTAw
MDAwMDAzNTgwNzg=
……

2、因为binlog_format采用的是row模式,所以需要进一步解析binglog

[mysql@ezgclient mysqllog]$ mysqlbinlog -v -v mysqlbin.000150>/tmp/11_25.txt
[mysql@ezgclient mysqllog]$ vi /tmp/11_25.txt
#111125 16:15:31 server id 2  end_log_pos 1396875   Table_map: `a`.`abc` mapped to number 5304
#111125 16:15:31 server id 2  end_log_pos 1397000   Delete_rows: table id 5304 flags: STMT_END_F
BINLOG '
o07PThMCAAAAVgAAAItQFQAAALgUAAAAAAAABnNydGFpcgAPdGFiX3V1bV9jb21wYW55AA4DD/b2
CAwMCA8PDw8PCBAAARQAFgCAAIAAQABAAEAA+D8=
o07PThkCAAAAfQAAAAhRFQAQALgUAAAAAAEADv//AP7iAAAAFAC6vNbdysC1vL/GvLzT0M/euavL
voAAAAAAOLkeQYAAAAAAAAAAAAEpAAAAAAAAANkqumA8EgAAO8kaakoSAAABAAAAAAAAAA9CVTAw
MDAwMDAzNTgwNzg=
'/*!*/;
### DELETE FROM a.abc
### WHERE
###   @1=226 /* INT meta=0 nullable=0 is_null=0 */
###   @2='*****有限公司' /* VARSTRING(256) meta=256 nullable=0 is_null=0 */
###   @3=951656001 /* DECIMAL(20,0) meta=5120 nullable=0 is_null=0 */
###   @4=000000001 /* DECIMAL(22,0) meta=5632 nullable=1 is_null=0 */
###   @5=41 /* LONGINT meta=0 nullable=1 is_null=0 */
###   @6=2005-05-30 15:11:29 /* DATETIME meta=0 nullable=1 is_null=0 */
###   @7=2011-08-17 02:02:19 /* DATETIME meta=0 nullable=1 is_null=0 */
###   @8=1 /* LONGINT meta=0 nullable=1 is_null=0 */
###   @9='BU0000000358078' /* VARSTRING(128) meta=128 nullable=1 is_null=0 */
###   @10=NULL /* VARSTRING(128) meta=128 nullable=1 is_null=1 */
###   @11=NULL /* VARSTRING(128) meta=64 nullable=1 is_null=1 */
###   @12=NULL /* VARSTRING(128) meta=64 nullable=1 is_null=1 */
###   @13=NULL /* VARSTRING(128) meta=64 nullable=1 is_null=1 */
###   @14=NULL /* VARSTRING(128) meta=0 nullable=1 is_null=1 */
# at 1397000
#111125 16:15:31 server id 2  end_log_pos 1397027   Xid = 79238866
COMMIT/*!*/;
DELIMITER ;
# End of log file
ROLLBACK /* added by mysqlbinlog */;

通过上面的分析和这里的日志情况显示,很明显有人误删除了这条记录,导致我们公司所有员工不能登录im(登录在线的,不会使用到这条记录,这个也就是导致了我们到晚上八点多才发现这个异常)

3、检查error日志
检查这个日志,发现一个很明显的bug,这个是导致数据库重启,以及那个时间因为数据库重启导致ogg进程异常收到邮件

111125 16:15:35 InnoDB: Assertion failure in thread 1095162176 in file row/row0mysql.c line 1534
InnoDB: Failing assertion: index->type & DICT_CLUSTERED
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.1/en/forcing-recovery.html
InnoDB: about forcing recovery.
111125 16:15:35 - mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.
key_buffer_size=8388608
read_buffer_size=1048576
max_used_connections=30
max_threads=1000
threads_connected=14
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 9234379 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
thd: 0xb8bf170
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x4146d100 thread_stack 0x30000
/opt/mysql/product/5.1/bin/mysqld(my_print_stacktrace+0x2e)[0x8a74ce]
/opt/mysql/product/5.1/bin/mysqld(handle_segfault+0x322)[0x5dc992]
/lib64/libpthread.so.0[0x357980eb10]
/lib64/libc.so.6(gsignal+0x35)[0x3578c30265]
/lib64/libc.so.6(abort+0x110)[0x3578c31d10]
/opt/mysql/product/5.1/bin/mysqld(row_unlock_for_mysql+0x2f2)[0x7f4a52]
/opt/mysql/product/5.1/bin/mysqld(row_search_for_mysql+0x22e1)[0x802591]
/opt/mysql/product/5.1/bin/mysqld(_ZN11ha_innobase10index_readEPhPKhj16ha_rkey_function+0x192)[0x7724d2]
/opt/mysql/product/5.1/bin/mysqld(_ZN7handler16read_range_firstEPK12st_key_rangeS2_bb+0xbe)[0x6caa9e]
/opt/mysql/product/5.1/bin/mysqld(_ZN7handler22read_multi_range_firstEPP18st_key_multi_rangeS1_jbP17st_handler_buffer+0xce)[0x6c85be]
/opt/mysql/product/5.1/bin/mysqld(_ZN18QUICK_RANGE_SELECT8get_nextEv+0x127)[0x6aa557]
/opt/mysql/product/5.1/bin/mysqld[0x6c415d]
/opt/mysql/product/5.1/bin/mysqld(_Z12mysql_deleteP3THDP10TABLE_LISTP4ItemP11st_sql_listyyb+0x86c)[0x66fc5c]
/opt/mysql/product/5.1/bin/mysqld(_Z21mysql_execute_commandP3THD+0x38bf)[0x5f03af]
/opt/mysql/product/5.1/bin/mysqld(_Z11mysql_parseP3THDPKcjPS2_+0x357)[0x5f25e7]
/opt/mysql/product/5.1/bin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0xe63)[0x5f3453]
/opt/mysql/product/5.1/bin/mysqld(_Z10do_commandP3THD+0xe6)[0x5f3d16]
/opt/mysql/product/5.1/bin/mysqld(handle_one_connection+0x236)[0x5e66d6]
/lib64/libpthread.so.0[0x357980673d]
/lib64/libc.so.6(clone+0x6d)[0x3578cd3d1d]
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort...
thd->query at 0x2aab4890fcd0 is an invalid pointer
thd->thread_id=62259
thd->killed=NOT_KILLED
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
111125 16:15:35 mysqld_safe Number of processes running now: 0
111125 16:15:35 mysqld_safe mysqld restarted
InnoDB: Log scan progressed past the checkpoint lsn 0 694228728
111125 16:15:36 InnoDB: Database was not shut down normally!
InnoDB: Starting crash recovery.
InnoDB: Reading tablespace information from the .ibd files...
InnoDB: Restoring possible half-written data pages from the doublewrite
InnoDB: buffer...
InnoDB: Doing recovery: scanned up to log sequence number 0 694229872
111125 16:15:36 InnoDB: Starting an apply batch of log records to the database...
InnoDB: Progress in percents: 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 5
6 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99
InnoDB: Apply batch completed
InnoDB: Last MySQL binlog file position 0 1397027, file name /opt/mysql/mysqldata/mysqllog/mysqlbin.000149
111125 16:15:37 InnoDB: Started; log sequence number 0 694229872
111125 16:15:37 [Note] Recovering after a crash using /opt/mysql/mysqldata/mysqllog/mysqlbin
111125 16:15:37 [Note] Starting crash recovery...
111125 16:15:37 [Note] Crash recovery finished.

而且还有个巧合就是查看binlog发现,DELETE FROM a.abc where ……之后,数据库就因为这个bug自动重启了。

4、网友解释

# /opt/mysql/product/5.1/bin/mysqld(_Z12mysql_deleteP3THDP10TABLE_LISTP4ItemP11st_sql_listyyb+0x86c)[0x66fc5c]
# /opt/mysql/product/5.1/bin/mysqld(_Z21mysql_execute_commandP3THD+0x38bf)[0x5f03af]
---从这信息看是整理簇索引,导致表空间出现损坏
---分析的情况,你的系统应该正在做一个DELETE操作,而且应该无索引可走,删除的数据量也比较大
---可能是大量数据被缓存在innodb_buffer_pool_size中,并且其内部有创建自适应的hash索引,因删除数据而不得不重新创建,
---以及你的服务器当时IO出现瓶颈,导致一时无法响应Innodb master thread,而出现问题,并且InnoDB引擎在此方面出现过BUG
---解决版本是5.1.37之后,所以建议使用:5.1.40版本,较稳定

感谢jinguanding前辈热情帮助
http://www.itpub.net/forum.php?mod=viewthread&tid=1515971&page=1#pid18593129

ERROR: Error in Log_event::read_log_event(): 'Found invalid event in binary log'

今天晚上接到现场电话,说mysql数据库中异常丢失数据,我登陆系统使用mysqlbinlog进行分析日志

[mysql@ezgclient mysqldata]$ mysqlbinlog mysqlbin.000149>/tmp/11_25.txt
ERROR: Error in Log_event::read_log_event(): 'Found invalid event in binary log', data_len: 58, event_type: 19

发现这个错误,第一反应是我使用的mysqlbinlog的版本不正确

[mysql@ezgclient mysqldata]$ whereis  mysqlbinlog
mysqlbinlog: /usr/bin/mysqlbinlog
[mysql@ezgclient mysqldata]$ /usr/bin/mysqlbinlog -V
/usr/bin/mysqlbinlog Ver 3.2 for redhat-linux-gnu at x86_64
[mysql@ezgclient mysqldata]$ /opt/mysql/product/5.1/bin/mysqlbinlog -V
/opt/mysql/product/5.1/bin/mysqlbinlog Ver 3.3 for unknown-linux-gnu at x86_64

通过查询果然发现在默认情况下调用的是系统默认安装的mysql中的mysqlbinlog,因为这个mysqlbinlog的版本和当前的bin_log的版本不能对应起来,所以不能处理,使用对应的mysqlbinlog工作正常

[mysql@ezgclient mysqllog]$ /opt/mysql/product/5.1/bin/mysqlbinlog mysqlbin.000150>/tmp/11_25.txt
[mysql@ezgclient mysqllog]$ more /tmp/11_25.txt
/*!40019 SET @@session.max_insert_delayed_threads=0*/;
/*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/;
DELIMITER /*!*/;
# at 4
#111125 16:15:37 server id 2  end_log_pos 106   Start: binlog v 4, server v 5.1.35-log created 111125 16:15:37 at startup
# Warning: this binlog was not closed properly. Most probably mysqld crashed writing it.
ROLLBACK/*!*/;
BINLOG '
……………………………………

因域名解析导致数据库连接延迟分析

一、现状记录

[oracle@node1 ~]$ /sbin/ifconfig
eth1      Link encap:Ethernet  HWaddr 00:25:90:04:AB:6B
          inet addr:192.168.9.140  Bcast:192.168.15.255  Mask:255.255.248.0
          inet6 addr: fe80::225:90ff:fe04:ab6b/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:23530402 errors:0 dropped:0 overruns:0 frame:0
          TX packets:10959123 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:15308483748 (14.2 GiB)  TX bytes:10087987532 (9.3 GiB)
--IP地址为192.168.9.140
[oracle@node1 ~]$ more /etc/hosts
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1               ecp-db localhost.localdomain localhost
192.168.9.140   node1.srtcloud.com
--域名node1.srtcloud.com对应ip192.168.9.140
[oracle@node1 ~]$ lsnrctl status
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=node1.srtcloud.com)(PORT=1521)))
STATUS of the LISTENER
------------------------
Alias                     LISTENER
Version                   TNSLSNR for Linux: Version 10.2.0.5.0 - Production
Start Date                04-NOV-2011 09:08:51
Uptime                    21 days 4 hr. 58 min. 45 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
Listener Parameter File   /opt/oracle/product/10.2.0/db_1/network/admin/listener.ora
Listener Log File         /opt/oracle/product/10.2.0/db_1/network/log/listener.log
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=node1.srtcloud.com)(PORT=1521)))
Services Summary...
Service "ecp" has 2 instance(s).
  Instance "ecp", status UNKNOWN, has 1 handler(s) for this service...
  Instance "ecp", status READY, has 1 handler(s) for this service...
Service "ecpXDB" has 1 instance(s).
  Instance "ecp", status READY, has 1 handler(s) for this service...
Service "ecp_XPT" has 1 instance(s).
  Instance "ecp", status READY, has 1 handler(s) for this service...
Service "ora11g" has 2 instance(s).
  Instance "ora11g", status UNKNOWN, has 1 handler(s) for this service...
  Instance "ora11g", status READY, has 1 handler(s) for this service...
Service "ora11gXDB" has 1 instance(s).
  Instance "ora11g", status READY, has 1 handler(s) for this service...
The command completed successfully
--说明:ora11g是oracle 11g,ecp是oracle 10g
--当前使用域名node1.srtcloud.com监听
[oracle@node1 ~]$ more /opt/oracle/product/10.2.0/db_1/network/admin/tnsnames.ora
# tnsnames.ora Network Configuration File: /opt/oracle/product/10.2.0/db_1/network/admin/tnsnames.ora
# Generated by Oracle configuration tools.
ECP =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = node1.srtcloud.com)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = ecp)
    )
  )
ORA11G =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = node1.srtcloud.com)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = ora11g)
    )
  )
--tns也通过域名访问
[oracle@node1 ~]$ more /etc/resolv.conf
nameserver 211.155.235.201
nameserver 211.155.235.188
--当前有效的dns服务器
[oracle@node1 ~]$ more /etc/nsswitch.conf |grep hosts:
hosts:     files dns
--域名解析顺序

二、数据库正常工作分析
1、tns工作:客户端通过tns访问数据库,tns配置的是域名访问,所以需要解析,因为此刻解析的顺序是先利用/etc/hosts解析,所以读取hosts文件,获取到ip,然后访问对应数据库,和监听接触。
2、监听工作:监听的是域名,其实也是通过hosts解析成ip的
3、这里能够正常的工作,是因为hosts文件解析了域名

三、模拟数据库访问延迟

[oracle@node1 ~]$ more /etc/nsswitch.conf |grep hosts:
hosts:     dns files
--先使用dns服务器解析,再使用hosts文件
[oracle@node1 ~]$ more /etc/resolv.conf
nameserver 11.1.1.1
--无效的dns服务器
[oracle@node1 ~]$ sqlplus chf/xifenfei@ora11g
SQL*Plus: Release 10.2.0.5.0 - Production on Fri Nov 25 14:44:55 2011
Copyright (c) 1982, 2010, Oracle.  All Rights Reserved.
--会在这里一个很长的时间等待
[oracle@node1 ~]$ lsnrctl status
LSNRCTL for Linux: Version 10.2.0.5.0 - Production on 25-NOV-2011 14:48:26
Copyright (c) 1991, 2010, Oracle.  All rights reserved.
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=node1.srtcloud.com)(PORT=1521)))
--也会一个长时间的等待
--问题原因:就是因为解析域名的时候,先去访问dns服务器,因为该ip非dns服务器ip,所以会一直等待该ip超时,
--然后访问hosts文件获取ip地址(这个就是为什么我们登录或者查看监听状态的时候,会出现如此长的时间的等待)

其实因为dns延迟的现象有很多种,我这里只是举了一个最简单,比较常见的例子,在处理因dns解析的监听延迟的问题上,可以参考下面几点:
1、如果非特殊情况,尽可能使用ip地址在监听和tns中
2、如果是使用域名,请尽可能使用hosts解析,解析顺序配置为files优先(因为dns服务器有很多不确定,不可控因素)
3、如果一定要使用dns服务器解析,请把稳定的dns服务器配置在第一项,尽可能避免出现dns服务器不可达或者不存在该域名的现象

statement suspended, wait error to be cleared

一、现场报告
导入数据到Processing object type SCHEMA_EXPORT/TABLE/INDEX/INDEX的时候,hang住了,求救

[oracle@TestServer-RHAS-5 dmpdir]$ impdp system/ DIRECTORY=dmpdir DUMPFILE=cscnew.20111123.dmp LOGFILE=cscnew.20111123.log SCHEMAS=CSCNEW remap_schema=CSCNEW:TESTB remap_tablespace=CSC_TAB_1:TESTB table_exists_action=replace
…………
. . imported "TESTB"."TAB_CS_SELF_WORKTIME"                  0 KB       0 rows
. . imported "TESTB"."TAB_CS_SELF_WORKTIME_DETAIL"           0 KB       0 rows
. . imported "TESTB"."TAB_CS_USERMENU"                       0 KB       0 rows
. . imported "TESTB"."TAB_PUB_BANK"                          0 KB       0 rows
. . imported "TESTB"."TAB_PUB_BUSISRVINFO"                   0 KB       0 rows
. . imported "TESTB"."TAB_PUB_CONTACT"                       0 KB       0 rows
Processing object type SCHEMA_EXPORT/TABLE/GRANT/OWNER_GRANT/OBJECT_GRANT
Processing object type SCHEMA_EXPORT/TABLE/INDEX/INDEX

二、处理过程
1、分析是否是impdp是否因为网络等情况终止掉

[oracle@TestServer-RHAS-5 ~]$ ps -ef|grep impdp
oracle    2520  1837  0 09:59 pts/8    00:00:00 grep impdp
oracle   23819 20966  0 09:39 pts/6    00:00:00 impdp         DIRECTORY=dmpdir DUMPFILE=cscnew.20111123.dmp LOGFILE=cscnew.20111123.log SCHEMAS=CSCNEW remap_schema=CSCNEW:TESTB remap_tablespace=CSC_TAB_1:TESTB table_exists_action=replace
[oracle@TestServer-RHAS-5 ~]$ ps -ef|grep LOCAL=YES
oracle    2692  1837  0 10:00 pts/8    00:00:00 grep LOCAL=YES
oracle   10754 10694  0 09:15 ?        00:00:09 oraclemcrm (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
oracle   23835 23819  0 09:40 ?        00:00:00 oraclemcrm (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))

通过上面的查询,证明impdp进程工作正常

2、查询等待事件

[oracle@TestServer-RHAS-5 ~]$ sqlplus / as sysdba
SQL*Plus: Release 10.2.0.4.0 - Production on Thu Nov 24 10:00:26 2011
Copyright (c) 1982, 2007, Oracle.  All Rights Reserved.
Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
SQL> select event from v$session_wait where wait_class#<>6;
EVENT
----------------------------------------------------------------
SQL*Net message to client
statement suspended, wait error to be cleared

通过这个查询,发现一个异常等待事件:statement suspended, wait error to be cleared。
查询MOS,确定是表空间不足引起impdp suspended
Statement Suspended, Wait Error To Be Cleared Wait Event [ID 761848.1]

Oracle Database provides a means for suspending, and later resuming,
the execution of large database operations in the event of space allocation failures.
This enables you to take corrective action instead of the Oracle Database server returning an error to the user.
After the error condition is corrected, the suspended operation automatically resumes.
This feature is called resumable space allocation. The statements that are affected are called resumable statements.
The time between suspending the execution till correction of the error is reported as
"statement suspended, wait error to be cleared" wait event.

3、查看alert.log日志文件确认

[oracle@TestServer-RHAS-5 ~]$ cd /opt/oracle/admin/mcrm/bdump/
[oracle@TestServer-RHAS-5 bdump]$ tail -30 alert_mcrm.log
Thu Nov 24 09:29:20 2011
create tablespace testb
datafile '/opt/oradata/mcrm/testb.dbf'
size 1500M autoextend on next 50M maxsize 2000M
Thu Nov 24 09:29:51 2011
Completed: create tablespace testb
datafile '/opt/oradata/mcrm/testb.dbf'
size 1500M autoextend on next 50M maxsize 2000M
Thu Nov 24 09:40:00 2011
The value (30) of MAXTRANS parameter ignored.
kupprdp: master process DM00 started with pid=111, OS id=23858
         to execute - SYS.KUPM$MCP.MAIN('SYS_IMPORT_SCHEMA_01', 'SYSTEM', 'KUPC$C_1_20111124094000', 'KUPC$S_1_20111124094000', 0);
kupprdp: worker process DW01 started with worker id=1, pid=112, OS id=23870
         to execute - SYS.KUPW$WORKER.MAIN('SYS_IMPORT_SCHEMA_01', 'SYSTEM');
Thu Nov 24 09:43:11 2011
statement in resumable session 'SYSTEM.SYS_IMPORT_SCHEMA_01.1' was suspended due to
    ORA-01652: unable to extend temp segment by 128 in tablespace TESTB
Thu Nov 24 10:00:45 2011
Thread 1 advanced to log sequence 4761 (LGWR switch)
  Current log# 3 seq# 4761 mem# 0: /opt/oradata/mcrm/redo03.log

4、查询TESTB表空间使用情况

SQL> select bytes/1024/1024,maxbytes/1024/1024,user_bytes/1024/1024
  2  from dba_data_files where tablespace_name='TESTB';
BYTES/1024/1024 MAXBYTES/1024/1024 USER_BYTES/1024/1024
--------------- ------------------ --------------------
           2000               2000            1998.9375

5、解决问题

Thu Nov 24 10:04:21 2011
alter tablespace TESTB add datafile '/opt/oradata/mcrm/testb01.dbf' size 100m  autoextend on next 1m maxsize 30g
Thu Nov 24 10:04:25 2011
Completed: alter tablespace TESTB add datafile '/opt/oradata/mcrm/testb01.dbf' size 100m  autoextend on next 1m maxsize 30g
Thu Nov 24 10:04:26 2011
statement in resumable session 'SYSTEM.SYS_IMPORT_SCHEMA_01.1' was resumed

通过这个日志可以看出,表空间不足的问题解决后(可以添加数据文件,或者resize数据文件大小),impdp的job又开始运行

ORA-09968, ORA-01102 When Starting a Database

一、网友错误
pub上网友遇到一个问题

Tue Nov 22 10:31:19 2011
ALTER DATABASE   MOUNT
Tue Nov 22 10:31:19 2011
sculkget: failed to lock /u01/app/oracle/product/10.2.01/db_1/dbs/lkORCL exclusive
sculkget: lock held by PID: 26308
Tue Nov 22 10:31:19 2011
ORA-09968: unable to lock file
Linux Error: 11: Resource temporarily unavailable
Additional information: 26308
Tue Nov 22 10:31:19 2011
ORA-1102 signalled during: ALTER DATABASE   MOUNT...

我给的建议是重启数据库解决,其实重启数据库是关闭了当前开启的实例,然后开启报错的实例,所以重启成功

二、错误重现

Tue Nov 22 10:31:19 2011
ALTER DATABASE   MOUNT
Tue Nov 22 10:31:19 2011
sculkget: failed to lock /u01/app/oracle/product/10.2.01/db_1/dbs/lkORCL exclusive
sculkget: lock held by PID: 26308
Tue Nov 22 10:31:19 2011
ORA-09968: unable to lock file
Linux Error: 11: Resource temporarily unavailable
Additional information: 26308
Tue Nov 22 10:31:19 2011
ORA-1102 signalled during: ALTER DATABASE   MOUNT...
[oracle@ECP-UC-DB1 ~]$ sqlplus / as sysdba
SQL*Plus: Release 10.2.0.4.0 - Production on Wed Nov 23 09:07:21 2011
Copyright (c) 1982, 2007, Oracle.  All Rights Reserved.
Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
SQL> select status from v$instance;
STATUS
------------
OPEN
SQL> show parameter name ;
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
db_file_name_convert                 string
db_name                              string      test
db_unique_name                       string      test
global_names                         boolean     FALSE
instance_name                        string      test
lock_name_space                      string
log_file_name_convert                string
service_names                        string      test
SQL> show parameter control;
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
control_file_record_keep_time        integer     7
control_files                        string      /opt/oracle/oradata/test/contr
                                                 ol01.ctl
SQL> create pfile='/tmp/t_pfile' from spfile;
File created.
SQL> exit
Disconnected from Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
[oracle@ECP-UC-DB1 ~]$ vi /tmp/t_pfile
*.__db_cache_size=67108864
*.__java_pool_size=4194304
*.__large_pool_size=4194304
*.__shared_pool_size=117440512
*.__streams_pool_size=8388608
*.archive_lag_target=0
*.audit_file_dest='/opt/oracle/admin/test/adump'
*.background_dump_dest='/opt/oracle/admin/test/bdump'
*.compatible='10.2.0.3.0'
*.control_files='/opt/oracle/oradata/test/control01.ctl'
*.core_dump_dest='/opt/oracle/admin/test/cdump'
*.db_block_size=8192
*.db_domain=''
*.db_file_multiblock_read_count=16
*.db_name='test'
*.db_recovery_file_dest='/opt/oracle/flash_recovery_area'
*.db_recovery_file_dest_size=2147483648
*.dispatchers='(PROTOCOL=TCP) (SERVICE=testXDB)'
*.job_queue_processes=10
*.log_archive_dest_1='location=/opt/oracle/oradata/test/archivelog'
*.open_cursors=1000
*.pga_aggregate_target=66060288
*.processes=150
*.remote_login_passwordfile='EXCLUSIVE'
*.sga_target=209715200
*.undo_management='AUTO'
*.undo_tablespace='UNDOTBS1'
*.user_dump_dest='/opt/oracle/admin/test/udump'
~
~
"/tmp/t_pfile" 28L, 1043C written
[oracle@ECP-UC-DB1 ~]$ export ORACLE_SID=tt1
[oracle@ECP-UC-DB1 ~]$ sqlplus / as sysdba
SQL*Plus: Release 10.2.0.4.0 - Production on Wed Nov 23 09:10:47 2011
Copyright (c) 1982, 2007, Oracle.  All Rights Reserved.
Connected to an idle instance.
SQL> startup pfile='/tmp/t_pfile' mount;
ORACLE instance started.
Total System Global Area  209715200 bytes
Fixed Size                  2082784 bytes
Variable Size             134219808 bytes
Database Buffers           67108864 bytes
Redo Buffers                6303744 bytes
ORA-01102: cannot mount database in EXCLUSIVE mode
SQL> exit
Disconnected from Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
[oracle@ECP-UC-DB1 ~]$ more /opt/oracle/admin/test/bdump/alert_tt1.log
Wed Nov 23 09:11:26 2011
Starting ORACLE instance (normal)
LICENSE_MAX_SESSION = 0
LICENSE_SESSIONS_WARNING = 0
Shared memory segment for instance monitoring created
Picked latch-free SCN scheme 3
Autotune of undo retention is turned on.
IMODE=BR
ILAT =18
LICENSE_MAX_USERS = 0
SYS auditing is disabled
ksdpec: called for event 13740 prior to event group initialization
Starting up ORACLE RDBMS Version: 10.2.0.4.0.
System parameters with non-default values:
  processes                = 150
  __shared_pool_size       = 117440512
  __large_pool_size        = 4194304
  __java_pool_size         = 4194304
  __streams_pool_size      = 8388608
  sga_target               = 209715200
  control_files            = /opt/oracle/oradata/test/control01.ctl
  db_block_size            = 8192
  __db_cache_size          = 67108864
  compatible               = 10.2.0.3.0
  log_archive_dest_1       = location=/opt/oracle/oradata/test/archivelog
  archive_lag_target       = 0
  db_file_multiblock_read_count= 16
  db_recovery_file_dest    = /opt/oracle/flash_recovery_area
  db_recovery_file_dest_size= 2147483648
  undo_management          = AUTO
  undo_tablespace          = UNDOTBS1
  remote_login_passwordfile= EXCLUSIVE
  db_domain                =
  dispatchers              = (PROTOCOL=TCP) (SERVICE=testXDB)
  job_queue_processes      = 10
  background_dump_dest     = /opt/oracle/admin/test/bdump
  user_dump_dest           = /opt/oracle/admin/test/udump
  core_dump_dest           = /opt/oracle/admin/test/cdump
  audit_file_dest          = /opt/oracle/admin/test/adump
  db_name                  = test
  open_cursors             = 1000
  pga_aggregate_target     = 66060288
PMON started with pid=2, OS id=28086
PSP0 started with pid=3, OS id=28088
MMAN started with pid=4, OS id=28090
DBW0 started with pid=5, OS id=28092
LGWR started with pid=6, OS id=28094
CKPT started with pid=7, OS id=28096
SMON started with pid=8, OS id=28098
RECO started with pid=9, OS id=28100
CJQ0 started with pid=10, OS id=28102
MMON started with pid=11, OS id=28104
Wed Nov 23 09:11:28 2011
starting up 1 dispatcher(s) for network address '(ADDRESS=(PARTIAL=YES)(PROTOCOL=TCP))'...
MMNL started with pid=12, OS id=28106
Wed Nov 23 09:11:28 2011
starting up 1 shared server(s) ...
Wed Nov 23 09:11:28 2011
ALTER DATABASE   MOUNT
Wed Nov 23 09:11:28 2011
sculkget: failed to lock /opt/oracle/product/10.2.0/db_1/dbs/lkTEST exclusive
sculkget: lock held by PID: 12339
Wed Nov 23 09:11:28 2011
ORA-09968: unable to lock file
Linux-x86_64 Error: 11: Resource temporarily unavailable
Additional information: 12339
Wed Nov 23 09:11:28 2011
ORA-1102 signalled during: ALTER DATABASE   MOUNT...

通过这个试验,再现了网友的ORA-09968, ORA-01102错误

三、MOS解释
ORA-09968, ORA-01102 When Starting a Database