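A friend of mine has been struggling with garbled characters in exp/imp operations: the garbling seems to appear at random, and he can never work out why. I ran the following experiment to show how to reason about the problem.
I. Preparation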
C:\Users\XIFENFEI>sqlplus / as sysdba
SQL*Plus: Release 11.2.0.1.0 Production on 星期四 11月 17 18:43:00 2011
Copyright (c) 1982, 2010, Oracle. All rights reserved.
连接到:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
With the Partitioning, Oracle Label Security, OLAP, Data Mining,
Oracle Database Vault and Real Application Testing options
SQL> create table TEST_EXP
2 (
3 A1 NUMBER,
4 A2 VARCHAR2(10 CHAR),
5 A3 VARCHAR2(10),
6 A4 NVARCHAR2(10),
7 A5 CHAR(10),
8 A6 NCHAR(10)
9 );
表已创建。
SQL> comment on column TEST_EXP.A1
2 is '数字类型----惜分飞';
注释已创建。
SQL> comment on column TEST_EXP.A2
2 is 'varchar类型1----惜分飞';
注释已创建。
SQL> comment on column TEST_EXP.A3
2 is 'varchar类型2----惜分飞';
注释已创建。
SQL> comment on column TEST_EXP.A4
2 is 'nvarchar类型----惜分飞';
注释已创建。
SQL> comment on column TEST_EXP.A5
2 is 'char类型----惜分飞';
注释已创建。
SQL> comment on column TEST_EXP.A6
2 is 'nchar类型----惜分飞';
注释已创建。
SQL> insert into test_exp values(1,'xifenfeicf','xifenfeicf','xff','xifenfei','xifenfei');
已创建 1 行。
SQL> insert into test_exp values(1,'惜分飞来向大家问好啦',
2 '杭州惜分飞','杭州惜分飞','杭州惜分飞','杭州惜分飞');
已创建 1 行。
SQL> commit;
提交完成。
SQL> col parameter for a30
SQL> col value for a20
SQL> select * FROM v$nls_parameters WHERE parameter LIKE '%CHARACTERSET%';
PARAMETER VALUE
------------------------------ --------------------
NLS_CHARACTERSET ZHS16GBK
NLS_NCHAR_CHARACTERSET AL16UTF16
SQL> exit
从 Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
With the Partitioning, Oracle Label Security, OLAP, Data Mining,
Oracle Database Vault and Real Application Testing options 断开
C:\Users\XIFENFEI>exp chf/xifenfei tables=test_exp file=d:\test_exp.dmp log=d:\test_exp.log
Export: Release 11.2.0.1.0 - Production on 星期四 11月 17 18:46:10 2011
Copyright (c) 1982, 2009, Oracle and/or its affiliates. All rights reserved.
连接到: Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
With the Partitioning, Oracle Label Security, OLAP, Data Mining,
Oracle Database Vault and Real Application Testing options
已导出 ZHS16GBK 字符集和 AL16UTF16 NCHAR 字符集
即将导出指定的表通过常规路径...
. . 正在导出表 TEST_EXP导出了 2 行
成功终止导出, 没有出现警告。
II. Importing with the AL32UTF8 client character set
C:\Users\XIFENFEI>set NLS_LANG=american_america.AL32UTF8
C:\Users\XIFENFEI>imp chf/xifenfei tables=test_exp file=d:/test_exp.dmp log=d:/test_exp.log fromuser=chf touser=chf
Import: Release 11.2.0.1.0 - Production on Thu Nov 17 19:24:58 2011
Copyright (c) 1982, 2009, Oracle and/or its affiliates. All rights reserved.
Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
With the Partitioning, Oracle Label Security, OLAP, Data Mining,
Oracle Database Vault and Real Application Testing options
Export file created by EXPORT:V11.02.00 via conventional path
import done in AL32UTF8 character set and AL16UTF16 NCHAR character set
import server uses ZHS16GBK character set (possible charset conversion)
export client uses ZHS16GBK character set (possible charset conversion)
. importing CHF's objects into CHF
. . importing table "TEST_EXP" 2 rows imported
Import terminated successfully without warnings.
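-- Note the messages above: a character set conversion took place
-- Character set of the exported dump file: ZHS16GBK
-- Character set of the current (import) client: AL32UTF8
-- Character set of the target database server: ZHS16GBK
-- So the conversion path is ZHS16GBK --> AL32UTF8 --> ZHS16GBK
-- Calling the ZHS16GBK --> AL32UTF8 step a "conversion" may not be entirely accurate
-- (ZHS16GBK is the encoding of the data in the dmp file that has already been generated, while AL32UTF8 is the encoding of the import client; whether a conversion really happens at this step is still to be determined)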
C:\Users\XIFENFEI>sqlplus chf/xifenfei
SQL*Plus: Release 11.2.0.1.0 Production on Thu Nov 17 19:25:58 2011
Copyright (c) 1982, 2010, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
With the Partitioning, Oracle Label Security, OLAP, Data Mining,
Oracle Database Vault and Real Application Testing options
SQL> col comments for a30
SQL> SELECT COLUMN_NAME,comments FROM DBA_COL_COMMENTS WHERE owner='CHF' AND TABLE_NAME='TEST_EXP';
COLUMN_NAME COMMENTS
------------------------------ ------------------------------
A1 数字类型----惜分飞
A2 varchar类型1----惜分飞
A3 varchar类型2----惜分飞
A4 nvarchar类型----惜分飞
A5 char类型----惜分飞
A6 nchar类型----惜分飞
6 rows selected.
SQL>select * from test_exp;
A1 A2 A3 A4 A5 A6
---------- -------------------- ---------- -------------------- ---------- --------------------
1 xifenfeicf xifenfeicf xff xifenfei xifenfei
1 惜分飞来向大家问好啦 杭州惜分飞 杭州惜分飞 杭州惜分飞 杭州惜分飞
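-- Queried in a new window, because the modified client encoding in the import window affects how the results are displayed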
III. Importing with the US7ASCII client character set
C:\Users\XIFENFEI>set NLS_LANG=american_america.US7ASCII
C:\Users\XIFENFEI>imp chf/xifenfei tables=test_exp file=d:/test_exp.dmp log=d:/test_exp.log fromuser=chf touser=chf
Import: Release 11.2.0.1.0 - Production on Thu Nov 17 19:35:10 2011
Copyright (c) 1982, 2009, Oracle and/or its affiliates. All rights reserved.
Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
With the Partitioning, Oracle Label Security, OLAP, Data Mining,
Oracle Database Vault and Real Application Testing options
Export file created by EXPORT:V11.02.00 via conventional path
import done in US7ASCII character set and AL16UTF16 NCHAR character set
import server uses ZHS16GBK character set (possible charset conversion)
export client uses ZHS16GBK character set (possible charset conversion)
. importing CHF's objects into CHF
. . importing table "TEST_EXP" 2 rows imported
Import terminated successfully without warnings.
SQL> col comments for a30
SQL> SELECT COLUMN_NAME,comments FROM DBA_COL_COMMENTS WHERE owner='CHF' AND TABLE_NAME='TEST_EXP';
COLUMN_NAM COMMENTS
---------- ------------------------------
A1 ????----???
A2 varchar??1----???
A3 varchar??2----???
A4 nvarchar??----???
A5 char??----???
A6 nchar??----???
6 rows selected.
-- The comments are garbled in both SQL*Plus and PL/SQL Developer
SQL> select * from test_exp;
A1 A2 A3 A4 A5 A6
---------- ---------- ---------- ---------- ---------- ----------
1 xifenfeicf xifenfeicf xff xifenfei xifenfei
1 ?????????? ????? ????? ????? ?????
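-- The rows display correctly when queried from PL/SQL Developer, but not from SQL*Plus
-- Why the row data displays correctly in PL/SQL Developer while the column comments do not still needs further investigation
-- Explanation: US7ASCII cannot represent Chinese characters, so converting from ZHS16GBK to US7ASCII turns them into question marks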
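The question marks above can be reproduced outside Oracle. The following is only a minimal sketch: it assumes that Python's "gbk" and "ascii" codecs are reasonable stand-ins for ZHS16GBK and US7ASCII, and the helper name to_us7ascii is made up for this illustration. It does not model what imp does internally; it only shows that a character with no 7-bit representation is replaced by "?" and cannot be recovered afterwards.

```python
# Assumption: Python's "ascii" codec approximates Oracle's US7ASCII, and the
# "replace" error handler approximates Oracle's replacement-character behaviour.

def to_us7ascii(text: str) -> str:
    """Re-encode text as 7-bit ASCII, substituting '?' for every
    character that has no ASCII representation."""
    return text.encode("ascii", errors="replace").decode("ascii")

comment = "varchar类型1----惜分飞"   # column comment used in the test table
value = "杭州惜分飞"                  # value inserted into A3..A6

print(to_us7ascii(comment))  # varchar??1----???
print(to_us7ascii(value))    # ?????

# Plain ASCII data passes through untouched, which is why the first row survives.
print(to_us7ascii("xifenfeicf"))  # xifenfeicf

# The loss is permanent: encoding the result back to GBK does not restore the bytes.
print(to_us7ascii(value).encode("gbk") == value.encode("gbk"))  # False
```

Each Chinese character becomes exactly one "?", which matches the number of question marks shown in the comments output above.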
IV. Importing with the ZHS16GBK client character set
C:\Users\XIFENFEI>set NLS_LANG=SIMPLIFIED CHINESE_CHINA.ZHS16GBK
C:\Users\XIFENFEI>imp chf/xifenfei tables=test_exp file=d:/test_exp.dmp log=d:/test_exp.log fromuser=chf touser=chf
Import: Release 11.2.0.1.0 - Production on 星期四 11月 17 20:26:39 2011
Copyright (c) 1982, 2009, Oracle and/or its affiliates. All rights reserved.
连接到: Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
With the Partitioning, Oracle Label Security, OLAP, Data Mining,
Oracle Database Vault and Real Application Testing options
经由常规路径由 EXPORT:V11.02.00 创建的导出文件
已经完成 ZHS16GBK 字符集和 AL16UTF16 NCHAR 字符集中的导入
. 正在将 CHF 的对象导入到 CHF
. . 正在导入表 "TEST_EXP"导入了 2 行
成功终止导入, 没有出现警告。
-- Note the messages above: no character set conversion took place
SQL> col comments for a30
SQL> SELECT COLUMN_NAME,comments FROM DBA_COL_COMMENTS WHERE owner='CHF' AND TABLE_NAME='TEST_EXP';
COLUMN_NAME COMMENTS
------------------------------ ------------------------------
A1 数字类型----惜分飞
A2 varchar类型1----惜分飞
A3 varchar类型2----惜分飞
A4 nvarchar类型----惜分飞
A5 char类型----惜分飞
A6 nchar类型----惜分飞
6 rows selected.
SQL>select * from test_exp;
A1 A2 A3 A4 A5 A6
---------- -------------------- ---------- -------------------- ---------- --------------------
1 xifenfeicf xifenfeicf xff xifenfei xifenfei
1 惜分飞来向大家问好啦 杭州惜分飞 杭州惜分飞 杭州惜分飞 杭州惜分飞
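V. Cause analysis and recommendations
Across a full exp/imp cycle, up to three character set conversions can take place:
1. When exp runs, the data is converted from the database character set to the character set of the export client.
2. When imp runs, the contents of the dmp file are converted to the character set of the import client.
3. The data is then converted from the import client's character set to the database character set of the target database.
Garbled characters show up so often in exp/imp because, somewhere in this chain of conversions, characters are lost or cannot be mapped from one character set to the other. The best way to deal with this is to set NLS_LANG deliberately so that fewer conversions happen (when two adjacent steps use the same character set, no conversion occurs, as in the ZHS16GBK test above, where nothing was converted), or so that the conversions that do happen are between compatible character sets. Either approach keeps garbling to a minimum.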
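The chain of conversions described above can be simulated with ordinary codecs. This is a rough sketch under stated assumptions, not a model of Oracle's real conversion code: Python's "gbk", "utf-8" and "ascii" codecs stand in for ZHS16GBK, AL32UTF8 and US7ASCII, and the function names convert and run_chain are invented for this example.

```python
# Assumption: Python codecs approximate the Oracle character sets, and the
# "replace" error handler approximates Oracle's replacement characters.

def convert(data: bytes, src: str, dst: str) -> bytes:
    """Convert bytes from charset src to charset dst, replacing
    characters that cannot be mapped."""
    return data.decode(src, errors="replace").encode(dst, errors="replace")

def run_chain(text: str, charsets: list[str]) -> str:
    """Pass text through each charset in order and return what ends up
    stored in the last one."""
    data = text.encode(charsets[0])
    for src, dst in zip(charsets, charsets[1:]):
        if src != dst:  # identical neighbours: no conversion takes place
            data = convert(data, src, dst)
    return data.decode(charsets[-1], errors="replace")

text = "杭州惜分飞"
# Order: source database, export client, import client, target database
print(run_chain(text, ["gbk", "gbk", "utf-8", "gbk"]))  # 杭州惜分飞
print(run_chain(text, ["gbk", "gbk", "ascii", "gbk"]))  # ?????
print(run_chain(text, ["gbk", "gbk", "gbk", "gbk"]))    # 杭州惜分飞
```

The three calls correspond to the AL32UTF8, US7ASCII and ZHS16GBK tests: a detour through a character set that can represent every character is harmless, a detour through one that cannot is destructive, and a chain of identical character sets performs no conversion at all.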
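If you already have a dmp file produced by exp and the import yields garbled characters, the usual recommendation is to set the character set in NLS_LANG to match that of the dmp file, so that the only conversion happens between the import client and the database server (this requires that the two character sets can be converted into each other).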