A Summary of Mojibake (Garbled Text) - 白红宇's Personal Blog

A Summary of Mojibake (Garbled Text)

Published: 2021-09-23 17:40:47 Category: Technical articles


I. A brief look at the difference between pageEncoding and contentType (taken from the web)

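pageEncoding: sets the character encoding of the JSP source file and of the response body.

contentType: sets the MIME type together with the character encoding of the JSP source file and of the response body.

If the pageEncoding attribute is present, it determines the character encoding of the JSP page;

otherwise the charset in the contentType attribute determines it, and if no charset is given either, the JSP page falls back to the default, ISO-8859-1.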
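The fallback order above can be written down as a small rule. This is only a sketch of the rule as stated here, not the JSP container's code; the class and method names (JspEncodingResolver, charsetOf, resolve) are made up for illustration.

```java
import java.util.Locale;

public class JspEncodingResolver {
    // Default named in the text above when neither attribute supplies an encoding.
    static final String DEFAULT_ENCODING = "ISO-8859-1";

    /** Returns the charset part of a value such as "text/html;charset=gb2312", or null if absent. */
    static String charsetOf(String contentType) {
        if (contentType == null) {
            return null;
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String value = p.substring("charset=".length()).trim();
                return value.isEmpty() ? null : value;
            }
        }
        return null;
    }

    /** pageEncoding wins; otherwise the contentType charset; otherwise the default. */
    static String resolve(String pageEncoding, String contentType) {
        if (pageEncoding != null && !pageEncoding.isEmpty()) {
            return pageEncoding;
        }
        String charset = charsetOf(contentType);
        return charset != null ? charset : DEFAULT_ENCODING;
    }

    public static void main(String[] args) {
        System.out.println(resolve("UTF-8", "text/html;charset=gb2312"));
        System.out.println(resolve(null, "text/html;charset=gb2312"));
        System.out.println(resolve(null, "text/html"));
    }
}
```

The three calls in main cover the three branches: pageEncoding present, only a contentType charset, and neither.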

II. Test walkthrough

POST requests

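1. Conditions: the charset in the HTML or JSP page is text/html;charset=gb2312 (that is, the encoding the browser uses for the page),

no filter,

and nothing set in server.xml.

With request.getParameter, for example, "汪" comes back as Íô, which is exactly what new String("汪".getBytes("gb2312"),"iso-8859-1") produces.

Conclusion: the browser sends "汪" as GB2312 bytes, and the server decodes those bytes as ISO-8859-1, which gives Íô.

The value therefore has to be converted back: new String("Íô".getBytes("iso-8859-1"),"gb2312");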
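The round trip in case 1 can be reproduced without a server. A minimal sketch, assuming the JDK in use ships the GB2312 charset (standard JDK builds do); the class and method names are made up for illustration, and the characters are written as Unicode escapes so the result does not depend on the source file encoding.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class PostMojibakeDemo {
    static final Charset GB2312 = Charset.forName("GB2312");

    /** What the server sees: GB2312 bytes decoded as ISO-8859-1. */
    static String misread(String original) {
        return new String(original.getBytes(GB2312), StandardCharsets.ISO_8859_1);
    }

    /** The manual fix: recover the raw bytes, then decode them as GB2312. */
    static String repair(String garbled) {
        return new String(garbled.getBytes(StandardCharsets.ISO_8859_1), GB2312);
    }

    public static void main(String[] args) {
        String wang = "\u6C6A";              // the character 汪
        String garbled = misread(wang);      // "\u00CD\u00F4", displayed as Íô
        System.out.println(garbled);
        System.out.println(repair(garbled).equals(wang)); // true
    }
}
```

The repair works because ISO-8859-1 maps every byte to exactly one character, so no bytes are lost on the way.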

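2. Conditions: as in 1, with the following added to server.xml:

uRIEncoding="gbk" useBodyEncodingForURI="true" (the Tomcat documentation spells the first attribute URIEncoding). It has no effect.

Changing the values several times did not alter the result from 1.

Conclusion: these settings evidently do not apply to POST requests.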

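3. Conditions: as in 1, with a filter added that calls request.setCharacterEncoding("gb2312");

request.getParameter("name") then returns the correct value directly.

Conclusion: request.setCharacterEncoding must be called before request.getParameter("name"). See the English-language API documentation for details.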
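What the filter changes can be imitated with java.net.URLDecoder: the form body carries percent-encoded GB2312 bytes, and the charset chosen for decoding decides what getParameter would return. This is a simulation under that assumption, not the servlet container's actual code; BodyDecodingDemo and decodeValue are hypothetical names.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class BodyDecodingDemo {
    /** Decodes one percent-encoded form value with the named charset. */
    static String decodeValue(String encoded, String charsetName) {
        try {
            return URLDecoder.decode(encoded, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        String body = "%CD%F4"; // the character 汪 as sent by a gb2312 page

        // No encoding set on the request: bytes are read as ISO-8859-1 (case 1).
        System.out.println(decodeValue(body, "ISO-8859-1"));

        // Encoding set to gb2312 before parameters are parsed (case 3).
        System.out.println(decodeValue(body, "GB2312").equals("\u6C6A")); // true
    }
}
```

The charset has to be known at the moment the body is decoded, which is consistent with the conclusion that setCharacterEncoding must come before the first getParameter call.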

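4. Conditions: as in 3, but with the call changed to request.setCharacterEncoding("utf-8");

The character "汪" is then no longer parsed correctly, because it has gone through gb2312 ---> iso-8859-1 ---> utf-8.

Reversing these steps does not work. The GB2312 bytes are not valid UTF-8, so decoding replaces them and the original bytes are lost: new String(new String(name.getBytes("utf-8"),"iso-8859-1").getBytes("iso-8859-1"),"gb2312") yields 锟斤拷.

Conclusion: do not configure two or more different encodings; doing so complicates the encoding chain and can make the damage irreversible.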
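Where 锟斤拷 comes from can be shown directly. A minimal sketch, using GBK (a superset of GB2312) and assuming the JDK ships that charset; IrreversibleDemo and its methods are made-up names.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class IrreversibleDemo {
    static final Charset GBK = Charset.forName("GBK");

    /** GBK bytes decoded as UTF-8: invalid sequences become U+FFFD. */
    static String lossyDecode(String original) {
        return new String(original.getBytes(GBK), StandardCharsets.UTF_8);
    }

    /** The attempted reversal: re-encode as UTF-8, then decode as GBK. */
    static String attemptRecovery(String damaged) {
        return new String(damaged.getBytes(StandardCharsets.UTF_8), GBK);
    }

    public static void main(String[] args) {
        String wang = "\u6C6A"; // the character 汪
        String damaged = lossyDecode(wang);

        // The replacement character U+FFFD carries no trace of the original bytes.
        System.out.println(damaged.indexOf('\uFFFD') >= 0);          // true
        System.out.println(attemptRecovery(damaged).equals(wang));   // false

        // Two replacement characters are EF BF BD EF BF BD in UTF-8,
        // which GBK reads as the three characters 锟斤拷.
        System.out.println(attemptRecovery("\uFFFD\uFFFD"));
    }
}
```

Unlike the ISO-8859-1 detour in case 1, the UTF-8 decoder discards bytes it cannot interpret, so no later conversion can bring them back.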

GET requests:

1. Conditions: the JSP has pageEncoding="UTF-8" and charset=UTF-8, and the filter calls request.setCharacterEncoding("utf-8");

<a href="<%=request.getContextPath() %>/login.do?method=validate&username=汪&password=121322">Test</a>

Analysis: 汪 ---- utf-8 ---- iso-8859-1; request.setCharacterEncoding("utf-8") has no effect here.

So only String name=new String(request.getParameter("username").getBytes("iso-8859-1"),"utf-8"); displays the correct Chinese text.

2. Typing the link by hand into the browser address bar

In this case: 汪 --- GBK --- ISO-8859-1

So String name=new String(request.getParameter("username").getBytes("iso-8859-1"),"GBK"); gives the correct Chinese text.
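The difference between cases 1 and 2 comes down to which bytes end up in the query string. A small sketch with java.net.URLEncoder shows the two byte sequences for the same character; it illustrates the encodings only and makes no claim about what any particular browser does. QueryEncodingDemo and percentEncode are hypothetical names, and the GBK charset is assumed to be available in the JDK.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class QueryEncodingDemo {
    /** Percent-encodes a value using the bytes of the named charset. */
    static String percentEncode(String value, String charsetName) {
        try {
            return URLEncoder.encode(value, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        String wang = "\u6C6A"; // the character 汪

        // Link inside a UTF-8 page (case 1): three bytes.
        System.out.println(percentEncode(wang, "UTF-8")); // %E6%B1%AA

        // Typed into the address bar on a GBK system (case 2): two bytes.
        System.out.println(percentEncode(wang, "GBK"));   // %CD%F4
    }
}
```

Because the server cannot tell from the bytes alone which charset produced them, the same getParameter call needs a different manual conversion in each case.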

3. Building on 1:

3.1) Filter removed

3.1.1) Only uRIEncoding="UTF-8"

"汪" ------ utf-8 ---- iso-8859-1 --- via Tomcat server.xml --- utf-8

So request.getParameter("username") returns the correct Chinese text.

3.1.2) Only useBodyEncodingForURI="true"

Same as 3.1.3.

3.1.3) server.xml has both uRIEncoding="UTF-8" and useBodyEncodingForURI="true"

汪 ---- utf-8 ---- iso-8859-1

So String name=new String(request.getParameter("username").getBytes("iso-8859-1"),"utf-8"); gives the correct Chinese text.

3.2) Filter added, calling request.setCharacterEncoding("utf-8")

3.2.1) Only uRIEncoding="UTF-8"

"汪" ------ utf-8 ---- iso-8859-1 --- via Tomcat server.xml --- utf-8; note that the filter has no effect here.

So request.getParameter("username") returns the correct Chinese text.

3.2.2) Only useBodyEncodingForURI="true"

"汪" ------ utf-8 ---- iso-8859-1 --- via the filter --- utf-8

So request.getParameter("username") returns the correct Chinese text, the same result as 3.1.1.

3.2.3) server.xml has both uRIEncoding="UTF-8" and useBodyEncodingForURI="true"

Same as above.

4. Building on 2:

4.1) Filter removed

4.1.1) Only uRIEncoding="UTF-8"

汪 ------ GBK ----- ISO-8859-1 --- via Tomcat server.xml --- utf-8

request.getParameter("username"); returns garbled text. Reversing the steps does not help, because the conversion is not reversible:

new String(new String(request.getParameter("username").getBytes("utf-8"),"iso-8859-1").getBytes("iso-8859-1"),"gbk") ---- 锟斤拷

4.1.2) Only useBodyEncodingForURI="true"

Same as 4.1.3.

4.1.3) server.xml has both uRIEncoding="UTF-8" and useBodyEncodingForURI="true"

汪 ------ GBK ----- ISO-8859-1

So name=new String(request.getParameter("username").getBytes("iso-8859-1"),"GBK");

4.2) Filter added, calling request.setCharacterEncoding("utf-8");

4.2.1) Only uRIEncoding="UTF-8"

汪 ------ GBK ----- ISO-8859-1 --- via Tomcat server.xml --- utf-8

request.getParameter("username"); returns garbled text. Reversing the steps does not help, because the conversion is not reversible:

new String(new String(request.getParameter("username").getBytes("utf-8"),"iso-8859-1").getBytes("iso-8859-1"),"gbk") ---- 锟斤拷

4.2.2) Only useBodyEncodingForURI="true"

Same as 4.1.1.

4.2.3) server.xml has both uRIEncoding="UTF-8" and useBodyEncodingForURI="true"

Same as above.

5. With uRIEncoding="UTF-8" useBodyEncodingForURI="true" in server.xml, with or without the filter, changing the uRIEncoding value several times

has no effect on the result: useBodyEncodingForURI takes precedence over uRIEncoding.

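Summary:

1) request.setCharacterEncoding only affects POST requests, plus GET requests when useBodyEncodingForURI="true" is configured in server.xml.

2) The server.xml settings uRIEncoding="UTF-8" useBodyEncodingForURI="true" only affect GET requests.

3) A link typed by hand into the address bar is encoded in the system encoding: gbk ---- iso-8859-1.

4) request.setCharacterEncoding must be called before request.getParameter.

5) When server.xml contains only uRIEncoding="UTF-8", the URI of a GET request is decoded as utf-8.

6) useBodyEncodingForURI="true" takes precedence over uRIEncoding="UTF-8": once the former is set, the latter has no effect.

The recommendation is therefore to keep pageEncoding and contentType in the JSP, uRIEncoding="UTF-8" useBodyEncodingForURI="true" in server.xml, and request.setCharacterEncoding in the filter all set to the same encoding.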
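The summary can be condensed into a small decision rule. This is a model of the behaviour observed in the tests above, not Tomcat's implementation, and the ISO-8859-1 default reflects the Tomcat setup tested here; QueryCharsetRule, queryCharset and bodyCharset are hypothetical names.

```java
public class QueryCharsetRule {
    // Default observed in the tests when nothing else applies.
    static final String CONTAINER_DEFAULT = "ISO-8859-1";

    /**
     * Charset used for GET query parameters.
     * uriEncoding: the uRIEncoding value from server.xml, or null if unset.
     * requestEncoding: the value passed to request.setCharacterEncoding, or null if never called.
     */
    static String queryCharset(String uriEncoding, boolean useBodyEncodingForURI, String requestEncoding) {
        if (useBodyEncodingForURI) {
            // Summary item 6: uRIEncoding is ignored once this flag is set.
            return requestEncoding != null ? requestEncoding : CONTAINER_DEFAULT;
        }
        return uriEncoding != null ? uriEncoding : CONTAINER_DEFAULT;
    }

    /** Charset used for POST body parameters: only the request encoding matters. */
    static String bodyCharset(String requestEncoding) {
        return requestEncoding != null ? requestEncoding : CONTAINER_DEFAULT;
    }

    public static void main(String[] args) {
        System.out.println(queryCharset("UTF-8", false, null));    // case 3.1.1
        System.out.println(queryCharset("UTF-8", true, null));     // case 3.1.3
        System.out.println(queryCharset(null, true, "utf-8"));     // case 3.2.2
        System.out.println(queryCharset(null, false, "utf-8"));    // GET case 1
        System.out.println(bodyCharset("gb2312"));                 // POST case 3
    }
}
```

Setting every input to the same value makes both functions return that value, which is the point of the recommendation to keep all the settings consistent.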

Reposted from: https://blog.csdn.net/awj321000/article/details/51495759
