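Oracle communication details
After I filed the bug report, Oracle got in touch with me. Part of the email exchange is excerpted here, for reference only.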
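Symptom:
After a system went live, the number of TCP connections in the CLOSE_WAIT state kept growing and triggered alerts.
Investigation:
At 2:30 I deployed the problematic tag to a single node in order to reproduce the issue.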
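Step 1:
Find out which pair of endpoints is producing the large number of CLOSE_WAIT connections. The system also calls third parties, so both ends of each connection have to be identified.
Command: netstat -np | grep tcp | grep "CLOSE_WAIT"
Result, in short: three remote IPs, all on port 334, turned out to be the trigger.
Step 2: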
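Who exactly are these three IPs, and which endpoint were they being called for?
Ops could not answer that for the moment, so the most direct lead went cold. I started looking at the problem from the side instead.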
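The grouping done by eye in Step 1 can be made repeatable. The following is a minimal sketch, assuming the usual Linux `netstat -np` column order (Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State, PID/Program). The class name `CloseWaitTally` and the sample lines are hypothetical; the addresses come from documentation ranges and are not the real ones from this incident.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class CloseWaitTally {
    /** Counts CLOSE_WAIT connections per remote "ip:port" from netstat output lines. */
    static Map<String, Integer> countByRemote(List<String> netstatLines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : netstatLines) {
            String[] cols = line.trim().split("\\s+");
            // Assumed layout: cols[4] is the foreign address, cols[5] is the state.
            if (cols.length >= 6 && cols[0].startsWith("tcp") && "CLOSE_WAIT".equals(cols[5])) {
                counts.merge(cols[4], 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up sample lines for illustration only.
        List<String> sample = Arrays.asList(
                "tcp        1      0 192.0.2.10:51000        203.0.113.7:443         CLOSE_WAIT  4321/java",
                "tcp        1      0 192.0.2.10:51002        203.0.113.7:443         CLOSE_WAIT  4321/java",
                "tcp        0      0 192.0.2.10:51004        203.0.113.9:443         ESTABLISHED 4321/java");
        countByRemote(sample).forEach((remote, n) -> System.out.println(n + " " + remote));
    }
}
```

Feeding it the real netstat output (for example line by line from standard input) would give one count per remote endpoint, which makes the worst offenders easy to spot.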
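Step 3, packet capture:
There was no way around it: I had to capture packets to get more clues. I had not touched networking in a long time, so this was hard going.
Clue found: a large number of RST packets. So what kind of operation leads to CLOSE_WAIT? And what kind of connection produces so many RSTs?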
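On the first question: a socket enters CLOSE_WAIT when the peer has sent its FIN and the local application has not yet called close(). The following is a minimal loopback sketch of that situation, not the production scenario. The class name `CloseWaitDemo` and the `holdMillis` parameter are hypothetical. While the method sleeps, running netstat on the same machine should show the local end of the connection in CLOSE_WAIT.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

class CloseWaitDemo {
    /**
     * Opens a loopback connection, lets the peer close first, then keeps the
     * local socket open for holdMillis before closing it.
     * Returns the result of read() on the local socket after the peer closed.
     */
    static int readAfterPeerClose(long holdMillis) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, loopback);
             Socket local = new Socket(loopback, server.getLocalPort());
             Socket peer = server.accept()) {
            peer.close();                                // the peer sends its FIN
            int result = local.getInputStream().read();  // end of stream once the FIN arrives
            Thread.sleep(holdMillis);                    // local socket sits in CLOSE_WAIT here
            return result;                               // try-with-resources closes local now
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Hold for 30 seconds so the state can be inspected with netstat.
        System.out.println("read() returned " + readAfterPeerClose(30_000));
    }
}
```

read() returning -1 is how the application learns that the peer has closed. If the code never reaches a close() after that point, the socket stays in CLOSE_WAIT for as long as it remains open.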
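Step 4:
With help from ops, I learned that these three IPs belong to the image CDN.
At this point the problem could be traced to code, because requests to the image CDN are easy to find in the source. After analyzing the business team's code, my guess was this: the server issues a URL request, the requested resource does not exist, an exception is thrown, and nothing in the JDK closes the Socket.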
    /**
     * Returns a BufferedImage as the result of decoding
     * a supplied URL with an ImageReader
     * chosen automatically from among those currently registered. An
     * InputStream is obtained from the URL,
     * which is wrapped in an ImageInputStream. If no
     * registered ImageReader claims to be able to read
     * the resulting stream, null is returned.
     *
     * The current cache settings from getUseCache and
     * getCacheDirectory will be used to control caching in the
     * ImageInputStream that is created.
     *
     * This method does not attempt to locate
     * ImageReaders that can read directly from a
     * URL; that may be accomplished using
     * IIORegistry and ImageReaderSpi.
     *
     * @param input a URL to read from.
     *
     * @return a BufferedImage containing the decoded
     * contents of the input, or null.
     *
     * @exception IllegalArgumentException if input is
     * null.
     * @exception IOException if an error occurs during reading.
     */
    public static BufferedImage read(URL input) throws IOException {
        if (input == null) {
            throw new IllegalArgumentException("input == null!");
        }
        InputStream istream = null;
        try {
            // The TCP connection is established here and the stream is fetched
            // right away. Because the resource does not exist, control enters
            // the catch block and the exception is rethrown.
            istream = input.openStream();
        } catch (IOException e) {
            throw new IIOException("Can't get input stream from URL!", e);
        }
        ImageInputStream stream = createImageInputStream(istream);
        BufferedImage bi;
        try {
            bi = read(stream);
            if (bi == null) {
                stream.close();
            }
        } finally {
            istream.close();
        }
        return bi;
    }
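As the code shows, the JDK does not close the Socket connection wrapped inside ImageIO.read(url) on this path. Does the CDN then time out and close its side of the connection, leaving the server in CLOSE_WAIT? My networking experience is limited, so I could not be 100% sure of this idea. So let's simulate it.
Step 5, reproduction and simulation: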
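    import java.awt.image.BufferedImage;
    import java.io.File;
    import java.net.URL;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import javax.imageio.ImageIO;

    // The class name is a placeholder; the original post shows only the two methods.
    public class CloseWaitRepro {

        public static void main(String[] args) throws InterruptedException {
            // Note: the pool is never shut down, so the JVM stays alive after the
            // tasks finish and the leaked sockets remain observable.
            ExecutorService ex = Executors.newFixedThreadPool(100);
            for (int i = 0; i < 5000; i++) {
                ex.execute(task());
            }
        }

        // The code below is taken from the business source.
        private static Runnable task() {
            return new Runnable() {
                @Override
                public void run() {
                    // The domain must exist, but the file does not.
                    // The original URL is masked for company policy reasons. I picked
                    // an image from the web and scrambled the last part of its URL.
                    String xxxUrl = "https://youer.chazidian.com/images/Articles/month_1512/201512fffff031343221006.jpg";
                    File file = null;
                    BufferedImage image = null;
                    try {
                        file = File.createTempFile("abc", "jpg");
                        URL url1 = new URL(xxxUrl);
                        image = ImageIO.read(url1);
                    } catch (Throwable e) {
                        e.printStackTrace();
                    } finally {
                        if (null != file) {
                            file.delete();
                        }
                        if (null != image) {
                            image.flush();
                            image = null;
                        }
                    }
                }
            };
        }
    }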
Packet capture after running it:
TCP connection states:
Problem reproduced!
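The post does not describe a fix, so what follows is only one possible mitigation, written as a sketch under my own assumptions. Instead of handing the URL to ImageIO.read(URL), the caller opens the HttpURLConnection itself, sets timeouts, checks the status code, and calls disconnect() in a finally block so the socket is released on the error path too. The class name `SafeImageFetch`, both method names, and the timeout values are hypothetical. The throwaway local server that always answers 404 stands in for "the domain exists, but the file does not".

```java
import com.sun.net.httpserver.HttpServer;
import java.awt.image.BufferedImage;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URL;
import javax.imageio.ImageIO;

class SafeImageFetch {
    /** Reads an image over HTTP; returns null when the response is not 200. */
    static BufferedImage readImageOrNull(URL url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setConnectTimeout(3000); // assumed value, tune for the real CDN
        conn.setReadTimeout(5000);    // assumed value
        try {
            if (conn.getResponseCode() != HttpURLConnection.HTTP_OK) {
                return null; // e.g. 404: nothing to decode
            }
            try (InputStream in = conn.getInputStream()) {
                return ImageIO.read(in);
            }
        } finally {
            conn.disconnect(); // release the connection on success and on error
        }
    }

    /** Fetches a missing file from a local 404-only server and reports whether null came back. */
    static boolean missingFileYieldsNull() {
        try {
            HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
            server.createContext("/", exchange -> {
                exchange.sendResponseHeaders(404, -1); // -1 means no response body
                exchange.close();
            });
            server.start();
            try {
                int port = server.getAddress().getPort();
                URL url = new URL("http://127.0.0.1:" + port + "/missing.jpg");
                return readImageOrNull(url) == null;
            } finally {
                server.stop(0);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("missing file yields null: " + missingFileYieldsNull());
    }
}
```

Whether this removes the CLOSE_WAIT build-up in the real system would still need to be confirmed with netstat after deployment; the sketch only shows where the explicit close would go.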
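Open questions and shortcomings
I do not know the TCP state machine thoroughly enough. As a result, many of these questions could not be derived from the state machine itself, because I was not sure of my own reasoning. There is also a lot of unreliable information online.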
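For reference, the part of the TCP state machine that matters here is the passive close: ESTABLISHED, then CLOSE_WAIT once the peer's FIN arrives, then LAST_ACK once the application calls close(), then CLOSED once that FIN is acknowledged. The following sketch models only that path. The class, enum, and event names are my own, and every transition outside the path is rejected rather than modeled.

```java
class PassiveCloseStates {
    enum State { ESTABLISHED, CLOSE_WAIT, LAST_ACK, CLOSED }

    enum Event { RECV_FIN, APP_CLOSE, RECV_ACK_OF_FIN }

    /** Next state on the passive-close path; anything else is outside this sketch. */
    static State next(State state, Event event) {
        if (state == State.ESTABLISHED && event == Event.RECV_FIN) {
            return State.CLOSE_WAIT; // peer closed first, we ACK its FIN
        }
        if (state == State.CLOSE_WAIT && event == Event.APP_CLOSE) {
            return State.LAST_ACK;   // application called close(), our FIN goes out
        }
        if (state == State.LAST_ACK && event == Event.RECV_ACK_OF_FIN) {
            return State.CLOSED;     // peer acknowledged our FIN
        }
        throw new IllegalStateException("not modeled: " + state + " + " + event);
    }

    /** Applies the events in order, starting from ESTABLISHED. */
    static State run(Event... events) {
        State state = State.ESTABLISHED;
        for (Event event : events) {
            state = next(state, event);
        }
        return state;
    }

    public static void main(String[] args) {
        // The application never closes: the connection is stuck in CLOSE_WAIT.
        System.out.println(run(Event.RECV_FIN));
        // The application closes: the connection can finish.
        System.out.println(run(Event.RECV_FIN, Event.APP_CLOSE, Event.RECV_ACK_OF_FIN));
    }
}
```

The only way out of CLOSE_WAIT in this path is the APP_CLOSE event, which is why a socket that the application never closes stays in that state.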
Reposted from: https://lemon.blog.csdn.net/article/details/85626075