Getting the real link behind 360 and Sogou search results that "redirect" with HTTP 200
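Reference: https://www.52pojie.cn/thread-769952-1-1.html
Note: a 360 snapshot (cached page) only opens correctly when the request carries a Referer header.
A Sogou snapshot link can be opened directly.
Solutions:
1. Read the real link straight out of the snapshot URL
=============================Below: the real links I pulled out of the snapshot URLs with a regex==================================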
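The Referer requirement for 360 snapshots could be handled roughly like this. It is a minimal sketch using only the standard library: the helper name `build_snapshot_request` is hypothetical, and the default Referer value `https://www.so.com/` is an assumption, since the note only says a Referer must be present, not which value 360 expects.

```python
import urllib.request

def build_snapshot_request(snapshot_url, referer="https://www.so.com/"):
    """Build a GET request for a 360 snapshot URL that carries a Referer header.

    The default referer is an assumed value, not something 360 documents.
    """
    return urllib.request.Request(snapshot_url, headers={"Referer": referer})

req = build_snapshot_request(
    "http://c.360webcache.com/c?m=dc140fe3b217aa80ed232d5fed293a16"
    "&q=%E5%86%92%E9%99%A9%E7%8E%8B"
    "&u=http%3A%2F%2Fwww.7k7k.com%2Fflash%2F98413.htm"
)
# To actually download the snapshot (needs network access):
# html = urllib.request.urlopen(req, timeout=10).read()
```

With `requests`, the equivalent would be passing `headers={"Referer": ...}` to `requests.get`.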
http://snapshot.sogoucdn.com/websnapshot?ie=utf8&url=http%3A%2F%2Fwww.51hei.com%2Fmcu%2F4342.html&did=faab698d886364e8-2d63d16c82f515a2-a9709b9de8ee16fa3a8d1afebbe07555&k=ad067682fe8f51dd446e49ed6e911ca4&encodedQuery=ascii%E7%A0%81%E5%AF%B9%E7%85%A7%E8%A1%A8&query=ascii%E7%A0%81%E5%AF%B9%E7%85%A7%E8%A1%A8&&p=40040108&dp=1&cid=&w=01020400&m=0&st=0
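======Sogou: decoding with re=====
import re
from urllib.parse import unquote
# Locate the percent-encoded target URL in the snapshot link (held in `url`)
selective_kz = re.findall(r'url=(http.*?htm[a-z]?)', url)
# Decode it into the real URL; unquote handles every %XX escape, not only %3A and %2F
real_url = unquote(selective_kz[0])
========================Above is Sogou, below is 360=======(snapshot)=================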
http://c.360webcache.com/c?m=dc140fe3b217aa80ed232d5fed293a16&q=%E5%86%92%E9%99%A9%E7%8E%8B&u=http%3A%2F%2Fwww.7k7k.com%2Fflash%2F98413.htm
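======360: decoding with re=====
import re
from urllib.parse import unquote
# Locate the percent-encoded target URL in the snapshot link (held in `url`)
selective_kz = re.findall(r'u=(http.*?htm[a-z]?)', url)
# Decode it into the real URL; unquote handles every %XX escape, not only %3A and %2F
real_url = unquote(selective_kz[0])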
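The regex approach only matches targets ending in `.htm` or `.html`. As an alternative sketch (not the method used above), the query string can be parsed instead, which should work for any target path. The function name `real_url_from_snapshot` is hypothetical, and the Sogou link below is trimmed to its first parameters for readability.

```python
from urllib.parse import urlsplit, parse_qs

def real_url_from_snapshot(snapshot_url):
    """Return the target URL stored in a Sogou (`url=`) or 360 (`u=`) snapshot link.

    Returns None when neither parameter is present.
    """
    # parse_qs splits the query string and percent-decodes every value
    params = parse_qs(urlsplit(snapshot_url).query)
    for key in ("url", "u"):
        if key in params:
            return params[key][0]
    return None

# Sogou snapshot link, trimmed after the `url` parameter
sogou = ("http://snapshot.sogoucdn.com/websnapshot?ie=utf8"
         "&url=http%3A%2F%2Fwww.51hei.com%2Fmcu%2F4342.html&dp=1")
# 360 snapshot link from the example above
so360 = ("http://c.360webcache.com/c?m=dc140fe3b217aa80ed232d5fed293a16"
         "&q=%E5%86%92%E9%99%A9%E7%8E%8B"
         "&u=http%3A%2F%2Fwww.7k7k.com%2Fflash%2F98413.htm")

print(real_url_from_snapshot(sogou))   # http://www.51hei.com/mcu/4342.html
print(real_url_from_snapshot(so360))   # http://www.7k7k.com/flash/98413.htm
```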
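2. Read it from the response text of a direct request. The search-result link answers with HTTP 200 rather than a 3xx status, so there is no redirect for requests to follow; the real URL sits inside the returned HTML (a window.location.replace call, with a meta refresh as the noscript fallback).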
In [17]: response=requests.get("http://www.so.com/link?m=aExMT69UVjlZUNIsickSvIdbGS7fbK4rmHk9cBpRqOTQWcJJthjqmIhM816MSXVXQ%2FaIlWCK5alPOFErPrMnNi%2FnoxVkBw8b%2BsYEXQEAFzoBLC2vRTUzYk9uP7aWdgBshdyXG0Y%2BHNl5g2Xgn3rPaJJtA2YITsBj78rVEWFoljhBZEImAue3SlA%3D%3D")
In [18]: response.status_code
Out[18]: 200
In [19]: response.text
Out[19]: ' <meta content="always" name="referrer">\n <script>window.location.replace("http://su.ganji.com/fuwu_dian/2551035025x/guoneiwuliu/")</script>\n <noscript>\n <meta http-equiv="refresh" content="0;URL=\'http://su.ganji.com/fuwu_dian/2551035025x/guoneiwuliu/\'">\n </noscript>\n \n'
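The target can then be pulled out of that response body with a regex. The following is a minimal sketch that assumes the body keeps the shape shown above; the names `JS_REDIRECT`, `META_REFRESH` and `real_url_from_body` are hypothetical, and 360 may change the markup at any time.

```python
import re

# Patterns for the two redirect mechanisms visible in the response body
JS_REDIRECT = re.compile(r'window\.location\.replace\("([^"]+)"\)')
META_REFRESH = re.compile(
    r'''http-equiv="refresh"\s+content="\d+;\s*URL='([^']+)'"''',
    re.IGNORECASE,
)

def real_url_from_body(html):
    """Return the redirect target found in the HTML, or None if no pattern matches."""
    # Try the JavaScript redirect first, then the <noscript> meta refresh
    for pattern in (JS_REDIRECT, META_REFRESH):
        match = pattern.search(html)
        if match:
            return match.group(1)
    return None

# Response body copied from the session above
body = (
    ' <meta content="always" name="referrer">\n'
    ' <script>window.location.replace("http://su.ganji.com/fuwu_dian/2551035025x/guoneiwuliu/")</script>\n'
    ' <noscript>\n'
    ' <meta http-equiv="refresh" content="0;URL=\'http://su.ganji.com/fuwu_dian/2551035025x/guoneiwuliu/\'">\n'
    ' </noscript>\n'
)
print(real_url_from_body(body))  # http://su.ganji.com/fuwu_dian/2551035025x/guoneiwuliu/
```

In a live session this would be called as `real_url_from_body(response.text)`.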