Python Web Crawler Practice
2017-06-21 17:09:34 / Category: Python Examples
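# coding=utf-8
# Written for Python 2, where urlopen and urlretrieve live directly in urllib.
import re
import urllib

def getHtml(url):
    # Download the page and return its raw HTML.
    page = urllib.urlopen(url)
    html = page.read()
    return html

def findPic(html):
    # Collect the src value of every .png image.
    # [^"]+ keeps each match inside a single src attribute.
    myPattern = re.compile(r'src="([^"]+\.png)"')
    imgs = re.findall(myPattern, html)
    return imgs

html = getHtml("http://www.cnblogs.com/fnng/p/3576154.html")
imgs = findPic(html)
print(imgs)

# Save each image under a numbered filename: myfile0.png, myfile1.png, ...
x = 0
for img in imgs:
    urllib.urlretrieve(img, 'myfile%s.png' % x)
    x += 1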
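A note on the filename parameter of urllib.urlretrieve(url, filename, reporthook, data, context): if every call used the same filename, each download would overwrite the previous one, so the counter x is added to the name as a marker.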
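For readers on Python 3, the same idea can be sketched as follows. This is only an illustrative port, not the original author's code: the names get_html, find_pics, numbered_names, download_all and the base_url parameter are hypothetical, the page is assumed to be UTF-8, and relative image paths are resolved with urljoin, which the original script does not do.

```python
# Python 3 sketch of the script above (names and UTF-8 decoding are assumptions).
import re
import urllib.request
from urllib.parse import urljoin

# Matches the src value of .png images; [^"]+ stays inside one attribute.
PNG_PATTERN = re.compile(r'src="([^"]+\.png)"')


def get_html(url):
    """Download a page and return it as text (UTF-8 is assumed here)."""
    with urllib.request.urlopen(url) as page:
        return page.read().decode("utf-8", errors="replace")


def find_pics(html, base_url=""):
    """Return the .png URLs found in src attributes, resolved against base_url."""
    return [urljoin(base_url, src) for src in PNG_PATTERN.findall(html)]


def numbered_names(count, template="myfile%s.png"):
    """Build one distinct filename per image: myfile0.png, myfile1.png, ..."""
    return [template % x for x in range(count)]


def download_all(img_urls):
    """Save each image under its own numbered filename so nothing is overwritten."""
    names = numbered_names(len(img_urls))
    for img_url, name in zip(img_urls, names):
        urllib.request.urlretrieve(img_url, name)
    return names
```

A typical run would call get_html on the page URL, pass the result and the same URL to find_pics, and hand the resulting list to download_all. Building the filenames in a separate function keeps the numbering scheme easy to check without touching the network.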
Written with reference to 虫师's blog ^-^