Scraping Wallpapers with Python

Xherlock

October 4, 2021

python web scraping

# **Scraping Wallpapers with Python**

**Using the 3G wallpaper site as an example, this post shows how to scrape and download wallpapers with bs4 (BeautifulSoup). Site:** [**3gbizhi desktop wallpapers (3gbizhi.com)**](https://desk.3gbizhi.com/)

**First, look at the images we want to scrape**

![image-20211004233610791.png](http://120.78.215.15/usr/uploads/2021/10/2928615414.png)

**Press Ctrl+U to view the page source and locate the relevant block of markup**

![image-20211004233816305.png](http://120.78.215.15/usr/uploads/2021/10/3891708269.png)

**Select the div whose class is contlistw, then find all the a tags inside it**

```
main_page = BeautifulSoup(resp.text, "html.parser")
alist = main_page.find("div", class_="contlistw").find_all("a")
```
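
To see what `find` and `find_all` return without touching the network, here is a minimal sketch on a made-up HTML fragment. The markup and the `/example/...` paths are invented for illustration; only the `contlistw` class name comes from the page.

```python
from bs4 import BeautifulSoup

# Hypothetical markup that imitates the structure described above.
SAMPLE_HTML = """
<div class="contlistw">
  <ul>
    <li><a href="/example/1.html">first</a></li>
    <li><a href="/example/2.html">second</a></li>
  </ul>
</div>
<div class="other"><a href="/example/ignored.html">ignored</a></div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
# find() returns the first matching div; find_all() searches all of its descendants.
container = soup.find("div", class_="contlistw")
links = container.find_all("a")
hrefs = [a.get("href") for a in links]
print(hrefs)
```

Only the links nested inside the matched div are collected; the link in the second div is left out.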

**Loop over the a tags and read each sub-page path from its href attribute**

```
for a in alist:
    href = a.get('href')
    child_page_resp = requests.get(href)
    child_page_resp.encoding = 'utf-8'
    child_page_text = child_page_resp.text
```
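
The loop passes `href` straight to `requests.get`, which works only when the link is an absolute URL. Whether the site uses absolute or relative links is not stated here, so as a defensive sketch, `urllib.parse.urljoin` covers both cases. The helper name `absolute_url` and the sample paths are made up for illustration.

```python
from urllib.parse import urljoin

BASE_URL = "https://desk.3gbizhi.com/"

def absolute_url(href, base=BASE_URL):
    """Return an absolute URL for href, or None when the tag has no href."""
    if not href:
        return None
    return urljoin(base, href)

# Hypothetical href values, for illustration only.
print(absolute_url("/example/1.html"))
print(absolute_url("https://desk.3gbizhi.com/example/2.html"))
print(absolute_url(None))
```

An already absolute URL passes through unchanged, so the helper can wrap every `href` before the request is made.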

**Taking the first sub-page as an example, locate the wallpaper**

![image-20211004234209615.png](http://120.78.215.15/usr/uploads/2021/10/208120348.png)

**Select the div whose class is showcontw mtw, then find the img tag inside it**

```
# Still inside the for loop: parse the sub-page and locate the wallpaper container.
child_page = BeautifulSoup(child_page_text, "html.parser")
showing = child_page.find("div", class_="showcontw mtw")
```
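
One detail worth knowing: when `class_` is given a string containing a space, BeautifulSoup compares it with the exact value of the class attribute, so the order of the classes matters. A CSS selector is an alternative that matches both classes in any order. The sketch below uses invented markup to show the difference; it is an illustration, not a claim about how the site writes its markup.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the same two classes, written in both orders.
SAMPLE_HTML = """
<div class="showcontw mtw"><img src="/example/a.jpg"></div>
<div class="mtw showcontw"><img src="/example/b.jpg"></div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# A class_ string with a space is compared against the exact attribute value.
exact = soup.find_all("div", class_="showcontw mtw")
# A CSS selector matches any div carrying both classes, in any order.
both = soup.select("div.showcontw.mtw")

print(len(exact), len(both))
```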

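**Next, catch the read errors. Further down the home page there are other parts with the same contlistw attribute, so some of the collected links lead to pages where the wallpaper cannot be read; there `showing` (or `img`) is None and the lookup raises AttributeError, so those links are skipped.**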

```
try:
    img = showing.find("img")
    src = img.get("src")
except AttributeError as e:
    continue
```
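
The same skip logic can also be written with explicit `None` checks instead of catching `AttributeError`. This is a sketch of an alternative, not the approach used in this post; the function name `extract_img_src` and the sample pages are made up.

```python
from bs4 import BeautifulSoup

def extract_img_src(html):
    """Return the src of the first img inside the wallpaper div, or None."""
    page = BeautifulSoup(html, "html.parser")
    container = page.find("div", class_="showcontw mtw")
    if container is None:      # page has no wallpaper container
        return None
    img = container.find("img")
    if img is None:            # container exists but holds no image
        return None
    return img.get("src")

# Hypothetical pages, for illustration only.
with_image = '<div class="showcontw mtw"><img src="/example/a.jpg"></div>'
without_image = '<div class="something-else"></div>'
print(extract_img_src(with_image))
print(extract_img_src(without_image))
```

Inside the loop, a `None` result would take the place of the `continue` in the `except` branch.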

**Finally, request the image URL and write the bytes to a file**

```
# Still inside the for loop: download the image and save it as image/<i>.jpg.
img_resp = requests.get(src)
img_name = str(i) + ".jpg"
with open("image/" + img_name, mode="wb") as f:
    f.write(img_resp.content)
```

**Each file is named and saved in the image folder under the current directory**
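
Opening a file in `wb` mode fails with `FileNotFoundError` if the target folder does not exist. A small helper can create the folder on demand. This is a minimal sketch: the name `save_image` is hypothetical, and placeholder bytes written to a temporary directory stand in for `img_resp.content` and the real `image` folder.

```python
import tempfile
from pathlib import Path

def save_image(content, index, directory="image"):
    """Write raw bytes to <directory>/<index>.jpg and return the path."""
    folder = Path(directory)
    folder.mkdir(parents=True, exist_ok=True)  # create the folder if missing
    target = folder / f"{index}.jpg"
    target.write_bytes(content)
    return target

# Demo in a temporary directory; the bytes are a placeholder, not a real image.
demo_dir = Path(tempfile.mkdtemp()) / "image"
saved = save_image(b"not a real image", 1, directory=demo_dir)
print(saved.name, saved.stat().st_size)
```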

**The complete code:**

```
import os
import time

import requests
from bs4 import BeautifulSoup

url = "https://desk.3gbizhi.com/"
os.makedirs("image", exist_ok=True)  # make sure the output folder exists

# Fetch the home page and collect every link inside the wallpaper list.
resp = requests.get(url)
resp.encoding = 'utf-8'
main_page = BeautifulSoup(resp.text, "html.parser")
alist = main_page.find("div", class_="contlistw").find_all("a")
i = 1
for a in alist:
    # Open the sub-page that the link points to.
    href = a.get('href')
    child_page_resp = requests.get(href)
    child_page_resp.encoding = 'utf-8'
    child_page_text = child_page_resp.text
    child_page = BeautifulSoup(child_page_text, "html.parser")
    showing = child_page.find("div", class_="showcontw mtw")
    try:
        img = showing.find("img")
        src = img.get("src")
    except AttributeError:
        # No wallpaper container or image on this page; skip it.
        continue
    # Download the image and write its bytes to image/<i>.jpg.
    img_resp = requests.get(src)
    img_name = str(i) + ".jpg"
    with open("image/" + img_name, mode="wb") as f:
        f.write(img_resp.content)
    print("Over!!!", img_name)
    time.sleep(1)  # pause between downloads to go easy on the server
    i = i + 1
print("All over!")
```

**Result:**

**The 20 wallpapers on the site's home page**

![image-20211004234646833.png](http://120.78.215.15/usr/uploads/2021/10/2334245188.png)

**The wallpapers I scraped: exactly 20 as well**

![image-20211004234735466.png](http://120.78.215.15/usr/uploads/2021/10/2870553344.png)

Last modified: December 29, 2022
