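Movie calendar recommendations with Python
Today I came across 此刻 电影日历 (cikeee.com), a site that recommends one good film every day. It is clean and well made, so I wanted its recommendations pushed to me by email.
Technologies used
- Python scraping: the site serves its content as static HTML, so requests + XPath is enough to extract the fields needed
- Sending email from Python: see my earlier post on email notifications after a successful health check-in, "python实现自动健康打卡 - Xherlock"
- Sending HTML text over SMTP
In the end I want the email to look like the site's own movie detail page, for example the page for Hachi: A Dog's Tale (忠犬八公的故事) on 此刻电影 (cikeee.com)
Code walkthrough
Libraries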
import requests
import base64
import smtplib
from lxml import etree
from email.mime.text import MIMEText
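Scraper
The Past Recommendations page of 此刻 电影日历 (cikeee.com) lists earlier picks; the links to the movies of the most recent seven days are taken from the top of that page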
url = 'https://www.cikeee.com/wangri'
resp = requests.get(url).text
html = etree.HTML(resp)
paths = html.xpath('//a[@class="mov-img-a"]/@href')
Next, a for loop requests each URL in turn and extracts the movie's details
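The loop setup can be sketched as follows. This is a minimal sketch, not the code from the original post: the helper names `movie_urls` and `first` and the sample hrefs are hypothetical, and the limit of seven mirrors the "most recent seven days" mentioned above. `first` guards against an XPath query that matches nothing, which would otherwise raise an IndexError on `[0]`.

```python
from itertools import islice
from urllib.parse import urljoin

BASE_URL = "https://www.cikeee.com"


def movie_urls(paths, limit=7):
    """Build absolute URLs from relative hrefs, keeping at most `limit`."""
    return [urljoin(BASE_URL, path) for path in islice(paths, limit)]


def first(nodes, default=""):
    """Return the first XPath result, or `default` when nothing matched."""
    return nodes[0] if nodes else default


# Hypothetical hrefs, standing in for the result of the XPath query above.
sample_paths = [f"/movie/{n}" for n in range(10)]
urls = movie_urls(sample_paths)
```

Each entry of `urls` can then be passed to `requests.get`, and each field wrapped as `first(tree.xpath(...))`.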
That completes the scraping part
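Assembling the HTML
A complete HTML document needs both a head and a body. The head has to carry the CSS, and there are two stylesheets: at first I only noticed the one the desktop page needs, and later found that the second one adapts the layout for phones.
Locate the corresponding CSS files in the page source
Copy them out. I trimmed them down and deleted the rules that are not needed; the code is at the end
The complete HTML page contains the CSS in the head and the movie section in the body. {} placeholders are used so the values can be filled in later with str.format:
The HTML for a single movie card also uses {} placeholders, so the scraped fields can be filled in later: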
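How the {} placeholders behave can be illustrated with a small sketch. The templates below are shortened, hypothetical stand-ins for page.html and movie.html. Braces inside a value passed to `format` (such as the CSS text) are left alone, while a literal brace written in the template itself has to be doubled.

```python
# Hypothetical, shortened templates; the real ones are page.html and movie.html.
PAGE = "<html><head><style>{}</style></head><body>{}</body></html>"
MOVIE = '<div class="movie"><b>{}</b> <span>{}</span></div>'

css = "body{margin:0;}"  # braces in an argument are plain data, not placeholders
card = MOVIE.format("Hachi", "9.4")
page = PAGE.format(css, card)

# A literal brace inside the template itself must be written as {{ or }}.
INLINE = "<style>h1{{color:red;}}</style><h1>{}</h1>"
inline = INLINE.format("Movie Calendar")
```

This is why keeping the CSS in separate files and passing it in as an argument works without any escaping.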
Reading a file into a string in Python:
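One way to do this is sketched below with `pathlib`, assuming the templates are UTF-8 text. The helper name `read_text_file` is hypothetical, and the demo writes a throwaway file so that it runs on its own.

```python
import tempfile
from pathlib import Path


def read_text_file(path):
    """Return the whole file as one str, decoded as UTF-8."""
    return Path(path).read_text(encoding="utf-8")


# Usage demo with a temporary file standing in for movie.html.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "movie.html"
    sample.write_text('<div id="movie-title">{}</div>', encoding="utf-8")
    template = read_text_file(sample)
```

Unlike a bare `open(...).read()`, `read_text` closes the file as soon as the content has been read.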
movie_list collects the HTML of each movie and concatenates it
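As an alternative to growing a string with `+=`, the fragments can be collected in a list and joined once. The sketch below also escapes the scraped text with `html.escape`, which the original code does not do. The template `CARD`, the function `render_cards`, and the sample data are hypothetical.

```python
from html import escape

# Hypothetical card template with two placeholders.
CARD = '<div class="movie"><b>{}</b><p>{}</p></div>'


def render_cards(movies):
    """Render (title, intro) pairs into one HTML string."""
    parts = []
    for title, intro in movies:
        # Escape scraped text so characters like & or < cannot break the markup.
        parts.append(CARD.format(escape(title), escape(intro)))
    return "\n".join(parts)


movie_list = render_cards([("Hachi", "A dog & his owner"), ("Up", "Balloons")])
```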
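Two problems I ran into along the way
- The images fail with a cross-origin error, so the concatenated image URL cannot be used directly in my own page
I did not want to save the images locally, so I fetch the image bytes with requests and convert them to a Base64 string. Note that base64.b64encode returns bytes, which must be converted to str first; otherwise they cannot be concatenated with the string prefix
- After sending the email, the layout on the phone turned out to be a mess
So the second, adaptive stylesheet was added to the HTML as well, which gives one stylesheet for each kind of device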
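For the first problem, the conversion to an inline image can be sketched as below. The function name `to_data_uri` is hypothetical, and the three sample bytes are only the start of a JPEG header, not a real image. In the real script the bytes come from `requests.get(img_url).content`.

```python
import base64


def to_data_uri(image_bytes, mime="image/jpeg"):
    """Encode raw image bytes as a data: URI usable in an <img src>."""
    # b64encode returns bytes, so decode to str before concatenating.
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime};base64,{encoded}"


uri = to_data_uri(b"\xff\xd8\xff")
```

Base64 is an encoding rather than encryption; it makes the payload about a third larger, so an email with seven posters grows accordingly.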
Sending the email
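Building the HTML message can be sketched without touching the network. The function name `build_message` and the example addresses are hypothetical; the actual sending over `smtplib.SMTP_SSL` is in the `mail` function of the full code.

```python
from email.mime.text import MIMEText


def build_message(html_body, sender, recipient, subject):
    """Wrap an HTML string in a MIME message with the usual headers."""
    msg = MIMEText(html_body, _subtype="html", _charset="utf-8")
    msg["Subject"] = subject
    msg["From"] = sender
    msg["To"] = recipient
    return msg


msg = build_message(
    "<h1>Hi</h1>", "from@example.com", "to@example.com", "Movie Calendar"
)
```

Setting `_subtype="html"` is what makes the mail client render the body as HTML instead of showing the markup as plain text.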
Full code
good_movies.py
import requests
import base64
import smtplib
from lxml import etree
from email.mime.text import MIMEText
MAIL_USER = 'xxx@xx.com'
MAIL_PWD = 'authorization code'
MAIL_TO = 'xxx@xx.com'


def mail(mail_text):
    # Build the message body as HTML
    msg = MIMEText(mail_text, _subtype='html', _charset='utf-8')
    # Set the subject, sender and recipient
    msg['Subject'] = "Movie Calendar Recommendations"
    msg['From'] = MAIL_USER
    msg['To'] = MAIL_TO
    # Send the email
    send = smtplib.SMTP_SSL("smtp.qq.com", 465)
    send.login(MAIL_USER, MAIL_PWD)
    send.send_message(msg)
    # Close the connection
    send.quit()


def get_movies():
    url = 'https://www.cikeee.com/wangri'
    resp = requests.get(url).text
    html = etree.HTML(resp)
    paths = html.xpath('//a[@class="mov-img-a"]/@href')
    # Load the page template, both stylesheets and the movie card template
    page = open('page.html', encoding='utf-8').read()
    css = open('movie.css', encoding='utf-8').read()
    adaptive_css = open('movie-adaptive.css', encoding='utf-8').read()
    movie = open('movie.html', encoding='utf-8').read()
    movie_list = ''
    # Only the recommendations of the past seven consecutive days
    for path in paths[:7]:
        # Extract the field values
        movie_html = etree.HTML(requests.get('https://www.cikeee.com'+path).text)
        movie_title = movie_html.xpath('//*[@id="movie-title"]/text()')[0]
        rate_box = movie_html.xpath('//*[@id="rate-box"]/@href')[0]
        score = movie_html.xpath('//*[@id="rate-box"]/span/text()')[0]
        movie_info = movie_html.xpath('//*[@id="movie-information"]')[0].xpath('string(.)').split()
        category = movie_info[0]
        year = movie_info[1]
        country = movie_info[2]
        director = movie_info[3]
        stars = movie_info[4]
        movie_text = movie_html.xpath('//*[@id="movie-text"]/text()')[0]
        movie_intro = movie_html.xpath('//*[@id="movie-intro"]/text()')[0]
        img_url = 'https://www.cikeee.com'+movie_html.xpath('//*[@class="movie-img"]/@src')[0]
        # Inline the poster as a Base64 data URI to avoid the cross-origin problem
        img = 'data:image/jpeg;base64,'+str(base64.b64encode(requests.get(img_url).content), encoding="utf-8")
        movie_list += movie.format(movie_title, rate_box, score, category, year, country, director, stars, movie_text, movie_intro, img)
    return page.format(css, adaptive_css, movie_list)


if __name__ == '__main__':
    mail_text = get_movies()
    mail(mail_text)
page.html
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport"
content="width=device-width, user-scalable=yes, initial-scale=1.0, maximum-scale=1.0, minimum-scale=0.5">
<meta http-equiv="X-UA-Compatible" content="ie=edge">
<style>
{}
</style>
</head>
<body>
<div id="bg"></div>
<div id="banner-wrap">
<h1 id="title">Movie Calendar Recommendations</h1>
<div id="banner-width">
{}
</div>
</div>
</body>
</html>
movie.html
<div id="movie-banner">
<div id="movie-info-wrap">
<div id="movie-title-box">
<div id="movie-title">{}</div>
<a id="rate-box" href="{}" target="_blank">
<img class="db-icon" src="https://www.cikeee.com/static/images/link/doubangolden.svg">
<span>{}</span>
</a>
</div>
<div id="movie-information" class="">{} {} {} <br>{} {}</div>
<div id="movie-text">{}</div>
<div id="movie-intro">{}</div>
</div>
<div id="movie-img-wrap">
<img crossOrigin="anonymous" class="movie-img" src="{}" alt="Movie poster">
</div>
</div>
movie.css
html,body{
width:100%;
height:100%;
margin:0;
padding: 0;
font-family: -apple-system, 'Helvetica Neue', sans-serif;
font-size: 16px;
}
body{
position: relative;
background-color:#1e1e1e; /* fallback shown when the browser lacks support */
}
#bg{
position: fixed;
left:0;
top:0;
width:100%;
height:100%;
background-color:#1e1e1e; /* fallback shown when the browser lacks support */
background-image:linear-gradient(45deg,#1a161a 70%,#271f28);
z-index:-3;
}
#title{
color:#f4e5b3;
text-align: center;
margin-top: 10px;
}
/* Scrollbar styles */
::-webkit-scrollbar {
width: 10px;
}
/* Scrollbar track */
::-webkit-scrollbar-track {
-webkit-box-shadow: inset 0 0 6px rgba(0, 0, 0, 0.2);
border-radius: 5px;
}
/* Scrollbar thumb */
::-webkit-scrollbar-thumb {
border-radius: 15px;
background: rgb(51, 50, 51);
-webkit-box-shadow: inset 0 0 6px rgba(0, 0, 0, 0.1);
}
::-webkit-scrollbar-thumb:window-inactive {
background: rgba(0, 0, 0, 0.2);
}
#banner-wrap{
background-color: rgb(10, 8, 10);
background:rgba(0, 0, 0, 0.4);
width:100%;
}
#banner-width{
color:white;
margin:0 auto;
width:90%;
max-width:1600px;
padding-bottom:80px;
}
/* movie-banner________________________________________________ */
#movie-banner{
width:100%;
border:1px solid #f4e5b3;
padding:50px 50px;
box-sizing: border-box;
display: flex;
justify-content: space-between;
min-height:260px;
}
#movie-info-wrap{
margin-right:20px;
}
.movie-img{
width:220px;
border-radius:10px;
}
#movie-title-box{
color:#f4e5b3;
display: flex;
align-items: center;
}
#movie-title{
font-size:36px;
margin-right:15px;
line-height: 0px;
}
#rate-box{
font-size:14px;
color: #f4e5b3;
text-decoration: none;
background-color:rgba(255, 251, 197, 0.1);
border-radius:100px;
display: flex;
align-items: center;
margin:0;
line-height: 0;
padding:4px 5px;
user-select: none;
-ms-user-select: none;
transform: translate(0,2px);
transition: all .5s;
}
#rate-box:hover{
background-color:rgba(255, 251, 197, 0.2);
}
#rate-box span{
font-weight:lighter;
}
.db-icon{
width:16px;
margin-right:5px;
}
#movie-information{
font-size:16px;
color:#f4e5b3;
margin-top:16px;
}
#movie-text{
font-size:22px;
margin-top: 45px;
transform: translate(-12px,0);
}
#movie-intro{
font-size:14px;
margin-top:10px;
max-width:90%;
}
movie-adaptive.css
@media (max-width:640px){
/* movie-banner________________________________________________ */
#banner-width{
width:85%;
padding-bottom:40px;
}
#movie-banner{
padding:20px 20px;
min-height:260px;
flex-flow:column-reverse wrap;
justify-content:center;
align-items: center;
}
#movie-info-wrap{
margin-right:0px;
}
#movie-title-box{
justify-content: center;
align-items: center;
}
#movie-title{
font-size:25px;
margin-right:15px;
}
.movie-img{
width:150px;
border-radius:5px;
margin-bottom:20px;
}
#movie-text{
margin-top: 45px;
transform: translate(0,0);
}
#movie-intro{
margin-top:10px;
max-width:100%;
}
}
Deploying to a server
A scheduled task in the 宝塔 (BT) panel
Final result
Desktop:
Mobile: