tags: python, selenium, phantomjs
Python web scraping

requests (static HTML/JSON)

Parameter settings

cookies

headers

headers = {
    'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/79.0.3945.79 Chrome/79.0.3945.79 Safari/537.36',
    'Host': 'music.163.com',
    'Referer': 'https://music.163.com'
}
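
A minimal sketch of how the headers above, together with cookies, might be attached to a requests session. The cookie name and value are hypothetical placeholders, and `build_session` and `fetch_html` are helper names made up for this example, not part of requests.

```python
import requests

# Header values copied from the notes above.
HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Ubuntu Chromium/79.0.3945.79 "
        "Chrome/79.0.3945.79 Safari/537.36"
    ),
    "Host": "music.163.com",
    "Referer": "https://music.163.com",
}

# Hypothetical cookie name and value, for illustration only.
COOKIES = {"example_cookie": "example-value"}


def build_session():
    """Return a Session that sends HEADERS and COOKIES with every request."""
    session = requests.Session()
    session.headers.update(HEADERS)
    session.cookies.update(COOKIES)
    return session


def fetch_html(session, url="https://music.163.com", timeout=10):
    """Fetch a static page and return its decoded text."""
    response = session.get(url, timeout=timeout)
    response.raise_for_status()  # raise an exception on 4xx/5xx responses
    return response.text
```

Calling `fetch_html(build_session())` would send the request. For a JSON endpoint, `response.json()` could be returned instead of `response.text`.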

selenium + phantomjs (JS rendering)

Installation

sudo apt install phantomjs
python3 -m pip install selenium # do not call pip3 directly
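
Once both are installed, a JS-rendered page could be fetched roughly as follows. This is a sketch that assumes a Selenium 3.x release, because the PhantomJS driver was removed in Selenium 4. The helper names `fetch_rendered_html` and `render_with_phantomjs` are hypothetical.

```python
def fetch_rendered_html(driver, url):
    """Load url in the given WebDriver and return the HTML after JS has run."""
    driver.get(url)
    return driver.page_source


def render_with_phantomjs(url):
    """Render url with PhantomJS (assumes Selenium 3.x and phantomjs on PATH)."""
    # Imported here so fetch_rendered_html can be reused with any driver.
    from selenium import webdriver

    driver = webdriver.PhantomJS()
    try:
        return fetch_rendered_html(driver, url)
    finally:
        driver.quit()  # always shut the browser process down
```

Keeping the driver as a parameter means the same helper should also work with a headless Chrome or Firefox driver on newer Selenium versions.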

BeautifulSoup4 + lxml (parsing HTML text)

python3 -m pip install beautifulSoup4
python3 -m pip install lxml

find function: CSDN
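
A short sketch of `find` and `find_all` with the lxml parser. The HTML snippet, its `id`, and its class names are invented for illustration and stand in for a page fetched earlier.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML snippet, standing in for a downloaded page.
html = """
<html><body>
  <div id="playlist">
    <a class="song" href="/song?id=1">First song</a>
    <a class="song" href="/song?id=2">Second song</a>
  </div>
</body></html>
"""

soup = BeautifulSoup(html, "lxml")  # parse with the lxml backend

# find() returns the first matching tag, or None if nothing matches.
playlist = soup.find("div", id="playlist")
first_song = playlist.find("a", class_="song")

# find_all() returns every matching tag as a list.
all_songs = playlist.find_all("a", class_="song")

first_title = first_song.get_text()
song_links = [a["href"] for a in all_songs]
```

Since `find` returns `None` when nothing matches, it is usually worth checking the result before chaining further calls on it.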