Scrapy notes

Installation

sudo pip3 install scrapy

Error

pyopenssl 19.1.0 has requirement cryptography>=2.8, but you'll have cryptography 2.3 which is incompatible.
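
The message says that pyopenssl 19.1.0 needs cryptography 2.8 or newer, while 2.3 is what is installed. A likely remedy (an assumption, not something these notes verified) is to upgrade that package with sudo pip3 install --upgrade cryptography. The sketch below is one way to check whether the installed version meets the requirement. The helper names version_tuple, meets_minimum and installed_version are made up for illustration, and the version parsing is deliberately naive (it only looks at the numeric parts, so pre-release tags are not handled properly).

```python
import re
from importlib import metadata


def version_tuple(version):
    """Convert a dotted version such as "2.3" into (2, 3) for comparison."""
    return tuple(int(n) for n in re.findall(r"\d+", version))


def meets_minimum(installed, minimum):
    """Return True if the installed version is at least the minimum."""
    return version_tuple(installed) >= version_tuple(minimum)


def installed_version(package):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("cryptography")
    if found is None:
        print("cryptography is not installed")
    elif meets_minimum(found, "2.8"):
        print(f"cryptography {found} satisfies >=2.8")
    else:
        print(f"cryptography {found} is too old; pyopenssl 19.1.0 needs >=2.8")
```

Comparing tuples of integers rather than raw strings matters here: as strings, "2.10" would sort before "2.8".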

Creating a project

scrapy startproject project_name

This creates a project_name folder with the following structure:

project_name/
|-- project_name/
|   |-- spiders/
|   |   `-- __init__.py
|   |-- __init__.py
|   |-- items.py
|   |-- middlewares.py
|   |-- pipelines.py
|   `-- settings.py
`-- scrapy.cfg

Building the item model

project_name/items.py

Creating the spider

scrapy genspider fuck "itcast.cn" # example site

This creates fuck.py under project_name/spiders/ and fills it with a skeleton spider.