How to Parse Dynamic Web Pages with a Python Crawler

Published: 2020-11-30 10:46:50  Views: 181  Author: 小新  Column: Programming Languages

This article explains how to parse dynamic web pages with a Python crawler. The approach is practical, so it is shared here for reference.

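JSON is a data format that many languages can parse, and it is commonly used for data transfer. Here the page content arrives as a JSON string (html_str), which json.loads converts into Python dicts and lists; the fields of each topic are then read from those.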
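As a small illustration of that point, the sketch below uses Python's standard json module to turn a JSON string into Python objects and back again. The sample payload is invented for illustration; only the key names topic_list and topics mirror the ones used in this article.

```python
import json

# Hypothetical payload, shaped like the forum response used in this article.
sample_str = '{"topic_list": {"topics": [{"id": 1, "slug": "hello", "title": "Hello"}]}}'

# JSON text becomes nested Python dicts, lists, strings and numbers.
data = json.loads(sample_str)
topics = data["topic_list"]["topics"]
first_title = topics[0]["title"]

# ensure_ascii=False keeps non-ASCII characters readable when serializing back.
round_trip = json.dumps(data, ensure_ascii=False)
```

Because the parsed result is ordinary dicts and lists, no HTML parser is needed for this kind of response.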

import json

# html_str holds the JSON text returned by the forum for one page of topics.
data = json.loads(html_str)
all_items = data['topic_list']['topics']
write_content = []
for item in all_items:
    slug = item['slug']
    item_id = item['id']
    # A topic URL is the slug and the id joined onto the forum's /t/ path.
    link = f'https://forum.cocos.org/t/{slug}/{item_id}'
    title = item['title']
    like_count = item['like_count']
    posts_count = item['posts_count']
    views = item['views']
    created_at = item['created_at']
    # Keys double as CSV column headers: title, link, likes, replies, views, post time.
    write_content.append({'标题': title, '链接': link, '点赞': like_count, '回复': posts_count, '浏览': views, '发帖时间': created_at})

The link for each topic follows a pattern that becomes clear after opening a few forum posts: it is built by joining the slug and id fields.
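The link pattern can be isolated in a small helper, sketched here under assumptions: build_topic_link and the sample record are hypothetical names and values introduced for illustration, while the base URL and the slug/id fields come from the article's code.

```python
def build_topic_link(item, base="https://forum.cocos.org/t"):
    """Join a topic's slug and id into its URL, following the pattern described above."""
    return f"{base}/{item['slug']}/{item['id']}"


# Hypothetical topic record; real records come from data['topic_list']['topics'].
example_item = {"slug": "some-topic", "id": 12345}
example_link = build_topic_link(example_item)
```

Keeping the pattern in one function means only one place changes if the forum alters its URL scheme.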

Finally, the pages are fetched with multiple threads and the results are saved to a CSV file.

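import csv
# Assumption: the original import is not shown. The text says "multithreading",
# so the thread-backed Pool from multiprocessing.dummy is used here.
from multiprocessing.dummy import Pool

# scrapy(page) is the author's per-page function (not shown in this article);
# judging by the loop below, it returns that page's write_content list.
pool = Pool(3)
orign_num = list(range(10))
result = pool.map(scrapy, orign_num)
pool.close()
pool.join()

with open('ccc_title_link.csv', 'w', newline='') as csvfile:
    fieldnames = ('标题', '链接', '点赞', '回复', '浏览', '发帖时间')
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
    writer.writeheader()
    for write_content in result:
        for _content in write_content:
            writer.writerow(_content)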
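Since the per-page function is not shown, the thread pool and CSV steps can be tried out with a stand-in. The following is a minimal sketch rather than the author's implementation: fake_scrape, collect_rows, rows_to_csv, the English column names and the example.invalid URLs are all hypothetical, and the CSV is written to an in-memory buffer instead of a file so the sketch has no side effects.

```python
import csv
import io
from multiprocessing.dummy import Pool  # thread-backed Pool with the multiprocessing API

FIELDNAMES = ("title", "link")


def fake_scrape(page):
    """Stand-in for a per-page scraper: returns a list of row dicts for one page."""
    return [
        {"title": f"topic {page}-{i}", "link": f"https://example.invalid/{page}/{i}"}
        for i in range(2)
    ]


def collect_rows(pages, workers=3):
    """Run fake_scrape over all pages in a thread pool and flatten the results."""
    # pool.map preserves input order, so rows come back page by page.
    with Pool(workers) as pool:
        per_page = pool.map(fake_scrape, pages)
    return [row for rows in per_page for row in rows]


def rows_to_csv(rows):
    """Serialize row dicts to CSV text with a header line."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = collect_rows(range(3))
csv_text = rows_to_csv(rows)
```

Threads suit this job because the work is dominated by waiting on network responses; swapping io.StringIO for open(..., 'w', newline='') writes the same content to disk.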

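That concludes this walkthrough of parsing dynamic web pages with a Python crawler. Hopefully it proves helpful. If you found the article useful, feel free to share it so more people can see it.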
