How to Scrape the Kugou Music TOP500 with Python

Published: 2021-09-14 16:32:22 Views: 274 Author: chen Category: Big Data

This article explains how to scrape the Kugou Music TOP500 chart with Python. Many people get stuck on this task in practice, so the steps below gather a simple, workable approach drawn from various sources. Let's walk through it.

The ranking page URL takes the form https://www.kugou.com/yy/rank/home/1-8888.html?from=rank.

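Changing the number in the URL moves to another page, which solves the problem of the page offering no pagination. Next comes the usual routine: press F12 and look at the Network tab.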
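
As a minimal sketch of that paging idea, the snippet below builds the URL for each page by substituting the page number into the URL template used in the article's code. The helper name page_urls and the page range passed to it are illustrative assumptions, not part of the original script.

```python
# URL template taken from the article's code; only the page number changes.
RANK_URL = "https://www.kugou.com/yy/rank/home/%s-8888.html?from=rank"


def page_urls(first_page, last_page):
    """Return the ranking-page URLs for first_page..last_page inclusive.

    Hypothetical helper, introduced here only for illustration.
    """
    return [RANK_URL % page for page in range(first_page, last_page + 1)]


urls = page_urls(1, 3)
for url in urls:
    print(url)
```

Each URL can then be passed to requests.get, as the full script does.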

Scroll down and you will find that these are all commented, which makes the job easier.

Parse this data to pull out each song's hash and filename, and then its lyrics.
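
The parsing step can be sketched offline as follows. The script text here is a hypothetical stand-in for the page's inline script, with made-up hashes and file names; only the "global.features =" marker and the Hash and FileName keys come from the article's code. This sketch trims the trailing semicolon with rstrip rather than the slice used in the full script.

```python
import json

# Hypothetical stand-in for the text of the page's inline <script>;
# the real page content may differ.
SAMPLE_SCRIPT = '''
global.features = [{"Hash": "ABC123", "FileName": "Artist A - Song A"},
                   {"Hash": "DEF456", "FileName": "Artist B - Song B"}];
(function() { /* page code */ })();
'''


def extract_features(script_text):
    """Return the list assigned to global.features in script_text."""
    # Keep what sits between the assignment and the next function block.
    raw = script_text.split("global.features =")[1].split("(function()")[0]
    # Drop surrounding whitespace and the trailing semicolon.
    raw = raw.strip().rstrip(";")
    return json.loads(raw)


songs = [(item["Hash"], item["FileName"]) for item in extract_features(SAMPLE_SCRIPT)]
print(songs)
```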

There is not much more to explain, so here is the code:

import requests
from lxml import etree
import json
import re
import os


class kugou():
    def startkugou(self):
        # Create the output folder up front; open() below fails if it is missing.
        os.makedirs('kugou', exist_ok=True)
        # Only page 23 is requested here; widen the range (e.g. range(1, 24)) to fetch more pages.
        for i in range(23, 24):
            print(i)
            res = requests.get('https://www.kugou.com/yy/rank/home/%s-8888.html?from=rank' % str(i))
            self.get_song(res)

    def get_song(self, res):
        html = etree.HTML(res.content.decode('utf8'))
        content = html.xpath('//script[10]')
        content2 = content[0].text
        # Cut out the JSON list; at this point it is still a str
        content1 = content2.split('global.features =')[1].split('(function()')[0].strip()[0:-1]
        try:
            # Parse the JSON string into Python objects
            content = json.loads(content1)
            for i in range(len(content)):
                hash = content[i]["Hash"]
                file_name = content[i]["FileName"]
                hash_url = "http://www.kugou.com/yy/index.php?r=play/getdata&hash=" + hash
                hash_content = requests.get(hash_url)
                play_url = ''.join(re.findall('"play_url":"(.*?)"', hash_content.text))
                lyrics = ''.join(re.findall('"lyrics":"(.*?)"', hash_content.text))
                # Remove the backslashes that escape "/" in the JSON text
                real_download_url = play_url.replace("\\", "")
                try:
                    # if os.path.exists('kugou/' + file_name + '.txt'):
                    #     print(file_name + " lyrics already exist")
                    #     # continue
                    # else:
                    with open('kugou/' + file_name + '.txt', 'w', encoding='utf8') as f:
                        f.write(lyrics.encode('utf8').decode('unicode_escape'))
                    print(file_name + " lyrics downloaded!")
                    # if os.path.exists('kugou/' + file_name + '.mp3'):
                    #     print(file_name + " song already exists")
                    #     # continue
                    # else:
                    with open('kugou/' + file_name + ".mp3", "wb") as fp:
                        fp.write(requests.get(real_download_url).content)
                    print(file_name + " song downloaded!")
                except OSError as e:
                    # Most likely an illegal character in the file name: sanitize it and retry
                    print("Error occurred: " + file_name)
                    file_name = self.validateTitle(file_name)
                    # if os.path.exists('kugou/' + file_name + '.txt'):
                    #     print(file_name + " lyrics already exist")
                    #     # continue
                    # else:
                    with open('kugou/' + file_name + '.txt', 'w', encoding='utf8') as f:
                        f.write(lyrics.encode('utf8').decode('unicode_escape'))
                    print(file_name + " lyrics downloaded!")
                    # if os.path.exists('kugou/' + file_name + '.mp3'):
                    #     print(file_name + " song already exists")
                    #     # continue
                    # else:
                    with open('kugou/' + file_name + ".mp3", "wb") as fp:
                        fp.write(requests.get(real_download_url).content)
                    print(file_name + " song downloaded!")
        except json.decoder.JSONDecodeError as e:
            # Dump the raw script and the extracted slice to help debug the parsing
            print(e)
            print(content2)
            content1 = content2.split('global.features =')[1].strip().split('(function() {')[0].strip()
            content1 = content1[0:-1]
            print(content1)

    def validateTitle(self, file_name):
        """Normalize a title so it can be used as a file name.

        :param file_name: the title string
        :return: a string that is safe to use as a file name
        """
        rstr = r"[\=\(\)\,\/\\\:\*\?\"\<\>\|\' ']"  # = ( ) , / \ : * ? " < > | ' and the space character
        new_title = re.sub(rstr, "_", file_name)  # replace each with an underscore
        return new_title


if __name__ == '__main__':
    kugou().startkugou()
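
The script pulls play_url and lyrics out of the response text with regular expressions, then undoes the JSON escaping by hand. As a hedged alternative sketch, the same two fields can be read by parsing the body with json.loads, which decodes the \/ and \uXXXX escapes in one step. The sample response below is hypothetical and shaped only to contain the two keys the regexes look for; the real payload may be structured differently, which is why the illustrative helper find_key searches for a key at any depth.

```python
import json

# Hypothetical response body; the nesting under "data" is an assumption.
SAMPLE_RESPONSE = (
    r'{"data": {"play_url": "https:\/\/example.com\/a.mp3", '
    r'"lyrics": "[00:01.00]\u4f60\u597d\r\n"}}'
)


def find_key(node, key):
    """Depth-first search for the first non-None value stored under key."""
    if isinstance(node, dict):
        if key in node:
            return node[key]
        children = node.values()
    elif isinstance(node, list):
        children = node
    else:
        return None
    for child in children:
        found = find_key(child, key)
        if found is not None:
            return found
    return None


payload = json.loads(SAMPLE_RESPONSE)
play_url = find_key(payload, "play_url")
lyrics = find_key(payload, "lyrics")
print(play_url)
print(lyrics)
```

In the full script, hash_content.text would take the place of SAMPLE_RESPONSE, assuming the endpoint returns valid JSON.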

That wraps up how to scrape the Kugou Music TOP500 with Python. Theory sticks best when paired with practice, so go and try it yourself!
