Example code: scraping 《純陽劍尊》 from 頂點小說網 with Python

Date: 2022-07-08 09:42:02

Scraping 《純陽劍尊》 from 頂點小說網

Code

import requests
from bs4 import BeautifulSoup

# Anti-bot workaround: send a browser-like User-Agent with every request
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.36'
}


# Fetch the page and return its HTML
def open_url(url):
    response = requests.get(url, headers=headers)
    # Let requests guess the encoding from the page content
    response.encoding = response.apparent_encoding
    html = response.text
    return html


# Extract the chapter title (the <h1> inside the first <dd>)
def get_title(html):
    soup = BeautifulSoup(html, 'lxml')
    title_tag = soup.find('dd')
    title = '\n' + title_tag.h1.get_text() + '\n'
    return title


# Extract the chapter text (every <dd id="contents"> element)
def get_texts(html):
    soup2 = BeautifulSoup(html, 'lxml')
    text_tags = soup2.find_all('dd', id='contents')
    return text_tags


# Append the title to the output file
def save_title(filename, title):
    with open(filename, 'a+', encoding='utf-8') as file:
        file.write(title)


# Append the text to the output file
def save_text(filename, text):
    with open(filename, 'a+', encoding='utf-8') as file:
        file.write(text)


# Main program
def main():
    num = input('Which chapter of 《純陽劍尊》 do you want to download? (1-802) ')
    num = int(num)
    # Chapter pages are numbered consecutively, starting right after 8184027
    number = 8184027 + num
    url = 'https://www.23us.so/files/article/html/15/15905/' + str(number) + '.html'
    filename = '純陽劍尊.txt'
    r = open_url(url)
    title = get_title(r)
    tags = get_texts(r)
    save_title(filename, title)
    for text_tag in tags:
        text = text_tag.get_text() + '\n'
        save_text(filename, text)
    print('Chapter {} has been downloaded!'.format(num))


if __name__ == '__main__':
    main()
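The script above converts the input with int() but never checks that the chapter falls inside the 1-802 range the prompt mentions. A minimal sketch of how the URL construction could be pulled into a validated helper follows. The names BASE_URL, FIRST_PAGE_OFFSET, LAST_CHAPTER and chapter_url are hypothetical, introduced only for illustration. Like the original code, it assumes the site numbers chapter pages consecutively.

```python
# Hypothetical helper names; the base URL, offset and range come from the script above.
BASE_URL = 'https://www.23us.so/files/article/html/15/15905/'
FIRST_PAGE_OFFSET = 8184027  # assumption from the original: page id = offset + chapter number
LAST_CHAPTER = 802           # upper bound taken from the input prompt


def chapter_url(num):
    """Return the page URL for chapter `num`; raise ValueError if it is out of range."""
    if not 1 <= num <= LAST_CHAPTER:
        raise ValueError('chapter must be between 1 and {}'.format(LAST_CHAPTER))
    return BASE_URL + str(FIRST_PAGE_OFFSET + num) + '.html'


if __name__ == '__main__':
    print(chapter_url(1))
```

With a helper like this, main() could call chapter_url(num) and report a ValueError to the user instead of requesting a page that does not belong to the novel.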

Scraping result:

[Image: screenshot of the scraping result]

That concludes the example code for scraping 《純陽劍尊》 from 頂點小說網 with Python. For more material on scraping 頂點小說網 with Python, see the other related articles on 好吧啦網.
