How to Get the Full Source of a Web Page in Python

Date: 2022-07-15 13:20:58

1. Fetching the full page source in Python:

import requests

res = requests.get('https://blog.csdn.net/yirexiao/article/details/79092355')
res.encoding = 'utf-8'  # decode the response body as UTF-8
print(res.text)  # the full HTML source of the page

2. Output: the script prints the page's full HTML source.
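The same fetch can also be sketched with only the standard library. This is a minimal sketch, not the article's method: it swaps requests for urllib.request, and the helper name fetch_source, the 10-second timeout, and the UTF-8 fallback are all assumptions made for illustration. It reads the charset from the Content-Type header instead of hard-coding it.

```python
import urllib.request


def fetch_source(url, timeout=10, fallback_encoding='utf-8'):
    """Return the decoded body of url as a string (hypothetical helper)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        raw = resp.read()
        # Use the charset from the Content-Type header when the server sends one.
        charset = resp.headers.get_content_charset() or fallback_encoding
    # errors='replace' keeps decoding even if a few bytes do not match the charset.
    return raw.decode(charset, errors='replace')
```

Calling fetch_source with the article's URL and printing the result should give the same kind of output as the requests version, provided the server accepts the default urllib request headers.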

Extended example:

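# Ported to Python 3: urllib2 became urllib.request, and print is a function.
import time
import urllib.request

from bs4 import BeautifulSoup

websiteurls = {}  # links already recorded elsewhere; skipped during the scan


def scanpage(url):
    """Collect the links on url that contain url, then report each link's status code and fetch time."""
    start = time.time()
    html = urllib.request.urlopen(url).read()
    soup = BeautifulSoup(html, 'html.parser')

    # Keep each link once, and only if its href contains the start URL.
    upageurls = {}
    for link in soup.find_all('a', href=True):
        href = link.get('href')
        if url in href and href not in upageurls and href not in websiteurls:
            upageurls[href] = 0

    n = 0
    for href in upageurls:
        t2 = time.time()
        try:
            # Request each link once (the original requested it twice) and record the status code.
            upageurls[href] = urllib.request.urlopen(href).getcode()
        except Exception:
            print('connect failed')
        else:
            print(n, href, upageurls[href])
            print(time.time() - t2)  # seconds spent fetching this link
        n += 1

    print('total is ' + repr(n) + ' links')
    print(time.time() - start)  # total seconds for the whole scan


scanpage('http://news.163.com/')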
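The extended example keeps a link only when its href contains the start URL as a substring, so relative links such as /sports/ are skipped. One possible way to handle them is sketched below, assuming the standard library's html.parser stands in for BeautifulSoup; the names LinkCollector and same_site_links are hypothetical and not part of the original code.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == 'a':
            for name, value in attrs:
                if name == 'href' and value:
                    self.hrefs.append(value)


def same_site_links(html, base_url):
    """Return unique absolute links in html that share base_url's host."""
    parser = LinkCollector()
    parser.feed(html)
    host = urlparse(base_url).netloc
    links = []
    for href in parser.hrefs:
        # urljoin turns a relative path into an absolute URL based on base_url.
        absolute = urljoin(base_url, href)
        if urlparse(absolute).netloc == host and absolute not in links:
            links.append(absolute)
    return links
```

Comparing hosts instead of substrings also avoids matching an unrelated URL that merely contains the start URL somewhere in its query string.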
