A Beginner's Guide to Multithreading and Multiprocessing in Python

2020-12-27 15:12:36 Anonymous

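Introduction

When analyzing data with Python, choosing the right data structures and algorithms can sometimes make a program much faster. Another way to gain speed is to use multithreading or multiprocessing.

In this article we will not go into the internals of multithreading or multiprocessing. Instead, we work through an example: a small Python script that downloads images from Unsplash. We start with a version that downloads one image at a time, then use threads to speed it up.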

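Multithreading

Simply put, threads let different parts of a program make progress concurrently. Tasks that spend most of their time waiting on external events are usually good candidates for threading. These are known as I/O-bound tasks: reading from or writing to files, network operations, or downloading through an online API. Let's look at an example that shows the benefit of using threads.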
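Before touching the network, the idea can be sketched with a simulated I/O-bound task. In the sketch below, `fake_io_task`, `run_sequential`, and `run_threaded` are hypothetical names introduced for illustration, and `time.sleep` stands in for the waiting that a real download would do.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_io_task(task_id, delay=0.2):
    """Hypothetical stand-in for an I/O-bound call: it only waits."""
    time.sleep(delay)  # the thread is idle here, so other threads can run
    return task_id


def run_sequential(task_ids):
    """Run the tasks one after another and return (results, elapsed seconds)."""
    start = time.perf_counter()
    results = [fake_io_task(t) for t in task_ids]
    return results, time.perf_counter() - start


def run_threaded(task_ids, max_workers=5):
    """Run the tasks on a thread pool and return (results, elapsed seconds)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # map() yields results in the same order as the inputs
        results = list(executor.map(fake_io_task, task_ids))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    ids = list(range(5))
    _, sequential_time = run_sequential(ids)
    _, threaded_time = run_threaded(ids)
    print(f"sequential: {sequential_time:.2f}s, threaded: {threaded_time:.2f}s")
```

Because each task only waits, the sequential run takes roughly the sum of the delays, while the threaded run takes roughly as long as the slowest single task.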

Without threads

In this example, we run the program sequentially to see how long it takes to download 15 images from the Unsplash API:

<code>import requests
import time
img_urls = [
    'https: 
    'https: 
    'https: 
    'https: 
    'https: 
    'https: 
    'https: 
    'https: 
    'https: 
    'https: 
    'https: 
    'https: 
    'https: 
    'https: 
    'https: 
]

start = time.perf_counter()  # start timer
for img_url in img_urls:
    img_name = img_url.split('/')[3]  # get image name from URL
    img_bytes = requests.get(img_url).content
    with open(img_name, 'wb') as img_file:
        img_file.write(img_bytes)  # save image to disk

finish = time.perf_counter()  # end timer
print(f"Finished in {round(finish-start,2)} seconds")

# Results
# Finished in 23.101926751 seconds</code>

The whole run took about 23 seconds.
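The `img_url.split('/')[3]` step in the listing above may look opaque at first. Splitting a URL on `/` yields the scheme, an empty string, the host, and then the path segments, so index 3 is the first path segment. A minimal sketch, where `image_name_from_url` and the example URL are hypothetical and used only to show the indexing:

```python
def image_name_from_url(img_url):
    """Return the first path segment after the host, as the listing does."""
    # 'https://host/name'.split('/') -> ['https:', '', 'host', 'name']
    parts = img_url.split('/')
    return parts[3]


if __name__ == "__main__":
    # Hypothetical URL, not one of the article's image URLs.
    print(image_name_from_url("https://images.example.com/photo-abc123"))
```

This only works for URLs with at least one path segment; a URL that ends at the host would raise an `IndexError`.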

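With multithreading

Let's see how threads in Python, used here through `ThreadPoolExecutor` from `concurrent.futures`, can significantly speed up our program: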

<code>import time
from concurrent.futures import ThreadPoolExecutor

import requests

# img_urls is the same list of 15 URLs used in the sequential version

def download_images(img_url):
    img_name = img_url.split('/')[3]  # get image name from URL
    img_bytes = requests.get(img_url).content
    with open(img_name, 'wb') as img_file:
        img_file.write(img_bytes)  # save image to disk
        print(f"{img_name} was downloaded")

start = time.perf_counter()  # start timer
with ThreadPoolExecutor() as executor:
    # map() calls download_images once per URL on the pool's worker threads
    results = executor.map(download_images, img_urls)
finish = time.perf_counter()  # end timer
print(f"Finished in {round(finish-start,2)} seconds")

# Results
# Finished in 5.544147536 seconds</code>
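The listing above assigns the return value of `executor.map` to `results` but never reads it. An exception raised inside a worker is only re-raised when its result is retrieved, so a failed download could pass unnoticed. One possible way to handle each failure separately is to use `executor.submit` and check every future. In this sketch, `fake_download` and `download_all` are hypothetical stand-ins, not part of the original script:

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_download(item):
    """Hypothetical stand-in for a download; raises for the item 'bad'."""
    time.sleep(0.05)  # simulate waiting on the network
    if item == "bad":
        raise ValueError(f"could not download {item}")
    return f"{item} was downloaded"


def download_all(items):
    """Run fake_download on every item; collect successes and failures."""
    successes, failures = [], []
    with ThreadPoolExecutor() as executor:
        # submit() returns one Future per item, so errors are handled per item
        futures = {executor.submit(fake_download, item): item for item in items}
        for future, item in futures.items():
            try:
                successes.append(future.result())  # re-raises worker errors
            except ValueError as exc:
                failures.append((item, str(exc)))
    return successes, failures


if __name__ == "__main__":
    ok, failed = download_all(["a", "bad", "b"])
    print(ok)
    print(failed)
```

With real downloads, the `except` clause would more likely catch the HTTP client's own exception types rather than `ValueError`.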