Python3 Multithreading
Multithreading lets a single program run several flows of execution concurrently, much like running several different programs at the same time. Multithreading has the following advantages:
Threads can be used to move tasks that take a long time to execute in a program to the background for processing.
The user interface can be more appealing. For example, when a user clicks a button to trigger some event processing, a progress bar can be displayed to show the progress of the processing.
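The execution speed of the program may be increased, mainly for I/O-bound work. In standard CPython builds the global interpreter lock (GIL) prevents threads from executing Python bytecode in parallel, so CPU-bound code usually gains little.
Threads are useful in tasks that involve waiting, such as user input, file reading and writing, and sending or receiving network data. While one thread waits, other threads can keep running, so the CPU and other resources are not left idle.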
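The point about waiting can be illustrated with a minimal sketch. It assumes that time.sleep() is an acceptable stand-in for a blocking I/O call; the names wait_for_io and run_in_threads are hypothetical helpers invented for this illustration.

```python
import threading
import time

def wait_for_io(delay, results, index):
    # time.sleep() stands in for a blocking call such as a network read.
    time.sleep(delay)
    results[index] = delay

def run_in_threads(delays):
    # One thread per simulated I/O task; each writes to its own result slot.
    results = [None] * len(delays)
    threads = [
        threading.Thread(target=wait_for_io, args=(d, results, i))
        for i, d in enumerate(delays)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, time.perf_counter() - start

if __name__ == "__main__":
    results, elapsed = run_in_threads([0.3, 0.3, 0.3])
    print(results, "finished in about %.1f s" % elapsed)
```

Because the three waits overlap, the total time should be close to the longest single wait rather than the sum of all three.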
Each independent thread has an entry point for program execution, a sequence of execution, and an exit point for the program. However, threads cannot execute independently; they must exist within an application and be controlled by the application to provide multiple threads of execution.
Each thread has its own set of CPU registers, known as the thread's context. This context reflects the state of the CPU registers for the thread the last time it was executed.
The instruction pointer and stack pointer registers are the two most important registers in the thread context. Threads always run within the context of a process, so the addresses held in these registers point into the address space of the process that owns the thread.
Threads can be preempted (interrupted).
Threads can be temporarily suspended (also known as sleeping) while other threads run; this is known as thread yielding.
Threads can be classified into:
Kernel Threads: Created and terminated by the operating system kernel.
User Threads: Implemented in user programs without the need for kernel support.
The two commonly used modules in Python3 for threading are:
_thread
threading (recommended)
The thread module from Python 2 was deprecated in favor of the threading module. Python3 no longer provides a module named "thread"; for compatibility, it was renamed to "_thread".
Getting Started with Python Threading
There are two ways to use threads in Python: by using functions or by wrapping thread objects in a class.
Function-style: Use the start_new_thread() function in the _thread module to generate a new thread. The syntax is as follows:
_thread.start_new_thread(function, args[, kwargs])
Parameter descriptions:
function - The thread function.
args - The arguments to pass to the thread function; it must be a tuple type.
kwargs - Optional; a dictionary of keyword arguments to pass to the thread function.
Example
#!/usr/bin/python3
import _thread
import time

# Define a function for the thread
def print_time(threadName, delay):
    count = 0
    while count < 5:
        time.sleep(delay)
        count += 1
        print("%s: %s" % (threadName, time.ctime(time.time())))

# Create two threads
try:
    _thread.start_new_thread(print_time, ("Thread-1", 2,))
    _thread.start_new_thread(print_time, ("Thread-2", 4,))
except RuntimeError:
    print("Error: Unable to start thread")

# Keep the main thread alive so the worker threads can run;
# sleeping avoids spinning at full CPU
while True:
    time.sleep(1)
Executing the above program will produce output similar to the following:
Thread-1: Wed Jan 5 17:38:08 2022
Thread-2: Wed Jan 5 17:38:10 2022
Thread-1: Wed Jan 5 17:38:10 2022
Thread-1: Wed Jan 5 17:38:12 2022
Thread-2: Wed Jan 5 17:38:14 2022
Thread-1: Wed Jan 5 17:38:14 2022
Thread-1: Wed Jan 5 17:38:16 2022
Thread-2: Wed Jan 5 17:38:18 2022
Thread-2: Wed Jan 5 17:38:22 2022
Thread-2: Wed Jan 5 17:38:26 2022
You can press ctrl-c to exit the program.
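The endless loop is needed because _thread offers no join(). One possible alternative, sketched here under the assumption that each worker gets its own lock from _thread.allocate_lock(), lets the main thread block until the workers finish. The names worker and run_workers and the short delays are hypothetical choices for this illustration.

```python
import _thread
import time

def worker(name, delay, done_lock, log):
    try:
        for _ in range(3):
            time.sleep(delay)
            log.append(name)
    finally:
        # Releasing the lock signals the main thread that this worker is done.
        done_lock.release()

def run_workers():
    log = []
    locks = []
    for name, delay in (("Thread-1", 0.01), ("Thread-2", 0.02)):
        done_lock = _thread.allocate_lock()
        done_lock.acquire()  # held until the worker releases it
        locks.append(done_lock)
        _thread.start_new_thread(worker, (name, delay, done_lock, log))
    for done_lock in locks:
        done_lock.acquire()  # blocks until the matching worker has finished
    return log

if __name__ == "__main__":
    print(run_workers())
```

With this approach the program exits on its own once both threads are done, so ctrl-c is no longer required.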
Threading Module
Python3 provides support for threads through two standard libraries: _thread and threading.
_thread provides low-level, primitive threads and a simple lock, which is relatively limited compared to the functionality of the threading module.
The threading module is built on top of the _thread module and provides additional module-level functions, including:
threading.current_thread(): Returns the Thread object for the current thread. The older spelling currentThread() is deprecated since Python 3.10.
threading.enumerate(): Returns a list containing all threads that are currently running. Running refers to threads that have been started but not yet terminated, excluding those that have not started or have already terminated.
threading.active_count(): Returns the number of running threads, which yields the same result as len(threading.enumerate()). The older spelling activeCount() is deprecated since Python 3.10.
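A short sketch of these module-level functions follows. It uses the snake_case spellings current_thread() and active_count(), and the helper name inspect_threads and the thread name "worker-1" are hypothetical.

```python
import threading

def inspect_threads():
    # The worker blocks on an Event so that it is still running
    # while the thread list is inspected.
    release = threading.Event()
    worker = threading.Thread(target=release.wait, name="worker-1")
    worker.start()

    current = threading.current_thread().name
    names = sorted(t.name for t in threading.enumerate())
    count = threading.active_count()

    release.set()  # let the worker finish
    worker.join()
    return current, names, count

if __name__ == "__main__":
    current, names, count = inspect_threads()
    print("current thread:", current)
    print("running threads:", names)
    print("active count:", count)
```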
In addition to these functions, the threading module also provides the Thread class to handle threads. The Thread class offers the following methods and attributes:
run(): The method representing the thread's activity.
start(): Starts the thread's activity.
join([timeout]): Waits until the thread terminates. This blocks the calling thread until the thread whose join() method is called terminates, either normally or through an unhandled exception, or until the optional timeout (in seconds) expires.
is_alive(): Returns whether the thread is active. The older spelling isAlive() was removed in Python 3.9.
name: The thread's name, which can be read and assigned. The accessor methods getName() and setName() are deprecated since Python 3.10.
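The following sketch exercises these members on a thread created with the target argument instead of a subclass. The function names nap and observe_thread and the durations are hypothetical, chosen only so that the timed join() returns before the thread ends.

```python
import threading
import time

def nap(seconds):
    time.sleep(seconds)

def observe_thread():
    t = threading.Thread(target=nap, args=(0.5,), name="napper")
    before_start = t.is_alive()  # a thread is not alive before start()
    t.start()
    t.join(timeout=0.1)          # gives up waiting after 0.1 s
    after_timed_join = t.is_alive()
    t.name = "renamed-napper"    # the name attribute can be reassigned
    t.join()                     # waits until the thread has terminated
    after_join = t.is_alive()
    return t.name, (before_start, after_timed_join, after_join)

if __name__ == "__main__":
    print(observe_thread())
```

Since join(timeout=0.1) returns while the thread is still sleeping, the middle is_alive() call should normally report True, and the final one reports False.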
Creating Threads Using the threading Module
We can create a new subclass by directly inheriting from threading.Thread and instantiate it, then start the new thread by calling the start() method, which in turn calls the thread's run() method:
Example
#!/usr/bin/python3
import threading
import time

exitFlag = 0

class myThread(threading.Thread):
    def __init__(self, threadID, name, delay):
        threading.Thread.__init__(self)
        self.threadID = threadID
        self.name = name
        self.delay = delay

    def run(self):
        print("Starting thread: " + self.name)
        print_time(self.name, self.delay, 5)
        print("Exiting thread: " + self.name)

def print_time(threadName, delay, counter):
    while counter:
        if exitFlag:
            # threadName is a string and has no exit() method;
            # returning ends the thread's work early
            return
        time.sleep(delay)
        print("%s: %s" % (threadName, time.ctime(time.time())))
        counter -= 1

# Create new threads
thread1 = myThread(1, "Thread-1", 1)
thread2 = myThread(2, "Thread-2", 2)

# Start new threads
thread1.start()
thread2.start()
thread1.join()
thread2.join()
print("Exiting main thread")
The above program produces the following output:
Starting thread: Thread-1
Starting thread: Thread-2
Thread-1: Wed Jan 5 17:34:54 2022
Thread-2: Wed Jan 5 17:34:55 2022
Thread-1: Wed Jan 5 17:34:55 2022
Thread-1: Wed Jan 5 17:34:56 2022
Thread-2: Wed Jan 5 17:34:57 2022
Thread-1: Wed Jan 5 17:34:57 2022
Thread-1: Wed Jan 5 17:34:58 2022
Exiting thread: Thread-1
Thread-2: Wed Jan 5 17:34:59 2022
Thread-2: Wed Jan 5 17:35:01 2022
Thread-2: Wed Jan 5 17:35:03 2022
Exiting thread: Thread-2
Exiting main thread
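The exitFlag variable in the example is never set, so the early-exit path is not exercised. As one possible way to request an early stop, the sketch below swaps the global flag for a threading.Event; this is an alternative technique, not the method used above, and CountingThread and run_until_stopped are hypothetical names.

```python
import threading
import time

class CountingThread(threading.Thread):
    def __init__(self, name, delay, stop_event):
        super().__init__(name=name)
        self.delay = delay
        self.stop_event = stop_event
        self.ticks = 0

    def run(self):
        # Event.wait() returns True once the event is set, or False
        # after the timeout, so it serves as both sleep and exit check.
        while not self.stop_event.wait(self.delay):
            self.ticks += 1

def run_until_stopped(run_for=0.2):
    stop_event = threading.Event()
    worker = CountingThread("Thread-1", 0.01, stop_event)
    worker.start()
    time.sleep(run_for)
    stop_event.set()  # ask the thread to stop
    worker.join()
    return worker.ticks, worker.is_alive()

if __name__ == "__main__":
    print(run_until_stopped())
```

Unlike a plain sleep followed by a flag check, the waiting thread wakes up as soon as the event is set.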
Thread Synchronization
If multiple threads modify shared data, unpredictable results may occur. To ensure data correctness, the threads must be synchronized.
The threading module's Lock and RLock classes provide simple thread synchronization; both have acquire and release methods. Operations that must be performed by only one thread at a time can be placed between the acquire and release calls.
The advantage of multithreading is the ability to run multiple tasks concurrently (at least it appears so). However, when threads share data, issues with data synchronization may arise.
Consider a scenario where all elements of a list are 0, and the "set" thread changes all elements to 1 from the end to the beginning, while the "print" thread reads and prints the list from the beginning.
In such a case, the "set" thread might start modifying the list while the "print" thread begins printing, resulting in output that is half 0 and half 1, which is data inconsistency. To avoid this, the concept of locks is introduced. Locks have two states—locked and unlocked. Whenever a thread, such as "set," wants to access shared data, it must first acquire the lock; if another thread, such as "print," has already acquired the lock, then the "set" thread must pause, which is known as synchronization blocking. Once the "print" thread has finished accessing the data and releases the lock, the "set" thread can then continue.
By handling it this way, when printing the list, it will either output all 0s or all 1s, avoiding the awkward situation of having half 0s and half 1s.
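The list scenario described above can be sketched as follows. The function names set_all, snapshot and run_demo, the list length, and the short sleep that widens the race window are all assumptions made for this illustration.

```python
import threading
import time

def set_all(values, lock):
    # The "set" thread: change every element to 1, from the end to the beginning.
    with lock:
        for i in range(len(values) - 1, -1, -1):
            values[i] = 1
            time.sleep(0.001)  # makes an unprotected read likely to see a mix

def snapshot(values, lock, out):
    # The "print" thread: copy the list while holding the lock.
    with lock:
        out.append(list(values))

def run_demo():
    values = [0] * 8
    lock = threading.Lock()
    snapshots = []
    setter = threading.Thread(target=set_all, args=(values, lock))
    reader = threading.Thread(target=snapshot, args=(values, lock, snapshots))
    setter.start()
    reader.start()
    setter.join()
    reader.join()
    return snapshots[0]

if __name__ == "__main__":
    print(run_demo())
```

Whichever thread acquires the lock first, the copy is taken either before any element changes or after all of them have, so it never mixes 0s and 1s.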
Example
#!/usr/bin/python3
import threading
import time

class myThread(threading.Thread):
    def __init__(self, threadID, name, delay):
        threading.Thread.__init__(self)
        self.threadID = threadID
        self.name = name
        self.delay = delay

    def run(self):
        print("Starting thread: " + self.name)
        # Acquire lock for thread synchronization
        threadLock.acquire()
        try:
            print_time(self.name, self.delay, 3)
        finally:
            # Release lock, enabling the next thread, even if an error occurs
            threadLock.release()

def print_time(threadName, delay, counter):
    while counter:
        time.sleep(delay)
        print("%s: %s" % (threadName, time.ctime(time.time())))
        counter -= 1

threadLock = threading.Lock()
threads = []

# Create new threads
thread1 = myThread(1, "Thread-1", 1)
thread2 = myThread(2, "Thread-2", 2)

# Start new threads
thread1.start()
thread2.start()

# Add threads to the thread list
threads.append(thread1)
threads.append(thread2)

# Wait for all threads to complete
for t in threads:
    t.join()
print("Exiting main thread")
Executing the above program produces the following output:
Starting thread: Thread-1
Starting thread: Thread-2
Thread-1: Wed Jan 5 17:36:50 2022
Thread-1: Wed Jan 5 17:36:51 2022
Thread-1: Wed Jan 5 17:36:52 2022
Thread-2: Wed Jan 5 17:36:54 2022
Thread-2: Wed Jan 5 17:36:56 2022
Thread-2: Wed Jan 5 17:36:58 2022
Exiting main thread
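The example uses Lock; RLock differs in that the thread holding it may acquire it again without blocking. A minimal sketch of that difference, using a hypothetical Counter class and run_counter helper, might look like this:

```python
import threading

class Counter:
    def __init__(self):
        self._lock = threading.RLock()
        self.value = 0

    def increment(self):
        with self._lock:
            self.value += 1

    def increment_twice(self):
        with self._lock:      # outer acquire
            self.increment()  # inner acquire by the same thread succeeds
            self.increment()

def run_counter(num_threads=4, repeats=500):
    counter = Counter()

    def work():
        for _ in range(repeats):
            counter.increment_twice()

    threads = [threading.Thread(target=work) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value

if __name__ == "__main__":
    print(run_counter())
```

With a plain Lock, increment_twice() would block forever on the inner acquire, because a Lock cannot be acquired a second time until it has been released.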
Thread Priority Queue (Queue)
Python3's queue module (named Queue in Python 2) provides synchronized, thread-safe queue classes: the FIFO (First In, First Out) Queue, the LIFO (Last In, First Out) LifoQueue, and the PriorityQueue.
These queues implement lock primitives and can be used directly in multi-threading environments. They can be used to synchronize threads.
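The three orderings can be compared with a small single-threaded sketch; the helper name drain is hypothetical.

```python
import queue

def drain(q, items):
    # Put every item into the queue, then take items out until it is empty.
    for item in items:
        q.put(item)
    out = []
    while not q.empty():
        out.append(q.get())
    return out

items = [3, 1, 2]
print(drain(queue.Queue(), items))          # [3, 1, 2]
print(drain(queue.LifoQueue(), items))      # [2, 1, 3]
print(drain(queue.PriorityQueue(), items))  # [1, 2, 3]
```

PriorityQueue returns the smallest item first, so entries are often stored as (priority, data) tuples.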
Common methods of the queue classes (Queue below stands for a queue object):
- Queue.qsize() returns the approximate size of the queue
- Queue.empty() returns True if the queue is empty, False otherwise
- Queue.full() returns True if the queue is full, that is, if it holds maxsize items; False otherwise
- Queue.get([block[, timeout]]) removes and returns an item from the queue; timeout is the maximum waiting time
- Queue.get_nowait() is equivalent to Queue.get(False)
- Queue.put(item[, block[, timeout]]) writes an item to the queue; timeout is the maximum waiting time
- Queue.put_nowait(item) is equivalent to Queue.put(item, False)
- Queue.task_done() signals to the queue that a previously retrieved item has been fully processed
- Queue.join() blocks until every item put into the queue has been retrieved and marked with task_done()
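The non-blocking calls in the list above raise queue.Full or queue.Empty when they cannot proceed. The sketch below wraps them in two hypothetical helpers, try_put and try_get, to show that behavior on a queue with maxsize 2.

```python
import queue

def try_put(q, item):
    # put_nowait() raises queue.Full instead of waiting for a free slot.
    try:
        q.put_nowait(item)
        return True
    except queue.Full:
        return False

def try_get(q, default=None):
    # get_nowait() raises queue.Empty instead of waiting for an item.
    try:
        return q.get_nowait()
    except queue.Empty:
        return default

q = queue.Queue(maxsize=2)
print(try_put(q, "a"), try_put(q, "b"), try_put(q, "c"))  # True True False
print(q.full(), q.qsize())                                # True 2
print(try_get(q), try_get(q), try_get(q, "nothing"))      # a b nothing
print(q.empty())                                          # True
```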
Example
#!/usr/bin/python3
import queue
import threading
import time

exitFlag = 0

class myThread(threading.Thread):
    def __init__(self, threadID, name, q):
        threading.Thread.__init__(self)
        self.threadID = threadID
        self.name = name
        self.q = q

    def run(self):
        print("Starting thread: " + self.name)
        process_data(self.name, self.q)
        print("Exiting thread: " + self.name)

def process_data(threadName, q):
    # Poll the queue until the main thread sets exitFlag
    while not exitFlag:
        queueLock.acquire()
        if not q.empty():
            data = q.get()
            queueLock.release()
            print("%s processing %s" % (threadName, data))
        else:
            queueLock.release()
        time.sleep(1)

threadList = ["Thread-1", "Thread-2", "Thread-3"]
nameList = ["One", "Two", "Three", "Four", "Five"]
queueLock = threading.Lock()
workQueue = queue.Queue(10)
threads = []
threadID = 1

# Create new threads
for tName in threadList:
    thread = myThread(threadID, tName, workQueue)
    thread.start()
    threads.append(thread)
    threadID += 1

# Fill the queue
queueLock.acquire()
for word in nameList:
    workQueue.put(word)
queueLock.release()

# Wait for the queue to empty; sleep briefly instead of spinning
while not workQueue.empty():
    time.sleep(0.1)

# Notify threads it's time to exit
exitFlag = 1

# Wait for all threads to complete
for t in threads:
    t.join()
print("Exiting main thread")
The above program produces output similar to the following (which thread processes which item may vary between runs):
Starting thread: Thread-1
Starting thread: Thread-2
Starting thread: Thread-3
Thread-3 processing One
Thread-1 processing Two
Thread-2 processing Three
Thread-3 processing Four
Thread-1 processing Five
Exiting thread: Thread-3
Exiting thread: Thread-2
Exiting thread: Thread-1
Exiting main thread
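The example above polls empty() under a separate lock and uses exitFlag to stop the workers. A possible alternative, sketched here as an assumption rather than as the approach used above, relies on the blocking get(), task_done(), and join() methods and stops each worker with a None sentinel. The names worker and process_all are hypothetical.

```python
import queue
import threading

def worker(name, q, processed):
    while True:
        item = q.get()       # blocks until an item is available
        if item is None:     # sentinel: no more work for this thread
            q.task_done()
            break
        processed.append((name, item))
        q.task_done()        # mark this item as fully processed

def process_all(items, num_workers=3):
    q = queue.Queue()
    processed = []
    threads = [
        threading.Thread(target=worker, args=("Thread-%d" % (i + 1), q, processed))
        for i in range(num_workers)
    ]
    for t in threads:
        t.start()
    for item in items:
        q.put(item)
    q.join()                 # returns once every item has been marked done
    for _ in threads:
        q.put(None)          # one sentinel per worker
    for t in threads:
        t.join()
    return processed

if __name__ == "__main__":
    for name, item in process_all(["One", "Two", "Three", "Four", "Five"]):
        print("%s processing %s" % (name, item))
```

Because the queue does its own locking, no extra Lock is needed, and the threads sleep inside get() instead of waking up once per second to check for work.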