NumPy Sorting and Conditional Selection Functions
NumPy provides several sorting functions. They implement different sorting algorithms, which differ in execution speed, worst-case performance, required workspace, and stability. The table below compares three of these algorithms; the Speed column is a relative rank, with 1 being the fastest.
Type | Speed | Worst Case | Workspace | Stability |
---|---|---|---|---|
'quicksort' | 1 | O(n^2) | 0 | No |
'mergesort' | 2 | O(n*log(n)) | ~n/2 | Yes |
'heapsort' | 3 | O(n*log(n)) | 0 | No |
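The Stability column is easiest to see with repeated keys. The following is a minimal sketch, assuming a small made-up key array (`dept` is a hypothetical name): a stable algorithm keeps equal keys in their original relative order, while an unstable one makes no such promise.

```python
import numpy as np

# Hypothetical sort keys containing ties
dept = np.array([2, 1, 2, 1, 2])

# 'mergesort' is stable: among equal keys, the original order is preserved
stable_idx = np.argsort(dept, kind='mergesort')
print(stable_idx)        # [1 3 0 2 4]

# 'quicksort' still sorts the values correctly, but the order of the
# indices among equal keys is not guaranteed
quick_idx = np.argsort(dept, kind='quicksort')
print(dept[quick_idx])   # [1 1 2 2 2]
```

Stability matters mainly when sorting indices or records, where tied elements can be told apart.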
numpy.sort()
The numpy.sort() function returns a sorted copy of the input array. The function format is as follows:
numpy.sort(a, axis, kind, order)
Parameter descriptions:
- a: The array to be sorted
- axis: The axis along which the array is sorted. The default is -1, which sorts along the last axis. If axis=None, the array is flattened before sorting. For a 2-D array, axis=0 sorts each column and axis=1 sorts each row.
- kind: The sorting algorithm to use. Default is 'quicksort'
- order: If the array contains fields, the field to sort by
Example
import numpy as np
a = np.array([[3,7],[9,1]])
print ('Our array is:')
print (a)
print ('\n')
print ('Calling sort() function:')
print (np.sort(a))
print ('\n')
print ('Sorting by column:')
print (np.sort(a, axis = 0))
print ('\n')
# Sorting fields in the sort function
dt = np.dtype([('name', 'S10'),('age', int)])
a = np.array([("raju",21),("anil",25),("ravi", 17), ("amar",27)], dtype = dt)
print ('Our array is:')
print (a)
print ('\n')
print ('Sorting by name:')
print (np.sort(a, order = 'name'))
Output:
Our array is:
[[3 7]
[9 1]]
Calling sort() function:
[[3 7]
[1 9]]
Sorting by column:
[[3 1]
[9 7]]
Our array is:
[(b'raju', 21) (b'anil', 25) (b'ravi', 17) (b'amar', 27)]
Sorting by name:
[(b'amar', 27) (b'anil', 25) (b'raju', 21) (b'ravi', 17)]
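The effect of the axis parameter can be checked directly. This short sketch reuses the 2-D array from the example above and compares axis=None (flatten, then sort) with sorting along the last axis; the variable names are illustrative only.

```python
import numpy as np

a = np.array([[3, 7], [9, 1]])

# axis=None flattens the array before sorting
flat_sorted = np.sort(a, axis=None)
print(flat_sorted)   # [1 3 7 9]

# axis=1 sorts each row; for a 2-D array this matches the default axis=-1
row_sorted = np.sort(a, axis=1)
print(row_sorted)

# np.sort() returns a copy, so the input array is left unchanged
print(a)
```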
numpy.argsort()
The numpy.argsort() function returns the indices that would sort an array.
Example
import numpy as np
x = np.array([3, 1, 2])
print ('Our array is:')
print (x)
print ('\n')
print ('Calling argsort() function on x:')
y = np.argsort(x)
print (y)
print ('\n')
print ('Reconstructing original array in sorted order:')
print (x[y])
print ('\n')
print ('Reconstructing original array using a loop:')
for i in y:
    print (x[i], end=" ")
Output:
Our array is:
[3 1 2]
Calling argsort() function on x:
[1 2 0]
Reconstructing original array in sorted order:
[1 2 3]
Reconstructing original array using a loop:
1 2 3
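A common use of argsort() is reordering one array by the values of another. The sketch below assumes two parallel arrays built from the names and ages used in the numpy.sort() example; reversing the index array gives descending order.

```python
import numpy as np

names = np.array(['raju', 'anil', 'ravi', 'amar'])
ages = np.array([21, 25, 17, 27])

# Indices that sort ages in ascending order
order = np.argsort(ages)
print(names[order])        # ['ravi' 'raju' 'anil' 'amar']

# Reverse the index array for descending order
desc = order[::-1]
print(names[desc])         # ['amar' 'anil' 'raju' 'ravi']
```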
numpy.lexsort()
numpy.lexsort() performs an indirect sort on multiple sequences of keys and returns the sorting indices. Think of it as sorting a spreadsheet in which each column is one sequence: the last column is the primary sort key, the second-to-last column breaks ties, and so on.
A typical scenario is admitting students to a key class based on exam scores. Students are ranked by total score; if total scores are tied, the higher math score wins; if both total and math scores are tied, the English score decides. In this case the total scores go in the last column of the spreadsheet, the math scores in the second-to-last, and the English scores in the third-to-last.
Example
import numpy as np
nm = ('raju','anil','ravi','amar')
dv = ('f.y.', 's.y.', 's.y.', 'f.y.')
ind = np.lexsort((dv,nm))
print ('Calling lexsort() function:')
print (ind)
print ('\n')
print ('Using this index to obtain sorted data:')
print ([nm[i] + ", " + dv[i] for i in ind])
Output:
Calling lexsort() function:
[3 1 0 2]
Using this index to obtain sorted data:
['amar, f.y.', 'anil, s.y.', 'raju, f.y.', 'ravi, s.y.']
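The last key in the tuple passed to np.lexsort is the primary sort key, so the data above is sorted by nm first: amar, anil, raju, ravi. The dv key would only be used to break ties between equal names, and there are none here. The resulting index array is therefore [3 1 0 2].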
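The exam scenario described earlier can be sketched as follows. The student names and scores are made up for illustration. Because lexsort() sorts in ascending order, the scores are negated here so that higher scores rank first; that negation is a choice made for this sketch, not something lexsort() requires.

```python
import numpy as np

# Hypothetical students and scores
students = np.array(['A', 'B', 'C', 'D'])
total = np.array([270, 280, 270, 270])
math = np.array([90, 95, 95, 95])
english = np.array([85, 90, 80, 88])

# The last key is the primary key: total, then math, then English.
# Negating the scores turns the ascending sort into a descending ranking.
rank = np.lexsort((-english, -math, -total))
print(rank)             # [1 3 2 0]
print(students[rank])   # ['B' 'D' 'C' 'A']
```

B has the highest total. A, C, and D tie on total, so math puts C and D ahead of A, and English puts D ahead of C.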
msort, sort_complex, partition, argpartition
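Function | Description |
---|---|
msort(a) | Sorts the array along the first axis and returns a sorted copy. np.msort(a) is equivalent to np.sort(a, axis=0). Note that msort was deprecated in NumPy 1.24 and removed in NumPy 2.0; use np.sort(a, axis=0) instead. |
sort_complex(a) | Sorts complex numbers first by real part and then by imaginary part. |
partition(a, kth[, axis, kind, order]) | Returns a partitioned copy of the array: the element at index kth is in its final sorted position, with smaller elements before it and equal or larger elements after it. |
argpartition(a, kth[, axis, kind, order]) | Returns the indices that would partition the array along the specified axis, using the algorithm specified by the kind keyword. |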
Complex number sorting:
>>> import numpy as np
>>> np.sort_complex([5, 3, 6, 2, 1])
array([ 1.+0.j, 2.+0.j, 3.+0.j, 5.+0.j, 6.+0.j])
>>>
>>> np.sort_complex([1 + 2j, 2 - 1j, 3 - 2j, 3 - 3j, 3 + 5j])
array([ 1.+2.j, 2.-1.j, 3.-3.j, 3.-2.j, 3.+5.j])
Partial sorting with partition():
>>> a = np.array([3, 4, 2, 1])
>>> np.partition(a, 3) # kth=3: the element at index 3 is placed in its final sorted position; smaller elements come before it and equal or larger elements after it, in no guaranteed order
array([2, 1, 3, 4])
>>>
>>> np.partition(a, (1, 3)) # kth=(1, 3): the elements at indices 1 and 3 are both placed in their final sorted positions, with the remaining elements partitioned around them
array([1, 2, 3, 4])
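Since the order of the elements on either side of kth is not guaranteed, it is safer to rely on the partition property than on the exact output. A minimal check, assuming a small illustrative array:

```python
import numpy as np

a = np.array([3, 4, 2, 1, 7, 5])
k = 3
p = np.partition(a, k)

# The element at index k is the one a full sort would put there
print(p[k])                          # 4

# Everything before index k is <= p[k]; everything after is >= p[k]
left_ok = bool(np.all(p[:k] <= p[k]))
right_ok = bool(np.all(p[k + 1:] >= p[k]))
print(left_ok, right_ok)             # True True
```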
Find the 3rd smallest (index=2) and 2nd largest (index=-2) values in the array:
>>> arr = np.array([46, 57, 23, 39, 1, 10, 0, 120])
>>> arr[np.argpartition(arr, 2)[2]]
10
>>> arr[np.argpartition(arr, -2)[-2]]
57
Find the 3rd and 4th smallest values at the same time. Passing [2, 3] as kth places both positions in their sorted order in a single call, so the two values can then be read at indices 2 and 3.
>>> arr[np.argpartition(arr, [2,3])[2]]
10
>>> arr[np.argpartition(arr, [2,3])[3]]
23
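The same idea extends to selecting the k largest values without sorting the whole array. This is one possible sketch using the arr array from above, with k chosen arbitrarily; only the k selected values are sorted at the end.

```python
import numpy as np

arr = np.array([46, 57, 23, 39, 1, 10, 0, 120])
k = 3

# Indices of the k largest values, in no particular order
top_idx = np.argpartition(arr, -k)[-k:]

# Sort just those k values, largest first
top_values = np.sort(arr[top_idx])[::-1]
print(top_values)   # [120  57  46]
```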
numpy.argmax() and numpy.argmin()
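The numpy.argmax() and numpy.argmin() functions return the indices of the maximum and minimum elements, respectively. Without an axis argument, the index refers to the flattened array; with an axis argument, one index is returned for each slice along that axis.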
Example
import numpy as np
a = np.array([[30,40,70],[80,20,10],[50,90,60]])
print ('Our array is:')
print (a)
print ('\n')
print ('Calling argmax() function:')
print (np.argmax(a))
print ('\n')
print ('Flattened array:')
print (a.flatten())
print ('\n')
print ('Indices of maximum value along axis 0:')
maxindex = np.argmax(a, axis = 0)
print (maxindex)
print ('\n')
print ('Indices of maximum value along axis 1:')
maxindex = np.argmax(a, axis = 1)
print (maxindex)
print ('\n')
print ('Calling argmin() function:')
minindex = np.argmin(a)
print (minindex)
print ('\n')
print ('Minimum value in flattened array:')
print (a.flatten()[minindex])
print ('\n')
print ('Index of minimum along axis 0:')
minindex = np.argmin(a, axis=0)
print(minindex)
print('\n')
print('Index of minimum along axis 1:')
minindex = np.argmin(a, axis=1)
print(minindex)
Output result:
Our array is:
[[30 40 70]
[80 20 10]
[50 90 60]]
Calling argmax() function:
7
Flattened array:
[30 40 70 80 20 10 50 90 60]
Indices of maximum value along axis 0:
[1 2 0]
Indices of maximum value along axis 1:
[2 0 1]
Calling argmin() function:
5
Minimum value in flattened array:
10
Index of minimum along axis 0:
[0 1 1]
Index of minimum along axis 1:
[0 2 0]
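When argmax() is called without an axis, the result is an index into the flattened array. One way to turn it back into row and column coordinates is np.unravel_index(), sketched here with the same array:

```python
import numpy as np

a = np.array([[30, 40, 70], [80, 20, 10], [50, 90, 60]])

flat_index = np.argmax(a)                       # 7, an index into a.flatten()
row, col = np.unravel_index(flat_index, a.shape)
print(row, col)                                 # 2 1
print(a[row, col])                              # 90
```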
numpy.nonzero()
The numpy.nonzero() function returns the indices of the non-zero elements in the input array, as a tuple containing one index array per dimension.
Example
import numpy as np
a = np.array([[30,40,0],[0,20,10],[50,0,60]])
print('Our array is:')
print(a)
print('\n')
print('Calling nonzero() function:')
print(np.nonzero(a))
Output result:
Our array is:
[[30 40 0]
[ 0 20 10]
[50 0 60]]
Calling nonzero() function:
(array([0, 0, 1, 1, 2, 2]), array([0, 1, 1, 2, 0, 2]))
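The tuple returned by nonzero() can be used directly as an index, or converted into coordinate pairs. A short sketch with the same array, where np.argwhere() is used as one convenient way to get one (row, column) pair per non-zero element:

```python
import numpy as np

a = np.array([[30, 40, 0], [0, 20, 10], [50, 0, 60]])

rows, cols = np.nonzero(a)

# Index with the tuple to get the non-zero values themselves
values = a[rows, cols]
print(values)        # [30 40 20 10 50 60]

# One (row, column) pair per non-zero element
coords = np.argwhere(a)
print(coords)
```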
numpy.where()
When called with only a condition, the numpy.where() function returns the indices of the elements in the input array that satisfy that condition, in the same tuple form as numpy.nonzero().
Example
import numpy as np
x = np.arange(9.).reshape(3, 3)
print('Our array is:')
print(x)
print('Indices of elements greater than 3:')
y = np.where(x > 3)
print(y)
print('Using these indices to get elements that satisfy the condition:')
print(x[y])
Output result:
Our array is:
[[0. 1. 2.]
[3. 4. 5.]
[6. 7. 8.]]
Indices of elements greater than 3:
(array([1, 1, 2, 2, 2]), array([1, 2, 0, 1, 2]))
Using these indices to get elements that satisfy the condition:
[4. 5. 6. 7. 8.]
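numpy.where() also accepts a three-argument form, np.where(condition, x, y), which is not covered in the example above. It selects from x where the condition is true and from y elsewhere. A minimal sketch, with 0.0 chosen arbitrarily as the replacement value:

```python
import numpy as np

x = np.arange(9.).reshape(3, 3)

# Keep values greater than 3, replace everything else with 0.0
masked = np.where(x > 3, x, 0.0)
print(masked)
# [[0. 0. 0.]
#  [0. 4. 5.]
#  [6. 7. 8.]]
```

Unlike the one-argument form, the result has the same shape as the input rather than being a tuple of index arrays.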
numpy.extract()
The numpy.extract() function returns the elements of an array that satisfy a given condition, as a 1-D array.
Example
import numpy as np
x = np.arange(9.).reshape(3, 3)
print('Our array is:')
print(x)
# Define condition, select even elements
condition = np.mod(x, 2) == 0
print('Element-wise condition values:')
print(condition)
print('Extracting elements using the condition:')
print(np.extract(condition, x))
Output result:
Our array is:
[[0. 1. 2.]
[3. 4. 5.]
[6. 7. 8.]]
Element-wise condition values:
[[ True False True]
[False True False]
[ True False True]]
Extracting elements using the condition:
[0. 2. 4. 6. 8.]
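For a boolean condition with the same shape as the array, boolean indexing gives the same result as extract(). The sketch below compares the two on the example data:

```python
import numpy as np

x = np.arange(9.).reshape(3, 3)
condition = np.mod(x, 2) == 0

by_extract = np.extract(condition, x)
by_mask = x[condition]        # boolean indexing

print(by_mask)                               # [0. 2. 4. 6. 8.]
print(np.array_equal(by_extract, by_mask))   # True
```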