NumPy Data Types
NumPy supports a much greater variety of numerical types than Python does. Most of them correspond to C data types, and some also correspond to Python's built-in types. The table below lists the common basic NumPy types.
Name | Description |
---|---|
bool_ | Boolean data type (True or False) |
int_ | Default integer type (similar to C's long, int32 or int64) |
intc | Identical to C's int, typically int32 or int64 |
intp | Integer type used for indexing (similar to C's ssize_t, usually int32 or int64) |
int8 | Byte (-128 to 127) |
int16 | Integer (-32768 to 32767) |
int32 | Integer (-2147483648 to 2147483647) |
int64 | Integer (-9223372036854775808 to 9223372036854775807) |
uint8 | Unsigned integer (0 to 255) |
uint16 | Unsigned integer (0 to 65535) |
uint32 | Unsigned integer (0 to 4294967295) |
uint64 | Unsigned integer (0 to 18446744073709551615) |
float_ | Shorthand for float64 (this alias was removed in NumPy 2.0; use float64 instead) |
float16 | Half precision float: sign bit, 5 bits exponent, 10 bits mantissa |
float32 | Single precision float: sign bit, 8 bits exponent, 23 bits mantissa |
float64 | Double precision float: sign bit, 11 bits exponent, 52 bits mantissa |
complex_ | Shorthand for complex128, i.e., a 128-bit complex number (this alias was removed in NumPy 2.0; use complex128 instead) |
complex64 | Complex number, represented by two 32-bit floats (real and imaginary parts) |
complex128 | Complex number, represented by two 64-bit floats (real and imaginary parts) |
Each NumPy numeric type, such as np.bool_, np.int32, or np.float32, is a scalar type that maps to a dtype object, and each can also be identified by a character code.
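The ranges and bit layouts listed in the table can be checked from Python. The following is a minimal sketch using np.iinfo and np.finfo; the dictionary names int_ranges and float_bits are illustrative only.

```python
import numpy as np

# Integer types: np.iinfo reports the smallest and largest representable values.
int_ranges = {}
for t in (np.int8, np.int16, np.uint8, np.uint16):
    info = np.iinfo(t)
    int_ranges[t.__name__] = (int(info.min), int(info.max))
    print(t.__name__, info.min, info.max)

# Floating-point types: np.finfo reports exponent and mantissa bit counts.
float_bits = {}
for t in (np.float16, np.float32, np.float64):
    info = np.finfo(t)
    float_bits[t.__name__] = (info.nexp, info.nmant)
    print(t.__name__, "exponent bits:", info.nexp, "mantissa bits:", info.nmant)

# Each scalar type maps to a dtype object with a fixed item size in bytes.
complex_size = np.dtype(np.complex128).itemsize
print(complex_size)
```

The printed values should match the table rows for the corresponding types, for example 5 exponent bits and 10 mantissa bits for float16.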
Data Type Object (dtype)
A data type object (an instance of numpy.dtype class) describes how the bytes in the fixed-size block of memory corresponding to an array item should be interpreted. It describes the following aspects of the data:
- Type of the data (integer, float, or Python object)
- Size of the data (e.g., how many bytes are used to store an integer)
- Byte order of the data (little-endian or big-endian)
- In the case of structured types, the names of the fields, the data type of each field, and the part of the memory block that each field occupies
- If the data type is a sub-array, its shape and data type
Byte order is determined by prefixing the data type with '<' or '>'. '<' means little-endian (least significant byte stored first). '>' means big-endian (most significant byte stored first).
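To see what the byte-order prefix changes, the sketch below stores the same integer with both prefixes and inspects the raw bytes. The variable names are illustrative only.

```python
import numpy as np

# Store the value 1 as a 4-byte integer in both byte orders.
little = np.array([1], dtype='<i4')
big = np.array([1], dtype='>i4')

# tobytes() returns the raw memory contents of the array.
little_bytes = little.tobytes()
big_bytes = big.tobytes()
print(little_bytes)  # least significant byte first
print(big_bytes)     # most significant byte first

# The logical value is the same; only the memory layout differs.
same_value = bool(little[0] == big[0])
print(same_value)
```

The two arrays compare equal element by element, which suggests that byte order matters mainly when data is exchanged with files or other systems.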
A dtype object is constructed using the following syntax:
numpy.dtype(object, align, copy)
- object - The object to convert to a data type object
- align - If True, pads the fields so that the layout is similar to that of a C struct
- copy - If True, creates a new copy of the dtype object. If False, the result may be a reference to a built-in data type object
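The effect of the align parameter is easiest to see on a structured type. The sketch below is an illustration with two hypothetical fields, flag and value; the padded size is what common platforms produce and may in principle differ elsewhere.

```python
import numpy as np

# Two hypothetical fields: a 1-byte flag followed by a 4-byte integer.
fields = [('flag', np.int8), ('value', np.int32)]

packed = np.dtype(fields)               # default: no padding between fields
aligned = np.dtype(fields, align=True)  # padded like a C struct

print(packed.itemsize)    # 1 + 4 bytes, fields stored back to back
print(aligned.itemsize)   # typically larger because of padding

# fields maps each name to (dtype, byte offset).
packed_offset = packed.fields['value'][1]
aligned_offset = aligned.fields['value'][1]
print(packed_offset, aligned_offset)
```

With align=True, the value field is moved to an offset that suits its alignment, so the record occupies more bytes than the packed version.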
Example
Let's understand through examples.
Example 1
import numpy as np
# Using scalar type
dt = np.dtype(np.int32)
print(dt)
Output:
int32
Example 2
import numpy as np
# int8, int16, int32, int64 can be written as the equivalent strings 'i1', 'i2', 'i4', 'i8'
dt = np.dtype('i4')
print(dt)
Output:
int32
Example 3
import numpy as np
# Byte-order prefix: '<' requests little-endian storage
dt = np.dtype('<i4')
print(dt)
Output:
int32
The following examples demonstrate a structured data type, in which each field is given a name and a data type.
Example 4
# First create a structured data type
import numpy as np
dt = np.dtype([('age', np.int8)])
print(dt)
Output:
[('age', 'i1')]
Example 5
# Apply the data type to an ndarray object
import numpy as np
dt = np.dtype([('age', np.int8)])
a = np.array([(10,), (20,), (30,)], dtype=dt)
print(a)
Output:
[(10,) (20,) (30,)]
Example 6
# Field names can be used to access the actual age column
import numpy as np
dt = np.dtype([('age', np.int8)])
a = np.array([(10,), (20,), (30,)], dtype=dt)
print(a['age'])
Output:
[10 20 30]
The following examples define a structured data type called student, which includes a string field 'name', an integer field 'age', and a float field 'marks', and then apply this dtype to an ndarray object.
Example 7
import numpy as np
student = np.dtype([('name', 'S20'), ('age', 'i1'), ('marks', 'f4')])
print(student)
Output:
[('name', 'S20'), ('age', 'i1'), ('marks', '<f4')]
Example 8
import numpy as np
student = np.dtype([('name', 'S20'), ('age', 'i1'), ('marks', 'f4')])
a = np.array([('abc', 21, 50), ('xyz', 18, 75)], dtype=student)
print(a)
Output:
[(b'abc', 21, 50.) (b'xyz', 18, 75.)]
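Once the student dtype is applied, each field can be read as its own column, in the same way as the 'age' field earlier. The sketch below is one possible way to work with the records; the variable names are illustrative.

```python
import numpy as np

student = np.dtype([('name', 'S20'), ('age', 'i1'), ('marks', 'f4')])
a = np.array([('abc', 21, 50), ('xyz', 18, 75)], dtype=student)

# Indexing by field name returns one column across all records.
names = a['name']    # 'S20' fields hold byte strings
ages = a['age']
mean_marks = float(a['marks'].mean())
print(names)
print(ages)
print(mean_marks)

# Indexing by position returns one record; its fields are read by name.
first = a[0]
first_name = first['name'].decode()  # convert bytes to str
print(first_name, first['age'])
```

Because the 'name' field uses the 'S' type, its values are bytes and need decode() to become ordinary Python strings.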
Each built-in type has a unique character code that defines it, as follows:
Character | Corresponding Type |
---|---|
b | Boolean (as reported by dtype.kind; in a type string, '?' denotes Boolean and 'b' denotes a signed byte) |
i | (Signed) Integer |
u | Unsigned integer |
f | Floating point |
c | Complex floating point |
m | Timedelta |
M | Datetime |
O | (Python) Object |
S, a | (Byte-)String |
U | Unicode |
V | Raw data (void) |
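The character codes in the table can be read back from any dtype through its kind attribute. The following is a minimal sketch; the list of type strings is chosen for illustration only.

```python
import numpy as np

# Build dtypes from type strings and read back their kind codes.
type_strings = ['i4', 'u2', 'f8', 'c16', 'm8[s]', 'M8[s]', 'O', 'S5', 'U5', 'V4']
kinds = []
for code in type_strings:
    dt = np.dtype(code)
    kinds.append(dt.kind)
    print(code, dt.name, dt.kind)

# The Boolean dtype reports kind 'b'.
bool_kind = np.dtype(bool).kind
print(bool_kind)
```

The digits after the letter give the size in bytes for numeric types, and the number of characters for the string types 'S' and 'U'.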