Python3 Modules
In the previous chapters, we mostly worked in the interactive Python interpreter. If you exit and re-enter the interpreter, all the functions and variables you defined are lost.
To address this, Python lets you store these definitions in a file that scripts or interactive interpreter sessions can use. Such a file is called a module.
A module is a file with the suffix .py that contains the functions and variables you have defined. Modules can be imported by other programs to use the functions and other features defined in the module. This is also how you use the Python standard library.
Below is an example of using a module from the Python standard library.
Example (Python 3.0+)
#!/usr/bin/python3
# Filename: using_sys.py
import sys
print('Command line arguments are:')
for i in sys.argv:
    print(i)
print('\n\nPython path is:', sys.path, '\n')
Execution results are as follows:
$ python using_sys.py argument1 argument2
Command line arguments are:
using_sys.py
argument1
argument2
Python path is: ['/root', '/usr/lib/python3.4', '/usr/lib/python3.4/plat-x86_64-linux-gnu', '/usr/lib/python3.4/lib-dynload', '/usr/local/lib/python3.4/dist-packages', '/usr/lib/python3/dist-packages']
import sys imports the sys module from the Python standard library; this is how you import a specific module.
sys.argv is a list containing the command-line arguments.
sys.path is a list of the directories that the Python interpreter automatically searches for required modules.
The import Statement
To use a Python source file as a module, execute an import statement in another source file. The syntax is as follows:
import module1[, module2[,... moduleN]]
When the interpreter encounters an import statement, it imports the module if it can be found on the current search path.
The search path is the list of directories that the interpreter searches, in order, when it looks for a module. To import the support module, put the import statement at the top of the script:
support.py File Code
#!/usr/bin/python3
# Filename: support.py
def print_func(par):
    print("Hello :", par)
    return
Importing the support module in test.py:
test.py File Code
#!/usr/bin/python3
# Filename: test.py
# Import module
import support
# Now you can call functions contained in the module
support.print_func("tutorialpro")
The above example outputs:
$ python3 test.py
Hello : tutorialpro
A module is imported only once per interpreter session, regardless of how many times you execute import. This prevents the imported module from being executed repeatedly.
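The import-once behavior can be observed with a small sketch. The module name once_demo and its load_count variable are hypothetical, created in a temporary directory only for this illustration; importlib.reload is the standard way to force the module body to run again.

```python
import importlib
import os
import sys
import tempfile

# Write a throwaway module (hypothetical name: once_demo) to a temp directory.
tmp_dir = tempfile.mkdtemp()
with open(os.path.join(tmp_dir, "once_demo.py"), "w") as f:
    # The counter survives a reload because reload reuses the module namespace.
    f.write('load_count = globals().get("load_count", 0) + 1\n')

sys.path.insert(0, tmp_dir)
importlib.invalidate_caches()

import once_demo   # first import: the module body runs
import once_demo   # second import: served from the sys.modules cache
count_after_imports = once_demo.load_count

importlib.reload(once_demo)   # explicitly re-run the module body
count_after_reload = once_demo.load_count

print(count_after_imports, count_after_reload)   # 1 2
```

The second import statement only rebinds the name to the module object already stored in sys.modules, which is why the counter stays at 1 until the explicit reload.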
When we use the import statement, how does the Python interpreter find the corresponding file?
This involves Python's search path, which is a series of directory names. The Python interpreter searches these directories in order to find the module being imported.
This resembles the way environment variables such as PATH work. Indeed, you can also extend the search path by setting the PYTHONPATH environment variable.
The default search path is determined when Python is compiled or installed, and installing new libraries may modify it. The search path is stored in the path variable of the sys module. A simple experiment can be done in the interactive interpreter by entering the following code:
>>> import sys
>>> sys.path
['', '/usr/lib/python3.4', '/usr/lib/python3.4/plat-x86_64-linux-gnu', '/usr/lib/python3.4/lib-dynload', '/usr/local/lib/python3.4/dist-packages', '/usr/lib/python3/dist-packages']
>>>
The output of sys.path is a list whose first item is the empty string '', which represents the current directory, that is, the directory from which the interactive interpreter was started. When you run a script instead, the first item is the directory that contains the script.
Thus, if a file in the current directory has the same name as the module you want to import, it masks the module you intended to import.
Understanding the search path allows you to modify sys.path in your script so that you can import modules that are not on the default search path.
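As a minimal sketch of modifying sys.path at run time, the following creates a module in a directory that is not on the search path and then makes it importable. The module name path_demo and its greet function are hypothetical and exist only for this example.

```python
import importlib
import os
import sys
import tempfile

# Create a module in a directory that is not yet on the search path.
extra_dir = tempfile.mkdtemp()
with open(os.path.join(extra_dir, "path_demo.py"), "w") as f:
    f.write("def greet(name):\n    return 'Hello, ' + name\n")

# Before the directory is added, the import fails.
try:
    import path_demo
    found_before = True
except ImportError:
    found_before = False

# Appending the directory to sys.path makes the module importable.
sys.path.append(extra_dir)
importlib.invalidate_caches()
import path_demo

greeting = path_demo.greet("tutorialpro")
print(found_before, greeting)   # False Hello, tutorialpro
```

Appending places the directory at the end of the search order; inserting at index 0 instead would let it take precedence over modules of the same name found elsewhere.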
Now, create a file named fibo.py in the interpreter's current directory or in one of the directories listed in sys.path, with the following code:
Example
# Fibonacci numbers module
def fib(n):    # write Fibonacci series up to n
    a, b = 0, 1
    while b < n:
        print(b, end=' ')
        a, b = b, a+b
    print()

def fib2(n):   # return Fibonacci series up to n
    result = []
    a, b = 0, 1
    while b < n:
        result.append(b)
        a, b = b, a+b
    return result
Then enter the Python interpreter and import this module using the following command:
>>> import fibo
This does not enter the names of the functions defined in fibo directly into the current symbol table; it only enters the module name fibo there.
You can access the functions using the module name:
Example
>>> fibo.fib(1000)
1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987
>>> fibo.fib2(100)
[1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
>>> fibo.__name__
'fibo'
If you plan to use a function frequently, you can assign it to a local name:
>>> fib = fibo.fib
>>> fib(500)
1 1 2 3 5 8 13 21 34 55 89 144 233 377
from … import Statement
Python's from statement lets you import specific names from a module into the current namespace, with the following syntax:
from modname import name1[, name2[, ... nameN]]
For example, to import the fib and fib2 functions from the fibo module, use the following statement:
>>> from fibo import fib, fib2
>>> fib(500)
1 1 2 3 5 8 13 21 34 55 89 144 233 377
This statement does not bring the name fibo or the rest of the module's contents into the current namespace; it binds only the names fib and fib2.
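The effect of a from-import on the namespace can be checked with a short sketch. The module name from_demo is hypothetical, and the statement is run with exec in a fresh dictionary only so that the bound names are easy to list; the module itself is still loaded and cached in sys.modules.

```python
import importlib
import os
import sys
import tempfile

# Hypothetical module name used only for this sketch.
tmp_dir = tempfile.mkdtemp()
with open(os.path.join(tmp_dir, "from_demo.py"), "w") as f:
    f.write(
        "def fib2(n):\n"
        "    result = []\n"
        "    a, b = 0, 1\n"
        "    while b < n:\n"
        "        result.append(b)\n"
        "        a, b = b, a + b\n"
        "    return result\n"
    )

sys.path.insert(0, tmp_dir)
importlib.invalidate_caches()

# Run the from-import in a fresh namespace so the bound names are easy to see.
namespace = {}
exec("from from_demo import fib2", namespace)

bound_names = sorted(name for name in namespace if not name.startswith("_"))
module_loaded = "from_demo" in sys.modules

print(bound_names)             # ['fib2']
print(module_loaded)           # True
print(namespace["fib2"](20))   # [1, 1, 2, 3, 5, 8, 13]
```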
from … import * Statement
It is also possible to import all contents of a module into the current namespace using the following statement:
from modname import *
This provides a simple way to import all items from a module. However, this statement should not be overused.
Diving into Modules
Modules can contain executable statements in addition to function definitions. These statements are typically used to initialize the module and are executed only the first time the module is imported.
Each module has its own private symbol table, which is used as the global symbol table by all functions within the module.
Thus, the author of a module can safely use global variables within the module without worrying about conflicts with other users' global variables.
On the other hand, if you know what you are doing, you can access a module's global variables with the same notation used to refer to its functions, modname.itemname.
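A small sketch can make the separate symbol tables concrete. The module names table_a and table_b, the counter variable, and the bump function are all hypothetical; both modules are written from the same source so that they use identical global names.

```python
import importlib
import os
import sys
import tempfile

# The same source is written to two hypothetical modules.
source = (
    "counter = 0\n"
    "\n"
    "def bump():\n"
    "    global counter\n"
    "    counter += 1\n"
    "    return counter\n"
)
tmp_dir = tempfile.mkdtemp()
for module_name in ("table_a", "table_b"):
    with open(os.path.join(tmp_dir, module_name + ".py"), "w") as f:
        f.write(source)

sys.path.insert(0, tmp_dir)
importlib.invalidate_caches()
import table_a
import table_b

counter = 100            # a global with the same name in the importing script

table_a.bump()
table_a.bump()           # only table_a.counter changes
table_b.counter = 10     # modname.itemname reaches into table_b's globals
table_b.bump()

print(table_a.counter, table_b.counter, counter)   # 2 11 100
```

Each bump function sees only the counter of the module in which it was defined, so the three variables named counter never interfere with one another.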
Modules can import other modules. It is customary, though not required, to place all import statements at the beginning of a module (or script). The name of the imported module is then placed in the symbol table of the importing module.
There is another way to import, which directly imports names from a module into the current module. For example:
>>> from fibo import fib, fib2
>>> fib(500)
1 1 2 3 5 8 13 21 34 55 89 144 233 377
This method does not place the imported module's name in the current symbol table (so in this example, the name fibo is not defined).
There is also a method to import all names from a module into the current module's symbol table:
>>> from fibo import *
>>> fib(500)
1 1 2 3 5 8 13 21 34 55 89 144 233 377
This imports all names except those beginning with an underscore (_). Most Python programmers do not use this method, because it introduces an unknown set of names that may override existing definitions.
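Both effects described above, skipping underscore-prefixed names and overriding existing definitions, can be seen in a short sketch. The module name star_demo and its contents are hypothetical; exec with an explicit dictionary stands in for the importing module's namespace.

```python
import importlib
import os
import sys
import tempfile

# Hypothetical module with one public and one underscore-prefixed name.
tmp_dir = tempfile.mkdtemp()
with open(os.path.join(tmp_dir, "star_demo.py"), "w") as f:
    f.write(
        "def fib(n):\n"
        "    return 'fib from star_demo'\n"
        "\n"
        "_helper = 'private'\n"
    )

sys.path.insert(0, tmp_dir)
importlib.invalidate_caches()

# The importing namespace already defines its own 'fib'.
namespace = {"fib": "my own fib"}
exec("from star_demo import *", namespace)

overridden = callable(namespace["fib"])   # the string was replaced by the function
has_helper = "_helper" in namespace       # underscore names are skipped

print(overridden, has_helper)   # True False
```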
__name__ Attribute
Every module has a special attribute called __name__. When a module is run directly, __name__ is set to '__main__'; when it is imported, __name__ is set to the module's name. This lets you include statements in a module that run only when the module is executed as a script, not when it is imported.
When a module is imported for the first time by another program, its top-level code runs. If we want a certain block of code within the module not to execute on import, we can test the __name__ attribute so that the block executes only when the module is run itself.
#!/usr/bin/python3
# Filename: using_name.py
if __name__ == '__main__':
    print('The program is running by itself')
else:
    print('I am from another module')
The output is as follows:
$ python using_name.py
The program is running by itself
$ python
>>> import using_name
I am from another module
>>>
Note: Each module has a __name__ attribute. When its value is '__main__', the module is being run by itself; otherwise, it is being imported.
Note: __name__ and __main__ are each written with two underscores before and after the word, with no space between the underscores.
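The two values of __name__ can be compared from a single script. In this sketch the module name name_demo, its main function, and the seen_name and result variables are hypothetical; runpy.run_path from the standard library is used to execute the same file as if it were the main script.

```python
import importlib
import os
import runpy
import sys
import tempfile

# Hypothetical module that records the value of __name__ it was run with.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "name_demo.py")
with open(path, "w") as f:
    f.write(
        "def main():\n"
        "    return 'running as a script'\n"
        "\n"
        "seen_name = __name__\n"
        "result = main() if __name__ == '__main__' else None\n"
    )

sys.path.insert(0, tmp_dir)
importlib.invalidate_caches()

# Case 1: imported, so __name__ is the module's own name.
import name_demo
imported_name = name_demo.seen_name

# Case 2: executed as the main script, so __name__ is '__main__'.
script_globals = runpy.run_path(path, run_name="__main__")
script_name = script_globals["seen_name"]

print(imported_name, name_demo.result)           # name_demo None
print(script_name, script_globals["result"])     # __main__ running as a script
```

Putting the script behavior in a main function and calling it only under the __name__ check is a common convention, since it keeps the module safe to import from tests or other programs.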
dir() Function
The built-in function dir() finds all the names defined within a module and returns them as a sorted list of strings:
>>> import fibo, sys
>>> dir(fibo)
['__name__', 'fib', 'fib2']
>>> dir(sys)
['__displayhook__', '__doc__', '__excepthook__', '__loader__', '__name__',
'__package__', '__stderr__', '__stdin__', '__stdout__',
'_clear_type_cache', '_current_frames', '_debugmallocstats', '_getframe',
'_home', '_mercurial', '_xoptions', 'abiflags', 'api_version', 'argv',
'base_exec_prefix', 'base_prefix', 'builtin_module_names', 'byteorder',
'call_tracing', 'callstats', 'copyright', 'displayhook',
'dont_write_bytecode', 'exc_info', 'excepthook', 'exec_prefix',
'executable', 'exit', 'flags', 'float_info', 'float_repr_style',
'getcheckinterval', 'getdefaultencoding', 'getdlopenflags',
'getfilesystemencoding', 'getobjects', 'getprofile', 'getrecursionlimit',
'getrefcount', 'getsizeof', 'getswitchinterval', 'gettotalrefcount',
'gettrace', 'hash_info', 'hexversion', 'implementation', 'int_info',
'intern', 'maxsize', 'maxunicode', 'meta_path', 'modules', 'path',
'path_hooks', 'path_importer_cache', 'platform', 'prefix', 'ps1',
'setcheckinterval', 'setdlopenflags', 'setprofile', 'setrecursionlimit',
'setswitchinterval', 'settrace', 'stderr', 'stdin', 'stdout',
'thread_info', 'version', 'version_info', 'warnoptions']
If no arguments are given, the dir() function lists all names currently defined:
>>> a = [1, 2, 3, 4, 5]
>>> import fibo
>>> fib = fibo.fib
>>> dir() # Get a list of attributes defined in the current module
['__builtins__', '__name__', 'a', 'fib', 'fibo', 'sys']
>>> a = 5 # Rebind the existing name 'a' to a new value
>>> dir()
['__builtins__', '__name__', 'a', 'fib', 'fibo', 'sys']
>>>
>>> del a # Delete the variable 'a'
>>>
>>> dir()
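A common use of dir() is to list only the public names of a module. The sketch below builds a hypothetical in-memory module called dir_demo with types.ModuleType so that it runs without any files; the names fib, limit, and _cache are invented for the illustration.

```python
import builtins
import types

# Build a small in-memory module (hypothetical name: dir_demo).
demo = types.ModuleType("dir_demo")
exec("def fib(n):\n    return n\n\n_cache = {}\nlimit = 10\n", demo.__dict__)

all_names = dir(demo)
public_names = [name for name in all_names if not name.startswith("_")]

print(public_names)                 # ['fib', 'limit']
print("_cache" in all_names)        # True

# dir() does not list built-in functions; they live in the builtins module.
print("print" in dir(builtins))     # True
```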
Standard Modules
Python comes with a standard library of modules, which is described in the Python Library Reference (referred to as the "Library Reference" below). Some modules are built directly into the interpreter. Although they are not part of the core language, they are built in either for efficiency or to provide access to operating system primitives such as system calls. The set of such modules is a configuration option that also depends on the underlying platform. For example, the `winreg` module is only available on Windows systems. One module deserves special attention: `sys`, which is built into every Python interpreter. The variables `sys.ps1` and `sys.ps2` define the primary and secondary prompts, respectively:
>>> import sys
>>> sys.ps1
'>>> '
>>> sys.ps2
'... '
>>> sys.ps1 = 'C> '
C> print('tutorialpro!')
tutorialpro!
C>
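Whether a module is built in or available on the current platform can be checked from code. This is a minimal sketch using sys.builtin_module_names and importlib.util.find_spec; note that sys.ps1 is only defined when the interpreter runs in interactive mode, so the sketch reads it with a default.

```python
import importlib.util
import sys

# 'sys' is compiled into the interpreter rather than loaded from a file.
sys_is_builtin = "sys" in sys.builtin_module_names

# find_spec returns None when a top-level module cannot be found.
winreg_available = importlib.util.find_spec("winreg") is not None

# sys.ps1 exists only in interactive mode, so fall back to None in scripts.
primary_prompt = getattr(sys, "ps1", None)

print(sys_is_builtin, winreg_available, sys.platform, primary_prompt)
```

Checking with find_spec before importing is one way to write code that degrades gracefully on platforms where an optional module is missing; wrapping the import in try/except ImportError is an equally common alternative.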
Packages
Packages are a way of structuring Python's module namespace by using "dotted module names."
For example, a module named A.B designates a submodule named B in a package named A.
Just as modules save their authors from worrying about each other's global variable names, dotted module names save the authors of multi-module packages from worrying about each other's module names.
This way, different authors can provide packages like NumPy or Python graphics libraries without naming conflicts.
Suppose you want to design a package for the uniform handling of sound files and sound data.
There are many different sound file formats (usually recognized by their file extension, such as .wav, .aiff, .au), so you may need to create and maintain a growing collection of modules for converting between formats.
There are also many different operations you might want to perform on sound data (such as mixing, adding echo, applying an equalizer function, creating an artificial stereo effect), so you will also be writing a never-ending stream of modules to perform these operations.
Here's a possible package structure (expressed as a hierarchical file system):
sound/                          Top-level package
      __init__.py               Initialize the sound package
      formats/                  Subpackage for file format conversion
              __init__.py
              wavread.py
              wavwrite.py
              aiffread.py
              aiffwrite.py
              auread.py
              auwrite.py
              ...
      effects/                  Subpackage for sound effects
              __init__.py
              echo.py
              surround.py
              reverse.py
              ...
      filters/                  Subpackage for filters
              __init__.py
              equalizer.py
              vocoder.py
              karaoke.py
              ...
When importing a package, Python searches the directories listed in sys.path for a subdirectory with the package's name.
A directory must contain a file named __init__.py for Python to treat it as a regular package. This prevents directories with a common name (like string) from unintentionally hiding valid modules that occur later on the module search path.
In the simplest case, __init__.py can be an empty file. It can also execute initialization code for the package or set the __all__ variable (described later).
Users can import specific modules from a package, for example:
import sound.effects.echo
This imports the submodule sound.effects.echo. It must be referenced by its full name:
sound.effects.echo.echofilter(input, output, delay=0.7, atten=4)
Another way to import the submodule is:
from sound.effects import echo
This also imports the submodule echo and makes it available without its package prefix, so it can be used as follows:
echo.echofilter(input, output, delay=0.7, atten=4)
Another variation is to directly import a function or variable:
from sound.effects.echo import echofilter
Again, this loads the submodule echo, but it makes the echofilter function directly available:
echofilter(input, output, delay=0.7, atten=4)
Note that when using the from package import item form, the item can be either a submodule (or subpackage) of the package, or some other name defined in the package, such as a function, class, or variable.
The import statement first tests whether the item is defined in the package; if not, it assumes the item is a module and attempts to load it. If it still cannot be found, an ImportError exception is raised.
Conversely, when using the import item.subitem.subsubitem form, each item except the last must be a package; the last item can be a module or a package, but not a class, function, or variable defined in the previous item.
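The three import forms can be tried against a throwaway copy of part of the sound package. This is only a sketch: the package is created in a temporary directory, and the body of echofilter is a hypothetical stand-in (it simply divides each sample by atten), since the original does not show an implementation.

```python
import importlib
import os
import sys
import tempfile

# Build sound/effects/echo.py in a temporary directory.
root = tempfile.mkdtemp()
effects_dir = os.path.join(root, "sound", "effects")
os.makedirs(effects_dir)
for package_dir in (os.path.join(root, "sound"), effects_dir):
    open(os.path.join(package_dir, "__init__.py"), "w").close()
with open(os.path.join(effects_dir, "echo.py"), "w") as f:
    # Hypothetical body: the real echofilter is not shown in the text.
    f.write(
        "def echofilter(data, delay=0.7, atten=4):\n"
        "    return [x / atten for x in data]\n"
    )

sys.path.insert(0, root)
importlib.invalidate_caches()

import sound.effects.echo                      # full dotted name required
from sound.effects import echo                 # submodule without the prefix
from sound.effects.echo import echofilter      # the function itself

same_function = sound.effects.echo.echofilter is echo.echofilter is echofilter
print(same_function, echofilter([4, 8]))       # True [1.0, 2.0]

# The last item of 'import a.b.c' must be a module or package, not a function.
error = None
try:
    import sound.effects.echo.echofilter
except ImportError as exc:
    error = exc
print(type(error).__name__)
```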
Importing * from a Package
What happens when we use from sound.effects import *?
Ideally, one might expect Python to go to the file system, find every submodule in the package, and import them all. It does not do this: scanning could take a long time, and importing submodules can have unwanted side effects.
In addition, on case-insensitive file systems such as those used on Windows, there is no reliable way to know whether a file named ECHO.py should be imported as the module echo, Echo, or ECHO.
The solution is for the package author to provide an explicit index of the package.
The import statement uses the following convention: if the package's __init__.py file defines a list named __all__, it is taken to be the list of module names that should be imported when from package import * is encountered.
As the author of the package, remember to keep __all__ up to date whenever you update the package.
For example, the file sound/effects/__init__.py could contain the following code:
__all__ = ["echo", "surround", "reverse"]
This means that when you use from sound.effects import *, only these three submodules of the package are imported.
If __all__ is not defined, the statement from sound.effects import * does not import any submodules from the package sound.effects into the current namespace. It only ensures that the package sound.effects has been imported (possibly running initialization code in __init__.py) and then imports whatever names are defined in the package.
This includes any names defined by __init__.py, as well as any submodules of the package that were explicitly loaded by earlier import statements. Consider this code:
import sound.effects.echo
import sound.effects.surround
from sound.effects import *
In this example, the echo and surround modules are imported into the current namespace because they are already defined in the sound.effects package when the from...import statement is executed. (This also works when __all__ is defined.)
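The difference between a package with and without __all__ can be checked with a sketch. The package names fx_indexed and fx_plain are hypothetical; each contains the submodules echo, surround, and reverse, and only fx_indexed defines __all__ in its __init__.py.

```python
import importlib
import os
import sys
import tempfile

root = tempfile.mkdtemp()

def make_package(name, init_source):
    """Create a package directory with three small submodules."""
    package_dir = os.path.join(root, name)
    os.makedirs(package_dir)
    with open(os.path.join(package_dir, "__init__.py"), "w") as f:
        f.write(init_source)
    for submodule in ("echo", "surround", "reverse"):
        with open(os.path.join(package_dir, submodule + ".py"), "w") as f:
            f.write("NAME = %r\n" % submodule)

make_package("fx_indexed", '__all__ = ["echo", "surround"]\n')
make_package("fx_plain", "")   # empty __init__.py, no __all__

sys.path.insert(0, root)
importlib.invalidate_caches()

def star_import(package):
    """Run 'from package import *' in a fresh namespace and list the names."""
    namespace = {}
    exec("from %s import *" % package, namespace)
    return sorted(name for name in namespace if not name.startswith("_"))

indexed = star_import("fx_indexed")      # only the names listed in __all__
plain_before = star_import("fx_plain")   # no submodules are imported

import fx_plain.reverse                  # explicit import loads the submodule
plain_after = star_import("fx_plain")    # now 'reverse' is defined in the package

print(indexed, plain_before, plain_after)
# ['echo', 'surround'] [] ['reverse']
```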
We generally do not advocate using * to import modules, as it often reduces code readability. However, it can save keystrokes, and some modules are designed to export only names that follow certain patterns.
Remember, using from package import specific_submodule is always correct. In fact, it is the recommended notation unless the importing module needs to use submodules with the same name from different packages.
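When packages are structured into subpackages (as with the sound package in this example), you can use absolute imports to refer to submodules of sibling packages. For example, if the module sound.filters.vocoder needs to use the echo module in the sound.effects package, it can use from sound.effects import echo.
You can also write relative imports with the from module import name form of the import statement. These imports use leading dots to indicate the current and parent packages involved. From the surround module, for example, you might write: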
from . import echo
from .. import formats
from ..filters import equalizer
Relative imports are based on the name of the current module. Because the main module's name is always "__main__", modules intended for use as the main module of a Python application must always use absolute imports.
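The following sketch shows relative imports resolving inside a package and failing when the same file is run as the main module. The package name soundpkg is hypothetical and mirrors part of the sound layout; runpy.run_path is used to execute surround.py as if it were a script.

```python
import importlib
import os
import runpy
import sys
import tempfile

# Hypothetical package mirroring part of the sound layout.
root = tempfile.mkdtemp()
files = {
    "soundpkg/__init__.py": "",
    "soundpkg/formats/__init__.py": "",
    "soundpkg/filters/__init__.py": "",
    "soundpkg/filters/equalizer.py": "",
    "soundpkg/effects/__init__.py": "",
    "soundpkg/effects/echo.py": "",
    "soundpkg/effects/surround.py": (
        "from . import echo\n"
        "from .. import formats\n"
        "from ..filters import equalizer\n"
    ),
}
for relative_name, source in files.items():
    path = os.path.join(root, *relative_name.split("/"))
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as f:
        f.write(source)

sys.path.insert(0, root)
importlib.invalidate_caches()

# Imported as part of the package, the leading dots resolve correctly.
from soundpkg.effects import surround
resolved = [
    surround.echo.__name__,
    surround.formats.__name__,
    surround.equalizer.__name__,
]
print(resolved)
# ['soundpkg.effects.echo', 'soundpkg.formats', 'soundpkg.filters.equalizer']

# Run as the main module, the file has no parent package to resolve against.
surround_path = os.path.join(root, "soundpkg", "effects", "surround.py")
script_error = None
try:
    runpy.run_path(surround_path, run_name="__main__")
except ImportError as exc:
    script_error = exc
print(isinstance(script_error, ImportError))   # True
```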
Packages also support one more special attribute, __path__. It is initialized to a list containing the name of the directory that holds the package's __init__.py, before the code in that file is executed. You can modify this variable; doing so affects future searches for modules and subpackages contained in the package.
This feature is not often needed, but it can be used to extend the set of modules found in a package.
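As a minimal sketch of extending a package through __path__, the following adds a second directory to a package's search list so that a module stored outside the package directory can be imported as a submodule. The names path_pkg, plugins, extra, and VALUE are hypothetical.

```python
import importlib
import os
import sys
import tempfile

# A package directory plus a separate directory holding an extra module.
root = tempfile.mkdtemp()
package_dir = os.path.join(root, "path_pkg")
plugin_dir = os.path.join(root, "plugins")
os.makedirs(package_dir)
os.makedirs(plugin_dir)
open(os.path.join(package_dir, "__init__.py"), "w").close()
with open(os.path.join(plugin_dir, "extra.py"), "w") as f:
    f.write("VALUE = 42\n")

sys.path.insert(0, root)
importlib.invalidate_caches()

import path_pkg
original_path = list(path_pkg.__path__)   # one entry: the package directory

# Extending __path__ affects later searches for submodules of the package.
path_pkg.__path__.append(plugin_dir)
import path_pkg.extra

print(len(original_path), path_pkg.extra.VALUE)   # 1 42
```

In practice the same assignment would more likely appear inside the package's own __init__.py; it is done from the outside here only to keep the sketch in one file.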