The 5 Meanings of Underscores in Python

Single Leading Underscore: _var
Single Trailing Underscore: var_
Double Leading Underscore: __var
Double Leading and Trailing Underscore: __var__
Single Underscore: _

At the end of the article, you can find a brief "cheat sheet" summarizing the five different underscore naming conventions and their meanings, along with a short video tutorial that allows you to experience their behavior firsthand.

Let's get started right away!

1. Single Leading Underscore _var

When it comes to variable and method names, a single leading underscore has a meaning by convention only. It is a hint to the programmer: it means what the Python community agrees it should mean, but it does not affect the behavior of your program.

The meaning of the underscore prefix is to inform other programmers that a variable or method starting with a single underscore is intended for internal use. This convention is defined in PEP 8.

This is not enforced by Python. Unlike Java, Python does not have a strong distinction between "private" and "public" variables. It's like someone put up a small underscore warning sign, saying:

"Hey, this isn't really meant to be part of the public interface of the class. Just leave it alone."

Take a look at the following example:

class Test:
    def __init__(self):
        self.foo = 11
        self._bar = 23

What happens if you instantiate this class and try to access the foo and _bar attributes defined in the __init__ constructor? Let's see:

>>> t = Test()
>>> t.foo
11
>>> t._bar
23

You'll see that the single underscore in _bar did not prevent us from "getting into" the class and accessing the value of that variable.

This is because the single underscore prefix in Python is just a convention - at least as far as variable and method names are concerned.

However, leading underscores do affect the way names are imported from modules.

Suppose you have the following code in a module named my_module:

# This is my_module.py:

def external_func():
    return 23

def _internal_func():
    return 42

Now, if you use wildcard imports to import all names from the module, Python will not import names with a leading underscore (unless the module defines an __all__ list that overrides this behavior):

>>> from my_module import *
>>> external_func()
23
>>> _internal_func()
NameError: "name '_internal_func' is not defined"
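The __all__ exception mentioned above can be tried out without creating files on disk. The following is a minimal sketch: the helper wildcard_names and the module names demo_plain and demo_with_all are hypothetical and exist only for this illustration. The modules are built in memory with types.ModuleType and registered in sys.modules so that the import statement can find them.

```python
import sys
import types

def wildcard_names(module_name, source):
    """Build an in-memory module and return the names `import *` pulls in."""
    module = types.ModuleType(module_name)
    exec(source, module.__dict__)          # run the module body
    sys.modules[module_name] = module      # make it importable by name
    namespace = {}
    exec(f"from {module_name} import *", namespace)
    namespace.pop("__builtins__", None)    # added by exec, not by the import
    return sorted(namespace)

functions = '''
def external_func():
    return 23

def _internal_func():
    return 42
'''

# Without __all__, names with a leading underscore are skipped.
plain = wildcard_names("demo_plain", functions)
print(plain)      # ['external_func']

# With __all__, exactly the listed names are imported, underscore or not.
with_all = wildcard_names(
    "demo_with_all",
    '__all__ = ["external_func", "_internal_func"]\n' + functions,
)
print(with_all)   # ['_internal_func', 'external_func']
```

So __all__ is an explicit list of what a wildcard import exports, and it takes precedence over the underscore rule.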

By the way, wildcard imports should be avoided as they make it unclear which names are present in the namespace. For clarity, regular imports are better.

Unlike wildcard imports, regular imports are not affected by the leading single underscore naming convention:

>>> import my_module
>>> my_module.external_func()
23
>>> my_module._internal_func()
42

I know this might be a bit confusing. If you follow PEP 8 recommendations and avoid wildcard imports, then all you really need to remember is this:

A single leading underscore is a Python naming convention indicating that a name is meant for internal use. Apart from wildcard imports, it is not enforced by the Python interpreter and is only a hint to the programmer.

2. Single Trailing Underscore var_

Sometimes, the most appropriate name for a variable is already taken by a keyword. Therefore, names like class or def cannot be used as variable names in Python. In such cases, you can append an underscore to resolve the naming conflict:

>>> def make_object(name, class):
SyntaxError: "invalid syntax"

>>> def make_object(name, class_):
...    pass

In summary, a single trailing underscore (suffix) is a convention to avoid naming conflicts with Python keywords. PEP 8 explains this convention.
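If you are not sure whether a name is reserved, the standard library's keyword module can tell you. Here is a small sketch of the trailing-underscore convention in use; the body of make_object is an assumption made up for this illustration (the example above only uses pass):

```python
import keyword

# "class" is a reserved word, "class_" is an ordinary identifier.
print(keyword.iskeyword("class"))    # True
print(keyword.iskeyword("class_"))   # False

def make_object(name, class_):
    # class_ holds the class to instantiate; the trailing underscore
    # only avoids the clash with the "class" keyword.
    return class_(name)

result = make_object("42", int)
print(result)                        # 42
```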

3. Double Leading Underscore __var

So far, all the naming patterns we've covered have meanings that are based on agreed-upon conventions. For Python class attributes (including variables and methods) that start with a double underscore, things are a bit different.

A double underscore prefix causes the Python interpreter to rewrite the attribute name to avoid naming conflicts in subclasses.

This is also known as name mangling - the interpreter changes the name of the variable to make it harder to create accidental overrides when the class is extended.

I know this sounds abstract. So, I've put together a small code example to illustrate:

class Test:
    def __init__(self):
        self.foo = 11
        self._bar = 23
        self.__baz = 23

Let's use the built-in dir() function to look at the attributes of this object:

>>> t = Test()
>>> dir(t)
['_Test__baz', '__class__', '__delattr__', '__dict__', '__dir__',
'__doc__', '__eq__', '__format__', '__ge__', '__getattribute__',
'__gt__', '__hash__', '__init__', '__le__', '__lt__', '__module__',
'__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__',
'__setattr__', '__sizeof__', '__str__', '__subclasshook__',
'__weakref__', '_bar', 'foo']

This is the list of attributes for this object. Let's look through this list and search for our original variable names foo, _bar, and __baz - I promise you'll notice some interesting changes.

The self.foo variable appears in the attribute list as foo without modification.

The behavior of self._bar is the same - it appears on the object as _bar. As I mentioned earlier, the leading underscore is just a convention. A hint to the programmer.

However, for self.__baz, the situation looks a bit different. When you search for __baz in the list, you won't find a variable with that name.

What happened to __baz?

If you look closely, you'll see that there's an attribute on this object called _Test__baz. This is the name mangling that the Python interpreter performs. It does this to prevent accidental overrides of variables in subclasses.

Let's create another class that extends Test and try to override the existing attributes added in the constructor:

class ExtendedTest(Test):
    def __init__(self):
        super().__init__()
        self.foo = 'overridden'
        self._bar = 'overridden'
        self.__baz = 'overridden'

Now, what do you think the values of foo, _bar, and __baz will be on an instance of the ExtendedTest class? Let's see:

>>> t2 = ExtendedTest()
>>> t2.foo
'overridden'
>>> t2._bar
'overridden'
>>> t2.__baz
AttributeError: "'ExtendedTest' object has no attribute '__baz'"

Wait a minute, why do we get an AttributeError when we try to check the value of t2.__baz? Name mangling is triggered again! It turns out that this object doesn't even have a __baz attribute:

>>> dir(t2)
['_ExtendedTest__baz', '_Test__baz', '__class__', '__delattr__',
'__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__',
'__getattribute__', '__gt__', '__hash__', '__init__', '__le__',
'__lt__', '__module__', '__ne__', '__new__', '__reduce__',
'__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__',
'__subclasshook__', '__weakref__', '_bar', 'foo']

As you can see, __baz has become _ExtendedTest__baz to prevent accidental modification:

>>> t2._ExtendedTest__baz
'overridden'

But the original _Test__baz is still there:

>>> t2._Test__baz
23
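The practical benefit is that a base class and a subclass can each keep their own double-underscore attribute without stepping on each other. A minimal sketch, using the hypothetical classes Base and Child and the attribute name __token, which are not part of the examples above:

```python
class Base:
    def __init__(self):
        self.__token = "base"        # stored as _Base__token

    def base_token(self):
        return self.__token          # reads _Base__token

class Child(Base):
    def __init__(self):
        super().__init__()
        self.__token = "child"       # stored as _Child__token

    def child_token(self):
        return self.__token          # reads _Child__token

c = Child()
print(sorted(vars(c)))    # ['_Base__token', '_Child__token']
print(c.base_token())     # base
print(c.child_token())    # child
```

Each method sees the attribute that belongs to the class it was defined in, so the subclass did not accidentally override the value the base class relies on.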

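Inside the class, name mangling is transparent to the programmer: methods keep using the short name, and the interpreter rewrites it for them. Only code outside the class has to use the mangled name. The following example confirms this: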

class ManglingTest:
    def __init__(self):
        self.__mangled = 'hello'

    def get_mangled(self):
        return self.__mangled

>>> ManglingTest().get_mangled()
'hello'
>>> ManglingTest().__mangled
AttributeError: "'ManglingTest' object has no attribute '__mangled'"

Does name mangling also apply to method names? Yes, it does. Name mangling affects all names that start with two underscore characters and do not end with two underscores, when they are used in the context of a class:

class MangledMethod:
    def __method(self):
        return 42

    def call_it(self):
        return self.__method()

>>> MangledMethod().__method()
AttributeError: "'MangledMethod' object has no attribute '__method'"
>>> MangledMethod().call_it()
42

Here's another example that might surprise you:

_MangledGlobal__mangled = 23

class MangledGlobal:
    def test(self):
        return __mangled

>>> MangledGlobal().test()
23

In this example, I declare a global variable named _MangledGlobal__mangled. Then I access the variable within the context of a class named MangledGlobal. Due to name mangling, I can reference the global variable _MangledGlobal__mangled as __mangled within the test() method of the class.

The Python interpreter automatically expands the name __mangled to _MangledGlobal__mangled because it starts with two underscore characters. This shows that name mangling is not specifically tied to class attributes. It applies to any name that starts with two underscore characters (and does not end with two) when it is used in the context of a class.
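One consequence worth knowing: the rewrite happens to identifiers in the source code when the class body is compiled, so a name passed as a string is left alone. The sketch below uses a hypothetical class Vault to show the difference. Inspecting __code__.co_names is just one way to peek at the names the compiled method refers to.

```python
class Vault:
    def __init__(self):
        self.__secret = 1            # stored as _Vault__secret

    def via_attribute(self):
        return self.__secret         # identifier: mangled at compile time

    def via_getattr(self):
        # String literal: NOT mangled, so this looks for "__secret" verbatim.
        return getattr(self, "__secret", "missing")

v = Vault()
print(v.via_attribute())                      # 1
print(v.via_getattr())                        # missing
print(Vault.via_attribute.__code__.co_names)  # ('_Vault__secret',)
```

If you need dynamic access from inside the class, you have to spell out the mangled name yourself, for example getattr(self, "_Vault__secret").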

That's a lot to take in.

Honestly, these examples and explanations didn't just come to me out of thin air. I did some research and editing to come up with them. I've been using Python for many years, but rules and special cases like these don't always come to mind.

Sometimes the most important skills for a programmer are "pattern recognition" and knowing where to look things up. If you feel a bit overwhelmed at this point, don't worry. Take your time and try out some of the examples in this article.

Let these concepts sink in so you can understand the overall idea of name mangling and some of the other behaviors I've shown you. If you ever encounter them, you'll know what to look up in the documentation.

4. Double Leading and Trailing Underscore __var__

Perhaps surprisingly, if a name starts and ends with double underscores, name mangling is not applied. Variables surrounded by double underscore prefixes and suffixes are not modified by the Python interpreter:

class PrefixPostfixTest:
    def __init__(self):
        self.__bam__ = 42

>>> PrefixPostfixTest().__bam__
42

However, Python reserves names with both leading and trailing double underscores for special purposes. Examples include __init__ for object constructors, or __call__ which allows an object to be called.
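As a quick illustration of these two special methods working together, here is a minimal sketch with a hypothetical Greeter class. Python calls __init__ when the object is created and __call__ when the object itself is used like a function:

```python
class Greeter:
    def __init__(self, greeting):
        # Runs when Greeter(...) is instantiated.
        self.greeting = greeting

    def __call__(self, name):
        # Runs when an instance is called like a function.
        return f"{self.greeting}, {name}!"

greet = Greeter("Hello")
print(callable(greet))    # True
print(greet("World"))     # Hello, World!
```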

These "dunder" methods are often referred to as magic methods, but many in the Python community (including myself) dislike that term.

It's best to avoid using names that start and end with double underscores ("dunders") in your own programs to prevent conflicts with future changes in the Python language.

5. Single Underscore _

By convention, a single standalone underscore is sometimes used as a name to indicate that a variable is temporary or insignificant.

For example, in the following loop, we don't need to access the running index, so we can use "_" to indicate it's just a temporary value:

>>> for _ in range(32):
...    print('Hello, World.')

You can also use a single underscore in unpacking expressions to ignore specific values. Again, this meaning is just "by convention" and does not trigger any special behavior in the Python interpreter. A single underscore is simply a valid variable name with this usage.

In the following code example, I unpack the car tuple into separate variables, but I'm only interested in the color and mileage values. However, to make the unpacking expression work, I need to assign all values in the tuple to variables. In this case, "_" as a placeholder variable can be useful:

>>> car = ('red', 'auto', 12, 3812.4)
>>> color, _, _, mileage = car

>>> color
'red'
>>> mileage
3812.4
>>> _
12

In addition to being used as a temporary variable, "_" is a special variable in most Python REPLs that represents the result of the most recent expression evaluated by the interpreter.

This is handy when you want to reuse the result of a previous calculation in an interpreter session, or when you are building objects on the fly and want to interact with them without assigning them a name first:

>>> 20 + 3
23
>>> _
23
>>> print(_)
23

>>> list()
[]
>>> _.append(1)
>>> _.append(2)
>>> _.append(3)
>>> _
[1, 2, 3]
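If you are curious where this behavior comes from: in CPython, an interactive session passes each expression result to sys.displayhook, and the default hook stores every result that is not None in builtins._. The sketch below calls the original hook (sys.__displayhook__) by hand to imitate what the prompt does. It is meant as an illustration, not as something you would do in real code.

```python
import builtins
import sys

# Imitate the prompt evaluating "20 + 3": prints 23 and remembers it.
sys.__displayhook__(20 + 3)
print(builtins._)             # 23

# A None result (such as the return value of list.append) is ignored,
# so the remembered value stays the same.
sys.__displayhook__(None)
print(builtins._)             # 23
```

This also explains the list example above: _.append(1) returns None, so "_" keeps pointing at the same list instead of being overwritten.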

Python Underscore Naming Patterns - Summary

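Here's a quick summary, or "cheat sheet," of the five Python underscore patterns discussed in this article:

Single Leading Underscore (_var): naming convention indicating that a name is meant for internal use. Not enforced by the interpreter, except in wildcard imports; a hint to the programmer only.
Single Trailing Underscore (var_): used by convention to avoid naming conflicts with Python keywords.
Double Leading Underscore (__var): triggers name mangling when used in the context of a class. Enforced by the interpreter.
Double Leading and Trailing Underscore (__var__): reserved for special methods defined by the Python language. Avoid this naming scheme for your own attributes.
Single Underscore (_): sometimes used as a name for temporary or insignificant variables; also holds the result of the last expression in a Python REPL.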

Original Article: https://zhuanlan.zhihu.com/p/36173202
