Understanding the "self" Argument in Python: A Deep Dive
Chapter 1: The Essence of "self" in Python
Every Python developer is acquainted with the "self" parameter, a staple in the method definitions of all classes. While many know how to utilize it, few grasp its true nature, purpose, and the underlying mechanics.
What We Already Understand
To begin, let's clarify what we know: "self," the first parameter in class methods, signifies the instance of the class. Notably, it's merely a convention to name this parameter "self"—in other programming languages, alternatives like "this" are more common (but it's advisable to stick with "self" in Python).
The code snippet below illustrates this concept:
class MyClass:
    def do_stuff(self, some_arg):
        pass
Here, a call such as instance.do_stuff("whatever") passes only one argument (some_arg), yet the method definition declares two parameters (self and some_arg). This discrepancy raises a question: how does "self" get populated?
Internally, Python transforms the call from instance.do_stuff("whatever") to MyClass.do_stuff(instance, "whatever"). While one might label this as "Python magic," to truly comprehend the mechanics, we must delve deeper into the relationship between methods and functions.
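The equivalence of the two call forms can be checked directly. The following is a minimal sketch that reuses MyClass from above; the return statement and the names instance, via_instance, and via_class are added here purely for illustration:

```python
class MyClass:
    def do_stuff(self, some_arg):
        # Return both values so we can see what each parameter received.
        return self, some_arg


instance = MyClass()

# Usual call: Python supplies the instance as the first argument.
via_instance = instance.do_stuff("whatever")

# Explicit call through the class: we pass the instance ourselves.
via_class = MyClass.do_stuff(instance, "whatever")

print(via_instance == via_class)    # True
print(via_instance[0] is instance)  # True
```

Both calls hand the same object to self; only the spelling of the call differs.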
Class Attributes and Methods
In Python, methods aren't distinct objects; they're essentially regular functions defined within the namespace of a class. The primary distinction lies in their classification as attributes of the class.
These attributes reside in the class dictionary, __dict__, which can be accessed directly or via the built-in vars function. Here's a demonstration:
class MyClass:
    def do_stuff(self):
        return "Doing stuff"

print(MyClass.do_stuff)  # Output: <function MyClass.do_stuff at ...>
Accessed through the class, as shown, the attribute is the plain function. Accessed through an instance, however, it comes back as a "bound method": Python binds the function to the instance, so that the instance is passed as the first argument (self) whenever the method is called.
Thus, methods are plain functions with the instance (self) prepended to their arguments.
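To see the difference between the two kinds of access, one can inspect the objects involved. This is a small sketch rather than anything definitive; the variable names raw and bound are illustrative, while __func__ and __self__ are the standard attributes of bound method objects:

```python
import types


class MyClass:
    def do_stuff(self):
        return "Doing stuff"


instance = MyClass()

# Looked up in the class dictionary, do_stuff is a plain function.
raw = vars(MyClass)["do_stuff"]
print(isinstance(raw, types.FunctionType))  # True

# Looked up on an instance, the same function comes back as a bound method.
bound = instance.do_stuff
print(isinstance(bound, types.MethodType))  # True

# The bound method remembers both the function and the instance.
print(bound.__func__ is raw)       # True
print(bound.__self__ is instance)  # True
print(bound() == raw(instance))    # True
```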
Understanding the Descriptor Protocol
To comprehend how "self" operates, we must examine the descriptor protocol. A descriptor is an object whose class defines at least one of the methods __get__(), __set__(), or __delete__(). For our purpose, we will focus on __get__():
class MyDescriptor:
    def __get__(self, instance, owner):
        ...
The __get__() method customizes how an attribute is accessed within classes. Since methods are class attributes, we can use __get__() to generate a "bound method."
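Before turning to methods, it may help to see what __get__() actually receives. The sketch below uses two hypothetical classes, Reporter and Host, that exist only for this illustration: instance is None when the attribute is read from the class, and owner is the class in both cases.

```python
class Reporter:
    """Hypothetical descriptor that reports how it was accessed."""

    def __get__(self, instance, owner):
        if instance is None:
            return f"accessed on class {owner.__name__}"
        return f"accessed on an instance of {owner.__name__}"


class Host:
    attr = Reporter()


print(Host.attr)    # accessed on class Host
print(Host().attr)  # accessed on an instance of Host
```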
Let's create a descriptor to illustrate this concept. Below is a simplified implementation of a function object:
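import types

class Function:
    def __get__(self, instance, owner):
        if instance is None:
            return self  # accessed on the class: return the descriptor itself
        return types.MethodType(self, instance)  # accessed on an instance: bind to it

    def __call__(self, *args, **kwargs):
        ...  # body omitted; a real function object would run its code here
In this implementation, if instance is None (the attribute was accessed on the class), we return self. Otherwise, we return types.MethodType(self, instance), which creates a "bound method" manually. The __call__ method is included because types.MethodType only accepts a callable as its first argument.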
With this, we can bind a method to a class as follows:
class MyClass:
do_stuff = Function()
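When we access instance.do_stuff, Python does not find do_stuff in the instance's own attribute dictionary (__dict__), so it falls back to the class dictionary, where the attribute lives. Because the object found there has a __get__ method, Python invokes it, and the result is a bound method that will receive the instance as its first argument (self).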
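Ordinary functions follow the same protocol, so the binding step can be reproduced by hand. The following is a sketch under that assumption; the names func and manual are illustrative:

```python
class MyClass:
    def do_stuff(self):
        return "Doing stuff"


instance = MyClass()

# Fetch the raw function without triggering the descriptor protocol.
func = vars(MyClass)["do_stuff"]

# Invoke __get__ by hand, as attribute access on the instance would.
manual = func.__get__(instance, MyClass)

print(manual == instance.do_stuff)  # True
print(manual.__self__ is instance)  # True
print(manual())                     # Doing stuff
```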
For further exploration, you can implement static methods and class methods in the same way; Python's Descriptor HowTo Guide includes pure-Python equivalents of both.
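As a rough idea of what such implementations look like, here is a simplified sketch. The class names StaticMethod and ClassMethod are hypothetical stand-ins for the built-in staticmethod and classmethod, and they omit details that the real built-ins handle:

```python
import types


class StaticMethod:
    """Sketch of staticmethod: __get__ returns the function unbound."""

    def __init__(self, func):
        self.func = func

    def __get__(self, instance, owner):
        return self.func


class ClassMethod:
    """Sketch of classmethod: __get__ binds the function to the class."""

    def __init__(self, func):
        self.func = func

    def __get__(self, instance, owner):
        return types.MethodType(self.func, owner)


class MyClass:
    @StaticMethod
    def add(a, b):
        return a + b

    @ClassMethod
    def name(cls):
        return cls.__name__


print(MyClass.add(1, 2))    # 3
print(MyClass().add(1, 2))  # 3
print(MyClass.name())       # MyClass
print(MyClass().name())     # MyClass
```

The only difference from a regular method is what __get__ chooses to bind: nothing for a static method, the class for a class method, and the instance for an ordinary method.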
Why Is "self" Necessary?
While we understand the mechanics, a philosophical question remains: Why must "self" be present in method definitions?
The explicit inclusion of "self" is a design choice that prioritizes simplicity, aligning with Python's "worse is better" philosophy. This principle emphasizes the need for straightforward design in both implementation and interface, where simplicity in implementation takes precedence.
The rationale for retaining "self" is detailed in Guido van Rossum's blog post, where he addressed proposals for its removal.
Closing Thoughts
While Python simplifies many complexities, diving into its foundational aspects can yield valuable insights, particularly in troubleshooting and debugging. Understanding descriptors can also be beneficial, as they have practical applications, especially with custom validators in frameworks like SQLAlchemy.
Want to Connect?
This article was originally posted at martinheinz.dev.