Python Dataclasses

Data classes are awesome. I first encountered them in Kotlin, but gradually the major languages have begun implementing them. In C# 9 and Java they are known as records (a preview feature in Java 14 and 15, finalized in Java 16), although thanks to the excellent Lombok you’ve been able to achieve a similar effect in Java for some time now.

Data classes are a succinct way to define value classes, relying upon the compiler to generate the boilerplate code. Ideally your data classes are immutable because their state determines their identity. However, immutability is not enforced, which wisely leaves the developer some flexibility.

This post is a summary of mCoding’s video on YouTube; I highly recommend you take a look here.

In his video, he provides an example class definition:

class Comment:
    def __init__(self, id: int, text: str):
        self.__id: int = id
        self.__text: str = text

    @property
    def id(self):
        return self.__id

    @property
    def text(self):
        return self.__text

    def __repr__(self):
        return f'id={self.id}, text={self.text}'

    def __eq__(self, other):
        if other.__class__ is self.__class__:
            return (self.id, self.text) == (other.id, other.text)
        else:
            return NotImplemented

    def __ne__(self, other):
        result = self.__eq__(other)
        if result is NotImplemented:
            return NotImplemented
        else:
            return not result

    def __hash__(self):
        return hash((self.__class__, self.id, self.text))

    def __lt__(self, other):
        if other.__class__ is self.__class__:
            return (self.id, self.text) < (other.id, other.text)
        else:
            return NotImplemented

    def __le__(self, other):
        if other.__class__ is self.__class__:
            return (self.id, self.text) <= (other.id, other.text)
        else:
            return NotImplemented

    def __gt__(self, other):
        if other.__class__ is self.__class__:
            return (self.id, self.text) > (other.id, other.text)
        else:
            return NotImplemented

    def __ge__(self, other):
        if other.__class__ is self.__class__:
            return (self.id, self.text) >= (other.id, other.text)
        else:
            return NotImplemented


def main():
    comment = Comment(1, "I just subscribed")
    print(comment)


if __name__ == '__main__':
    main()

# outputs:

id=1, text=I just subscribed

That’s a lot of boilerplate code. Imagine if we needed to add a new property, author: we would have to rework nearly every method. Here’s the same class implemented as a dataclass:

from dataclasses import dataclass

@dataclass(frozen=True, order=True)
class Comment:
    id: int
    text: str

Let’s add an extra field, a list of replies, and run our example:

from dataclasses import dataclass, field


@dataclass(frozen=True, order=True)
class Comment:
    id: int
    text: str
    replies: list = field(default_factory=list, compare=False, hash=False, repr=False)


def main():
    comment = Comment(1, "I just subscribed")
    print(comment)


if __name__ == '__main__':
    main()

# outputs:

Comment(id=1, text='I just subscribed')

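In the @dataclass decorator, frozen=True makes instances immutable: assigning to a field after construction raises FrozenInstanceError. order=True instructs the decorator to generate the ordering methods (__lt__, __le__, __gt__ and __ge__) so that instances can be sorted; the equality method __eq__ is generated by default. For the new field replies, we declare that it does not participate in comparison operations, hashing, or the string representation, and default_factory=list gives each instance its own empty list.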
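
A short sketch of how these options behave at runtime. The reply text and the helper variable names are assumptions of mine, added purely for illustration.

```python
from dataclasses import dataclass, field, FrozenInstanceError


@dataclass(frozen=True, order=True)
class Comment:
    id: int
    text: str
    replies: list = field(default_factory=list, compare=False, hash=False, repr=False)


comment = Comment(1, "I just subscribed")

# frozen=True: assigning to a field raises FrozenInstanceError.
try:
    comment.text = "edited"
    assignment_blocked = False
except FrozenInstanceError:
    assignment_blocked = True

# Freezing is shallow: the list object held by the field can still be mutated.
comment.replies.append("Welcome!")

# compare=False: replies are ignored by == and by the ordering methods.
other = Comment(1, "I just subscribed")
still_equal = comment == other

# default_factory=list: each instance gets its own list object.
separate_lists = comment.replies is not other.replies

print(assignment_blocked, still_equal, separate_lists, comment)
# prints: True True True Comment(id=1, text='I just subscribed')
```

Note that frozen=True only protects the attribute bindings. If you want the replies themselves to be immutable, a tuple would be a safer choice than a list.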

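Using the standard library’s inspect module, with pprint to format the result, we can see the methods which have been automatically generated for us:

import inspect
from pprint import pprint

pprint(inspect.getmembers(Comment, inspect.isfunction))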

# outputs:

[('__delattr__',
  <function __create_fn__.<locals>.__delattr__ at 0x7ffb961a5e50>),
 ('__eq__', <function __create_fn__.<locals>.__eq__ at 0x7ffb961a5af0>),
 ('__ge__', <function __create_fn__.<locals>.__ge__ at 0x7ffb961a5d30>),
 ('__gt__', <function __create_fn__.<locals>.__gt__ at 0x7ffb961a5ca0>),
 ('__hash__', <function __create_fn__.<locals>.__hash__ at 0x7ffb961a5ee0>),
 ('__init__', <function __create_fn__.<locals>.__init__ at 0x7ffb961a59d0>),
 ('__le__', <function __create_fn__.<locals>.__le__ at 0x7ffb961a5c10>),
 ('__lt__', <function __create_fn__.<locals>.__lt__ at 0x7ffb961a5b80>),
 ('__repr__', <function __create_fn__.<locals>.__repr__ at 0x7ffb961a5940>),
 ('__setattr__',
  <function __create_fn__.<locals>.__setattr__ at 0x7ffb961a5dc0>)]
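
The generated methods are only half of the picture. As a complementary sketch (not something covered in the summary above), the dataclasses.fields() function lists the per-field settings that the decorator recorded, which is a quick way to confirm that replies really is excluded from comparison, hashing and the repr.

```python
from dataclasses import dataclass, field, fields


@dataclass(frozen=True, order=True)
class Comment:
    id: int
    text: str
    replies: list = field(default_factory=list, compare=False, hash=False, repr=False)


# Map each field name to its (compare, hash, repr) settings.
flags = {f.name: (f.compare, f.hash, f.repr) for f in fields(Comment)}

for name, (compare, hash_, repr_) in flags.items():
    print(f"{name}: compare={compare} hash={hash_} repr={repr_}")

# prints:
# id: compare=True hash=None repr=True
# text: compare=True hash=None repr=True
# replies: compare=False hash=False repr=False
```

A hash setting of None is the default and means "follow the compare setting", so id and text are hashed while replies is not.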

Python’s implementation of data classes is neat indeed. For more information, head over to YouTube and watch the video. If you are learning Python, I’d recommend subscribing; mCoding’s videos are very informative.
