6

Is it a bad idea to implement __hash__ like so?

class XYZ:
    def __init__(self):
        self.val = None

    def __hash__(self):
        return id(self)

Am I setting up something potentially disastrous?

2
  • You mean, without implementing __eq__()? Most definitely a bad idea. Commented Mar 10, 2019 at 8:56
  • 1
    @JeffMercado the default __eq__ is identity, so why would it definitely be a bad idea? Commented Mar 10, 2019 at 9:00

3 Answers

6

The __hash__ method has to satisfy the following requirement in order to work:

For all x, y such that x == y, hash(x) == hash(y).

In your case the class does not implement __eq__, which means that x == y if and only if id(x) == id(y), and thus your hash implementation satisfies the above property.

Note, however, that if you do implement __eq__, this __hash__ will most likely violate that requirement: two distinct objects that compare equal will still have different ids.
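
To make the failure concrete, here is a minimal sketch. It assumes a variant of the question's class that takes val as a constructor argument and compares by value; that __eq__ is my addition, not part of the question.

```python
class XYZ:
    def __init__(self, val):
        self.val = val

    def __eq__(self, other):
        # Assumed value-based equality (not in the original question).
        if not isinstance(other, XYZ):
            return NotImplemented
        return self.val == other.val

    def __hash__(self):
        # Identity-based hash, exactly as in the question.
        return id(self)


x, y = XYZ(1), XYZ(1)
equal = x == y                   # the two objects compare equal
same_hash = hash(x) == hash(y)   # but their hashes differ
found = y in {x}                 # so the set cannot find the "equal" object
```

Here equal is True while same_hash and found are both False, which is exactly the inconsistency the requirement above rules out.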

Also: there is a difference between having a "valid" __hash__ and having a good hash. For example the following is a valid __hash__ definition for any class:

def __hash__(self):
    return 1

A good hash should distribute objects as uniformly as possible to minimize collisions. This usually requires a more complex definition. I'd avoid coming up with formulas by hand and instead rely on Python's built-in hash function.
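
A rough way to see the cost of a valid-but-poor hash is to count how often __eq__ gets called while filling a set. The class names, the eq_calls counter and the helper function below are all hypothetical, made up for this illustration, and the exact counts depend on the interpreter.

```python
class ConstantHashPoint:
    eq_calls = 0  # illustrative counter, one per class

    def __init__(self, a, b):
        self.a = a
        self.b = b

    def __eq__(self, other):
        type(self).eq_calls += 1
        if not isinstance(other, ConstantHashPoint):
            return NotImplemented
        return (self.a, self.b) == (other.a, other.b)

    def __hash__(self):
        # Valid, but every instance collides with every other.
        return 1


class TupleHashPoint(ConstantHashPoint):
    eq_calls = 0

    def __hash__(self):
        # Delegates to the built-in tuple hash.
        return hash((self.a, self.b))


def eq_calls_while_filling_set(cls, n):
    """Build a set of n distinct instances and report how many __eq__ calls it took."""
    cls.eq_calls = 0
    items = {cls(i, i) for i in range(n)}
    assert len(items) == n
    return cls.eq_calls


constant_cost = eq_calls_while_filling_set(ConstantHashPoint, 50)
tuple_cost = eq_calls_while_filling_set(TupleHashPoint, 50)
```

With the constant hash, each insertion has to be compared against the objects already in the set, so constant_cost grows much faster than tuple_cost as n increases.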

For example if your class has fields a, b and c then I'd use something like this as __hash__:

def __hash__(self):
    return hash((self.a, self.b, self.c))

The definition of hash for tuples should be good enough for the average case.

Finally: you should not define __hash__ in classes that are mutable (in the fields used for equality). Modifying an instance changes its hash, and any set or dict that already contains it will no longer be able to find it.
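
As a small sketch of what "break things" can look like, consider a hypothetical mutable class Box (not from the question) whose hash depends on a field that is later changed:

```python
class Box:
    def __init__(self, val):
        self.val = val

    def __eq__(self, other):
        if not isinstance(other, Box):
            return NotImplemented
        return self.val == other.val

    def __hash__(self):
        # Depends on a mutable field: this is the problem.
        return hash(self.val)


box = Box(1)
seen = {box}

box.val = 2  # mutate after insertion; the set still stores the old hash

found_after_mutation = box in seen              # lookup uses the new hash
still_stored = any(b is box for b in seen)      # yet the object is in there
```

After the mutation, found_after_mutation is False even though still_stored is True: the object is physically in the set but can no longer be found by lookup.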


1 Comment

While you're correct in saying that this __hash__ satisfies the properties required of it (as long as __eq__ isn't defined), defining this __hash__ offers no benefit over just inheriting object.__hash__.
3

It's either pointless or wrong, depending on the rest of the class.

If your objects use the default identity-based ==, then defining this __hash__ is pointless. The default __hash__ is also identity-based, but faster, and tweaked to avoid always having the low bits set to 0. Using the default __hash__ would be simpler and more efficient.
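
For comparison, a sketch of the question's class with __hash__ simply left out. The registry dict is just an illustration of mine.

```python
class XYZ:
    def __init__(self):
        self.val = None
    # No __eq__ and no __hash__: both are inherited from object
    # and both are identity-based.


a, b = XYZ(), XYZ()
registry = {a: "first", b: "second"}

distinct = a != b                      # different objects are never equal
lookup = registry[a]                   # each object is its own key
uses_default = XYZ.__hash__ is object.__hash__
```

The instances work as dict keys and set members out of the box, with each object acting as its own distinct key.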

If your objects don't use the default identity-based ==, then your __hash__ is wrong, because it's going to be inconsistent with ==. If your objects are immutable, you should implement __hash__ in a way that is consistent with ==; if your objects are mutable, you should not implement __hash__ at all (and set __hash__ = None if you need to support Python 2).
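
In Python 3 the "do not implement __hash__ at all" route needs no extra work. A minimal sketch, using a hypothetical MutablePoint class:

```python
class MutablePoint:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        if not isinstance(other, MutablePoint):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)
    # No __hash__ defined: because __eq__ is defined in this class,
    # Python 3 sets __hash__ to None for it.


p = MutablePoint(1, 2)
try:
    hash(p)
    hashable = True
except TypeError:
    hashable = False
```

Here hashable ends up False, so instances are rejected as dict keys or set members up front instead of silently misbehaving later.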


1

This is essentially what the default implementation of __hash__ does: it is also identity-based. Be aware that implementing __eq__ causes the default __hash__ implementation to disappear. If you reimplement __hash__, any objects that compare equal must have equal hashes.

It is okay for non-equal objects to have the same hash value, though. Therefore, a hash implementation that returns a constant value is always safe. However, it is very inefficient, because every object then collides with every other in a set or dict.

A good default that works for a lot of use cases is to return the hash of a tuple of the attributes used in the __eq__ method, e.g.:

class XYZ:
    def __init__(self, val0, val1):
        self.val0 = val0
        self.val1 = val1

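    def __eq__(self, other):
        # Compare field by field; defer to the other operand for foreign types.
        if not isinstance(other, XYZ):
            return NotImplemented
        return self.val0 == other.val0 and self.val1 == other.val1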

    def __hash__(self):
        return hash((self.val0, self.val1))
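
If writing the pair by hand feels error-prone, one possible alternative (my suggestion, not part of the answer above) is a frozen dataclass from the standard library, which generates __eq__ and __hash__ from the same fields. The name FrozenXYZ and the int annotations are assumptions for illustration.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class FrozenXYZ:
    # eq=True is the default; combined with frozen=True, a field-based
    # __hash__ is generated as well.
    val0: int
    val1: int


a = FrozenXYZ(1, 2)
b = FrozenXYZ(1, 2)
consistent = a == b and hash(a) == hash(b)
deduplicated = len({a, b})

try:
    a.val0 = 99          # frozen instances reject attribute assignment
    mutated = True
except FrozenInstanceError:
    mutated = False
```

Equal instances hash equally and collapse to a single set entry, and because the fields cannot be reassigned, the hash cannot drift after insertion.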

