Is it a bad idea to implement __hash__ like so?
class XYZ:
    def __init__(self):
        self.val = None

    def __hash__(self):
        return id(self)
Am I setting up something potentially disastrous?
The __hash__ method has to satisfy the following requirement in order to work:
For all x, y such that x == y, hash(x) == hash(y).
In your case the class does not implement __eq__, which means that x == y if and only if id(x) == id(y), and thus your hash implementation satisfies the above property.
Note, however, that if you do implement __eq__ then this implementation will likely violate it, because two distinct objects that compare equal will still have different ids.
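To see how that failure shows up, here is a minimal sketch. The class name Broken and its val field are made up for illustration; the only point is that once __eq__ compares by value, an id-based __hash__ gives equal objects different hashes, so sets and dicts stop treating them as the same key.

```python
class Broken:
    """Hypothetical class: value-based __eq__, identity-based __hash__."""

    def __init__(self, val):
        self.val = val

    def __eq__(self, other):
        if not isinstance(other, Broken):
            return NotImplemented
        return self.val == other.val

    def __hash__(self):
        # Inconsistent with __eq__: equal objects get different hashes.
        return id(self)


a = Broken(1)
b = Broken(1)

equal = (a == b)                    # the two objects compare equal
same_hash = (hash(a) == hash(b))    # but their hashes differ
found = b in {a}                    # so the set lookup misses
print(equal, same_hash, found)
```

Here a == b holds, yet b is not found in a set that contains a, which is exactly the kind of surprise the requirement is meant to rule out.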
Also: there is a difference between having a "valid" __hash__ and having a good hash. For example the following is a valid __hash__ definition for any class:
def __hash__(self):
    return 1
A good hash should distribute objects uniformly, so as to avoid collisions as much as possible. Usually this requires a more complex definition.
I'd avoid trying to come up with formulas and instead rely on Python's built-in hash function.
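A rough way to see the cost of collisions is to count how often __eq__ gets called while filling a set. This is only a sketch: the classes Base, ConstantHash and SpreadHash and the eq_calls counter are hypothetical names, and the exact counts depend on the interpreter, so none are quoted here.

```python
class Base:
    """Hypothetical base class that counts __eq__ calls per subclass."""
    eq_calls = 0

    def __init__(self, val):
        self.val = val

    def __eq__(self, other):
        type(self).eq_calls += 1
        return type(other) is type(self) and self.val == other.val


class ConstantHash(Base):
    def __hash__(self):
        return 1  # valid, but every object collides with every other


class SpreadHash(Base):
    def __hash__(self):
        return hash(self.val)  # distinct values get distinct hashes here


n = 200
constant_set = {ConstantHash(i) for i in range(n)}
spread_set = {SpreadHash(i) for i in range(n)}

print("constant hash __eq__ calls:", ConstantHash.eq_calls)
print("spread hash __eq__ calls:  ", SpreadHash.eq_calls)
```

Both sets end up with the right contents, but the constant hash forces the set to fall back on __eq__ to tell the objects apart, which is where the inefficiency comes from.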
For example if your class has fields a, b and c then I'd use something like this as __hash__:
def __hash__(self):
    return hash((self.a, self.b, self.c))
The definition of hash for tuples should be good enough for the average case.
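Putting the tuple approach together with a matching __eq__ might look like the sketch below. The class name Triple and the _key helper are hypothetical; the fields a, b and c are the ones from the example above.

```python
class Triple:
    """Hypothetical class with fields a, b and c, treated as immutable."""

    def __init__(self, a, b, c):
        self.a, self.b, self.c = a, b, c

    def _key(self):
        # Single source of truth for both equality and hashing.
        return (self.a, self.b, self.c)

    def __eq__(self, other):
        if not isinstance(other, Triple):
            return NotImplemented
        return self._key() == other._key()

    def __hash__(self):
        return hash(self._key())


p = Triple(1, 2, 3)
q = Triple(1, 2, 3)
lookup = {p: "first"}

# q is a different object but equal to p, so it finds the same entry.
print(lookup[q])
```

Routing both methods through one helper is just one way to keep __eq__ and __hash__ from drifting apart when fields are added later.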
Finally: you should not define __hash__ in classes that are mutable (in the fields used for equality). That's because modifying an instance changes its hash, so an instance that is already stored in a set or used as a dict key can no longer be found there.
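A small sketch of what goes wrong, assuming CPython's set implementation and a hypothetical MutablePoint class: the object is stored under its old hash, so after mutation the lookup compares against the wrong hash.

```python
class MutablePoint:
    """Hypothetical mutable class that (unwisely) defines __hash__."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        if not isinstance(other, MutablePoint):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        return hash((self.x, self.y))


p = MutablePoint(1, 2)
seen = {p}

p.x = 99  # the hash changes while p is still stored in the set

still_found = p in seen          # lookup uses the new hash and misses
still_stored = p in list(seen)   # yet the object is still in there
print(still_found, still_stored)
```

The set still holds p, but membership tests no longer find it, and the stale entry cannot be removed with seen.remove(p) either.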
This __hash__ satisfies the properties required of it (as long as __eq__ isn't defined), but defining it offers no benefit over just inheriting object.__hash__.
It's either pointless or wrong, depending on the rest of the class.
If your objects use the default identity-based ==, then defining this __hash__ is pointless. The default __hash__ is also identity-based, but faster, and tweaked to avoid always having the low bits set to 0. Using the default __hash__ would be simpler and more efficient.
If your objects don't use the default identity-based ==, then your __hash__ is wrong, because it's going to be inconsistent with ==. If your objects are immutable, you should implement __hash__ in a way that is consistent with ==; if your objects are mutable, you should not implement __hash__ at all (and set __hash__ = None if you need to support Python 2).
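For the first case, a quick sketch of what you get for free. The class name Plain is hypothetical; it defines neither __eq__ nor __hash__, so both come from object.

```python
class Plain:
    """Hypothetical class that defines neither __eq__ nor __hash__."""

    def __init__(self, val):
        self.val = val


a = Plain(1)
b = Plain(1)

# Equality is identity-based: same val, but different objects.
print(a == b, a == a)

# The inherited hash is object.__hash__, so instances work in sets as-is.
print(Plain.__hash__ is object.__hash__)
print(len({a, b}))
```

Since the inherited behaviour is already consistent, adding an id-based __hash__ on top of it changes nothing useful.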
This is essentially what the default implementation of __hash__ already does. Be aware that implementing __eq__ causes the default __hash__ implementation to disappear. If you reimplement __hash__, then any objects that compare equal must have equal hashes.
It is okay for non-equal objects to have the same hash value though. Therefore, having a hash implementation that returns a constant value is always safe. However, it is very inefficient.
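The disappearing default __hash__ mentioned above is easy to check in Python 3. This is a sketch with a hypothetical class EqOnly that defines __eq__ and nothing else.

```python
class EqOnly:
    """Hypothetical class that defines __eq__ but not __hash__ (Python 3)."""

    def __init__(self, val):
        self.val = val

    def __eq__(self, other):
        if not isinstance(other, EqOnly):
            return NotImplemented
        return self.val == other.val


# Python 3 sets __hash__ to None when a class defines __eq__ alone.
hash_attr = EqOnly.__hash__

try:
    hash(EqOnly(1))
    hashable = True
except TypeError:
    # Instances can no longer be used in sets or as dict keys.
    hashable = False

print(hash_attr, hashable)
```

So in Python 3 you have to opt back in by defining __hash__ yourself, which is where the consistency rule applies.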
A good default that works for a lot of use cases is to return a hash of the tuple of the attributes that are used in the __eq__ method. For example:
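class XYZ:
    def __init__(self, val0, val1):
        self.val0 = val0
        self.val1 = val1

    def __eq__(self, other):
        # Compare each attribute with the matching attribute on the other object.
        if not isinstance(other, XYZ):
            return NotImplemented
        return self.val0 == other.val0 and self.val1 == other.val1

    def __hash__(self):
        # Hash the same attributes that __eq__ compares.
        return hash((self.val0, self.val1))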
__eq__()? Most definitely a bad idea.
__eq__ is identity, so why would it definitely be a bad idea?