This problem stems down to the way floating point numbers are stored in a computer.
In short: They are stored as a limited amount of significant figures, a base, and an exponent. The computer then knows to scale the significant figures by the base raised to the exponent in order to obtain the value. The amount of data is machine specific: for a 32-bit machine, 23-bits are used for the significant figure, 8 for the exponent, and 1 for the base, a 64-bit machine will have 53 bits for the sig figs, 8 for the exponent, and 1 for the base.
Addition and Subtraction are then done by adding/subtracting the difference between the significant digits and the exponent.
You are generating very large integers for hash(k) and trying to take the difference between the rounded down int(y*T) and the floating point y*T. When the Python interpreter tries to take the difference between a float and an int, it converts the int to a floating point y*T will store a certain number of significant figures. The problem arises when you try to get a low order of magnitude difference from two high order magnitude numbers, or generally anytime the order of magnitude for the difference differs greatly from the numbers involved. The low order significant digits will then be lost in calculation.
Here is the a version I edited to test out your method. The added argument c is a constant which I suspect will help normalize your results.
import math
def h(k,M,c):
y = hash(k)
print "hash = ", y
T = (math.sqrt(5)-1)/(2*c)
print "y*T = ", y*T
print "int(y*T) = ", int(y*T)
print "(y*T) - int(y * T) = ",(y*T) - int(y * T)
print "M*((y*T) - int(y * T)) = ", M*((y*T) - int(y * T))
return(int(M*((y*T) - int(y * T))))
print(h('test',2,c))
As you increase c, essentially making the difference in the two numbers occur at a closer and closer order of magnitude, you begin to see that the value for (y*T) - int(y * T) moves away from 0. Sample output is below:
>>>h('test',2,10)
hash = 2314058222102390712
y*T = 1.43016663321e+17
int(y*T) = 143016663320543088
(y*T) - int(y * T) = 0.0
M*((y*T) - int(y * T)) = 0.0
h(test,2,10) = 0
>>>h('test',2,1000)
hash = 2314058222102390712
y*T = 1.43016663321e+15
int(y*T) = 1430166633205430
(y*T) - int(y * T) = 0.75
M*((y*T) - int(y * T)) = 1.5
h(test,2,1000) = 1
>>>h('test',2,10000000)
hash = 2314058222102390712
y*T = 1.43016663321e+11
int(y*T) = 143016663320
(y*T) - int(y * T) = 0.543090820312
M*((y*T) - int(y * T)) = 1.08618164062
h(test,2,10000000) = 1
>>>h('test',2,10000000000000)
hash = 2314058222102390712
y*T = 143016.663321
int(y*T) = 143016
(y*T) - int(y * T) = 0.66332054307
M*((y*T) - int(y * T)) = 1.32664108614
h(test,2,10000000000000) = 1
As an added example of the phenomenon I am talking about:
y = hash('test')
print y
y = float(y)
print y
y = int(y)
print y
outputs:
2314058222102390712
2.3140582221e+18
2314058222102390784
Just by simply switching to a float and back to an int the last two digits are no longer reliable, so it can be seen that anything below this will also be lost.
Mis undefined:afterh(k)