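Here is an example of my NumPy array scores and the number m of nearest neighbors to use:
import numpy as np
scores = np.random.normal(-0.2, 0.01, 1000)
m = int(np.sqrt(scores.shape[0]) + 0.5)  # np.int was removed in NumPy 1.24; the built-in int does the same job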
I want to compare the ith value in scores with its m nearest neighbors (index-wise). The comparison should be done by something similar to
x[i] = (scores[i]-np.mean(scores[m])) / np.sum(scores[m])
where np.mean(scores[m]) and np.sum(scores[m]) stand for the mean and sum of the m nearest neighbors of scores[i]. If it can also handle the first and last m indices, that's a bonus. With x as a NumPy array, I should be able to use something similar to
scores[x > threshold]
to get all scores whose x value exceeds a certain threshold. The idea is to call scores[i] an outlier if x[i] exceeds this threshold.
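
To make the intent concrete, here is a minimal sketch of the comparison described above. It rests on several assumptions that are not settled in the question: "m nearest neighbors" is read as roughly m // 2 indices on each side of i, the point itself is excluded from the neighbor mean and sum, and near the first and last indices the window is simply clipped, so those points have fewer neighbors. The function name neighbor_scores and the threshold value are hypothetical.

```python
import numpy as np

def neighbor_scores(scores, m):
    """Return (scores[i] - mean of neighbors) / (sum of neighbors) for every i."""
    scores = np.asarray(scores, dtype=float)
    n = scores.shape[0]
    half = m // 2                        # assumption: m // 2 neighbors on each side
    idx = np.arange(n)
    lo = np.maximum(idx - half, 0)       # window start (inclusive), clipped at 0
    hi = np.minimum(idx + half + 1, n)   # window end (exclusive), clipped at n
    # Prefix sums give every window sum in O(n) without a Python loop.
    csum = np.concatenate(([0.0], np.cumsum(scores)))
    nb_sum = csum[hi] - csum[lo] - scores   # subtract the point itself
    nb_count = hi - lo - 1                  # number of neighbors actually used
    nb_mean = nb_sum / nb_count
    return (scores - nb_mean) / nb_sum

rng = np.random.default_rng(0)
scores = rng.normal(-0.2, 0.01, 1000)
m = int(np.sqrt(scores.shape[0]) + 0.5)

x = neighbor_scores(scores, m)
threshold = 0.005                           # hypothetical value, for illustration only
outliers = scores[np.abs(x) > threshold]
```

Because the scores here are centered on -0.2, the neighbor sum is negative and flips the sign of x, which is why the sketch compares np.abs(x) against the threshold. Whether that is the right choice depends on whether only one direction should count as an outlier.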