27

I'm teaching myself Python and have run into a problem that requires downsampling a feature vector. I need some help understanding how to downsample an array. Each row of the array represents an image as numbers from 0 to 255. How do you apply downsampling to the array? I don't want to use scikit-learn because I want to understand how downsampling is applied. If you could also explain downsampling, that would be amazing. Thanks.

The feature vector is 400x250.

3 Answers

43

If by downsampling you mean something like this, you can simply slice the array. For a 1D example:

import numpy as np
a = np.arange(1,11,1)
print(a)
print(a[::3])

The last line is equivalent to:

print(a[0:a.size:3])

with the slicing notation as start:stop:step

Result:

[ 1  2  3  4  5  6  7  8  9 10]

[ 1  4  7 10]

For a 2D array the idea is the same:

b = np.arange(0,100)
c = b.reshape([10,10])
print(c[::3,::3])

This gives you, in both dimensions, every third item from the original array.
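To make it concrete which entries survive, here is the same 2D example with its output written out. It is only an illustration of the slice above; the name `small` is mine. With a step of 3, an axis of length 10 keeps indices 0, 3, 6 and 9, so the result has 4 rows and 4 columns.

```python
import numpy as np

c = np.arange(100).reshape(10, 10)

# Keep every third row and every third column (indices 0, 3, 6, 9).
small = c[::3, ::3]

print(small.shape)  # (4, 4)
print(small)
# [[ 0  3  6  9]
#  [30 33 36 39]
#  [60 63 66 69]
#  [90 93 96 99]]
```

Note that this kind of slicing simply discards the values in between; nothing is averaged.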

Or, if you only want to down sample a single dimension:

d = np.zeros((400,250))
print(d.shape)
e = d[::10,:]
print(e.shape) 

(400, 250)

(40, 250)

There are lots of other examples in the NumPy manual.


5 Comments

But how do you do this for a 2D array?
I updated the answer. But I'm not sure to what size you want to downsample your original 400x250 array?
Saying "this doesn't work" isn't very helpful. What doesn't work? Or even better: could you provide a simple example of exactly how the down sampling should work (e.g., from a 2D array [[0,1,..,9],[10,11,..,19], the down sampled array should contain elements [[1,3,..],[11,13,..]])? Which items should be kept? Or you mention that you don't want to use scikit-learn, but which routine should it reproduce?
I don't think this can answer what OP was trying to ask. What he meant is down-sampling in Machine Learning.
@Phan Nhat Huy, yes, now (5-6 years later) I would also interpret OP's question differently. Feel free to write an answer more suitable for ML.
0

import numpy as np
from skimage.measure import block_reduce
# Reduce each (m, n) block to a single value; func can be np.mean, np.max, etc.
new_matrix = block_reduce(Matrix_for_downsample, block_size=(m, n), func=np.mean)
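Since the question asks how this works without a library helper, the same kind of block-wise reduction can be sketched with plain NumPy. This is a minimal sketch, not how skimage implements it: `block_mean` is a hypothetical helper name, it only handles 2D arrays, and it assumes the array shape is an exact multiple of the block size.

```python
import numpy as np

def block_mean(arr, m, n):
    """Average non-overlapping m x n blocks of a 2D array (hypothetical helper)."""
    h, w = arr.shape
    if h % m or w % n:
        raise ValueError("array shape must be divisible by the block size")
    # Split the rows into (h//m, m) and the columns into (w//n, n),
    # then average over the two "within-block" axes.
    return arr.reshape(h // m, m, w // n, n).mean(axis=(1, 3))

# A stand-in for the 400x250 feature array from the question.
features = np.arange(400 * 250, dtype=float).reshape(400, 250)
reduced = block_mean(features, 4, 5)
print(reduced.shape)  # (100, 50)
```

Swapping `.mean` for `.max` gives a max-pooling style reduction over the same blocks.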

2 Comments

As it’s currently written, your answer is unclear. Please edit to add additional details that will help others understand how this addresses the question asked. You can find more information on how to write good answers in the help center.
The question explicitly mentions that OP does not want to use scikit-learn.
-1

If you want to downsample along certain dimensions, you can use mean, which averages neighboring values instead of just decimating (discarding samples). The example below downsamples an ndarray of size (h,w,3) along axes 0 and 1, but not along axis 2 (the channels):

import numpy as np

def downsample_2x(array3d):
    """
        Downsamples an ndarray of size `(h,w,3)` by a factor of 2
        along axes 0,1 (along h,w) by averaging 2x2 blocks.
        Input can be non-float, e.g. uint8
    """
    dtype1 = array3d.dtype
    a = array3d.astype(float)
    (h,w,_) = a.shape
    assert w % 2 == 0
    assert h % 2 == 0
    w2 = w // 2
    h2 = h // 2
    # average each pair of horizontally adjacent pixels
    a = a.reshape((h,w2,2,3))
    a = np.mean(a, axis=2)
    assert a.shape == (h,w2,3)
    # average each pair of vertically adjacent rows
    a = a.reshape((h2,2,w2,3))
    a = np.mean(a, axis=1)
    assert a.shape == (h2,w2,3)
    # round down and restore the original dtype
    a = np.floor(a).astype(dtype1)
    return a

This gives an array of size (h/2,w/2,3). If w and h are not even numbers, it will be slightly more complicated. This is not the most efficient way to do it, but the steps and ideas should be clear.
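One simple way to deal with odd sizes is to drop the last row and/or column before downsampling. This is just one option (padding would be another), and `crop_to_even` is a hypothetical helper name, not part of NumPy:

```python
import numpy as np

def crop_to_even(array3d):
    """Drop the last row and/or column so that h and w are both even (hypothetical helper)."""
    h, w = array3d.shape[:2]
    # h % 2 is 1 for odd h and 0 for even h, so even sizes are left untouched.
    return array3d[: h - h % 2, : w - w % 2]

img = np.zeros((5, 7, 3), dtype=np.uint8)
cropped = crop_to_even(img)
print(cropped.shape)  # (4, 6, 3)
```

The cropped array then satisfies the two `assert` checks in `downsample_2x`.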
