100

What is the fastest way to copy data from array b to array a without changing the address of array a? I need this because an external library (PyFFTW) holds a pointer to my array, and that pointer cannot change.

For example:

a = numpy.empty(n, dtype=complex)
for i in xrange(a.size):
  a[i] = b[i]

Is it possible to do this without a loop?

8 Answers 8

96

I believe

a = numpy.empty_like(b)
a[:] = b

will copy the values quickly. As Funsi mentions, recent versions of numpy also have the copyto function.
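As a minimal sketch of why this satisfies the "address must not change" requirement, the snippet below compares the data pointer of a before and after each copy. It assumes the pointer held by the external library is the one reported by a.ctypes.data; the variable names are illustrative only.

```python
import numpy as np

n = 8
b = np.arange(n, dtype=complex)
a = np.empty(n, dtype=complex)      # destination, allocated once

addr_before = a.ctypes.data         # address of a's data buffer
a[:] = b                            # copy values into the existing buffer
addr_after_slice = a.ctypes.data

np.copyto(a, 2 * b)                 # also writes in place (NumPy >= 1.7)
addr_after_copyto = a.ctypes.data

print(addr_before == addr_after_slice == addr_after_copyto)
```

By contrast, a plain a = b would rebind the name a to a different object and leave the original buffer untouched.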


15 Comments

+1. But wouldn't numpy.empty be substantially faster than numpy.zeros?
@M.ElSaka a = b merely creates a new reference to b. a[:] = b means "set all elements of a equal to those of b". The difference is important because numpy arrays are mutable types.
@mg007 I ran some tests, which showed empty() is about 10% faster than zeros(). Surprisingly empty_like() is even faster. copyto(a,b) is faster than the array syntax a[:] = b. See gist.github.com/bhawkins/5095558
@Brian Hawkins is right. For when to use np.copyto(a, b) and when a = b.astype(b.dtype) for a speed improvement, see the answer below: stackoverflow.com/a/33672015/3703716
@MarcoSulla b=a.copy() creates a new array. We need to modify an existing object, not create a new one. (The OP requires a constant memory address.)
31

NumPy 1.7 and later provide the numpy.copyto function, which does what you are looking for:

numpy.copyto(dst, src)

Copies values from one array to another, broadcasting as necessary.

See: https://docs.scipy.org/doc/numpy/reference/generated/numpy.copyto.html
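A short sketch of what "broadcasting as necessary" means in practice, together with the optional where and casting arguments of numpy.copyto. The array names are made up for illustration.

```python
import numpy as np

dst = np.zeros((2, 3))
row = np.array([1.0, 2.0, 3.0])

np.copyto(dst, row)                  # row is broadcast across both rows of dst

mask = np.array([True, False, True])
np.copyto(dst, -1.0, where=mask)     # only the masked columns are overwritten

ints = np.zeros(3, dtype=np.int64)
try:
    np.copyto(ints, row)             # float64 -> int64 is not 'same_kind'
    cast_rejected = False
except TypeError:
    cast_rejected = True
np.copyto(ints, row, casting='unsafe')   # explicit opt-in to the lossy cast
```

The default casting rule is 'same_kind', so a silent float-to-integer truncation has to be requested explicitly.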

1 Comment

This doesn't work for me. I get AttributeError: 'module' object has no attribute 'copyto'
23
a = numpy.array(b)

is even faster than the suggested solutions up to numpy v1.6 and makes a copy of the array as well. I could however not test it against copyto(a,b), since I don't have the most recent version of numpy.

1 Comment

This is a great way to copy an array, but it creates a new object. The OP needs to know how to quickly assign values to an array that's already been created.
19

To answer your question, I played with some variants and profiled them.

Conclusion: to copy data from a numpy array to another use one of the built-in numpy functions numpy.array(src) or numpy.copyto(dst, src) wherever possible.

Update 2022-05: a re-test with numpy v1.22 and CPython v3.9 showed that src.astype(...) is currently the fastest almost consistently on my system. So it is better to run the provided code snippet yourself to get numbers for your specific setup.

(But always choose numpy.copyto(dst, src) if dst's memory is already allocated, to reuse the memory. See profiling at the end of the post.)

profiling setup

import timeit
import numpy as np
import pandas as pd
from IPython.display import display
    
def profile_this(methods, setup='', niter=10 ** 4, p_globals=None, **kwargs):
    if p_globals is not None:
        print('globals: {0}, tested {1:.0e} times'.format(p_globals, niter))
    timings = np.array([timeit.timeit(method, setup=setup, number=niter,
                                      globals=p_globals, **kwargs) for 
                        method in methods])
    ranking = np.argsort(timings)
    timings = np.array(timings)[ranking]
    methods = np.array(methods)[ranking]
    speedups = np.amax(timings) / timings

    # pd.set_option('html', False)
    data = {'time (s)': timings,
            'speedup': ['{:.2f}x'.format(s) if s != 1 else '' for s in speedups],
            'methods': methods}
    data_frame = pd.DataFrame(data, columns=['time (s)', 'speedup', 'methods'])

    display(data_frame)
    print()

profiling code

setup = '''import numpy as np; x = np.random.random(n)'''
methods = (
    '''y = np.zeros(n, dtype=x.dtype); y[:] = x''',
    '''y = np.zeros_like(x); y[:] = x''',
    '''y = np.empty(n, dtype=x.dtype); y[:] = x''',
    '''y = np.empty_like(x); y[:] = x''',
    '''y = np.array(x)''',
    '''y = np.copy(x)''',
    '''y = x.astype(x.dtype)''',
    '''y = 1*x''',
    '''y = np.empty_like(x); np.copyto(y, x)''',
    '''y = np.empty_like(x); np.copyto(y, x, casting='no')''',
    '''y = np.empty(n)\nfor i in range(x.size):\n\ty[i] = x[i]'''
)

for n, it in ((2, 6), (3, 6), (3.8, 6), (4, 6), (5, 5), (6, 4.5)):
    profile_this(methods[:-1:] if n > 2 else methods, setup, 
                 niter=int(10 ** it), p_globals={'n': int(10 ** n)})

results for Windows 7 on Intel i7 CPU, CPython v3.5.0, numpy v1.10.1.

globals: {'n': 100}, tested 1e+06 times

     time (s) speedup                                            methods
0    0.386908  33.76x                                    y = np.array(x)
1    0.496475  26.31x                              y = x.astype(x.dtype)
2    0.567027  23.03x              y = np.empty_like(x); np.copyto(y, x)
3    0.666129  19.61x                     y = np.empty_like(x); y[:] = x
4    0.967086  13.51x                                            y = 1*x
5    1.067240  12.24x  y = np.empty_like(x); np.copyto(y, x, casting=...
6    1.235198  10.57x                                     y = np.copy(x)
7    1.624535   8.04x           y = np.zeros(n, dtype=x.dtype); y[:] = x
8    1.626120   8.03x           y = np.empty(n, dtype=x.dtype); y[:] = x
9    3.569372   3.66x                     y = np.zeros_like(x); y[:] = x
10  13.061154          y = np.empty(n)\nfor i in range(x.size):\n\ty[...


globals: {'n': 1000}, tested 1e+06 times

   time (s) speedup                                            methods
0  0.666237   6.10x                              y = x.astype(x.dtype)
1  0.740594   5.49x              y = np.empty_like(x); np.copyto(y, x)
2  0.755246   5.39x                                    y = np.array(x)
3  1.043631   3.90x                     y = np.empty_like(x); y[:] = x
4  1.398793   2.91x                                            y = 1*x
5  1.434299   2.84x  y = np.empty_like(x); np.copyto(y, x, casting=...
6  1.544769   2.63x                                     y = np.copy(x)
7  1.873119   2.17x           y = np.empty(n, dtype=x.dtype); y[:] = x
8  2.355593   1.73x           y = np.zeros(n, dtype=x.dtype); y[:] = x
9  4.067133                             y = np.zeros_like(x); y[:] = x


globals: {'n': 6309}, tested 1e+06 times

   time (s) speedup                                            methods
0  2.338428   3.05x                                    y = np.array(x)
1  2.466636   2.89x                              y = x.astype(x.dtype)
2  2.561535   2.78x              y = np.empty_like(x); np.copyto(y, x)
3  2.603601   2.74x                     y = np.empty_like(x); y[:] = x
4  3.005610   2.37x  y = np.empty_like(x); np.copyto(y, x, casting=...
5  3.215863   2.22x                                     y = np.copy(x)
6  3.249763   2.19x                                            y = 1*x
7  3.661599   1.95x           y = np.empty(n, dtype=x.dtype); y[:] = x
8  6.344077   1.12x           y = np.zeros(n, dtype=x.dtype); y[:] = x
9  7.133050                             y = np.zeros_like(x); y[:] = x


globals: {'n': 10000}, tested 1e+06 times

   time (s) speedup                                            methods
0  3.421806   2.82x                                    y = np.array(x)
1  3.569501   2.71x                              y = x.astype(x.dtype)
2  3.618747   2.67x              y = np.empty_like(x); np.copyto(y, x)
3  3.708604   2.61x                     y = np.empty_like(x); y[:] = x
4  4.150505   2.33x  y = np.empty_like(x); np.copyto(y, x, casting=...
5  4.402126   2.19x                                     y = np.copy(x)
6  4.917966   1.96x           y = np.empty(n, dtype=x.dtype); y[:] = x
7  4.941269   1.96x                                            y = 1*x
8  8.925884   1.08x           y = np.zeros(n, dtype=x.dtype); y[:] = x
9  9.661437                             y = np.zeros_like(x); y[:] = x


globals: {'n': 100000}, tested 1e+05 times

    time (s) speedup                                            methods
0   3.858588   2.63x                              y = x.astype(x.dtype)
1   3.873989   2.62x                                    y = np.array(x)
2   3.896584   2.60x              y = np.empty_like(x); np.copyto(y, x)
3   3.919729   2.58x  y = np.empty_like(x); np.copyto(y, x, casting=...
4   3.948563   2.57x                     y = np.empty_like(x); y[:] = x
5   4.000521   2.53x                                     y = np.copy(x)
6   4.087255   2.48x           y = np.empty(n, dtype=x.dtype); y[:] = x
7   4.803606   2.11x                                            y = 1*x
8   6.723291   1.51x                     y = np.zeros_like(x); y[:] = x
9  10.131983                   y = np.zeros(n, dtype=x.dtype); y[:] = x


globals: {'n': 1000000}, tested 3e+04 times

     time (s) speedup                                            methods
0   85.625484   1.24x                     y = np.empty_like(x); y[:] = x
1   85.693316   1.24x              y = np.empty_like(x); np.copyto(y, x)
2   85.790064   1.24x  y = np.empty_like(x); np.copyto(y, x, casting=...
3   86.342230   1.23x           y = np.empty(n, dtype=x.dtype); y[:] = x
4   86.954862   1.22x           y = np.zeros(n, dtype=x.dtype); y[:] = x
5   89.503368   1.18x                                    y = np.array(x)
6   91.986177   1.15x                                            y = 1*x
7   95.216021   1.11x                                     y = np.copy(x)
8  100.524358   1.05x                              y = x.astype(x.dtype)
9  106.045746                             y = np.zeros_like(x); y[:] = x


Also, see results for a variant of the profiling where the destination's memory is already pre-allocated during value copying, since y = np.empty_like(x) is part of the setup:

globals: {'n': 100}, tested 1e+06 times

   time (s) speedup                        methods
0  0.328492   2.33x                np.copyto(y, x)
1  0.384043   1.99x                y = np.array(x)
2  0.405529   1.89x                       y[:] = x
3  0.764625          np.copyto(y, x, casting='no')


globals: {'n': 1000}, tested 1e+06 times

   time (s) speedup                        methods
0  0.453094   1.95x                np.copyto(y, x)
1  0.537594   1.64x                       y[:] = x
2  0.770695   1.15x                y = np.array(x)
3  0.884261          np.copyto(y, x, casting='no')


globals: {'n': 6309}, tested 1e+06 times

   time (s) speedup                        methods
0  2.125426   1.20x                np.copyto(y, x)
1  2.182111   1.17x                       y[:] = x
2  2.364018   1.08x                y = np.array(x)
3  2.553323          np.copyto(y, x, casting='no')


globals: {'n': 10000}, tested 1e+06 times

   time (s) speedup                        methods
0  3.196402   1.13x                np.copyto(y, x)
1  3.523396   1.02x                       y[:] = x
2  3.531007   1.02x                y = np.array(x)
3  3.597598          np.copyto(y, x, casting='no')


globals: {'n': 100000}, tested 1e+05 times

   time (s) speedup                        methods
0  3.862123   1.01x                np.copyto(y, x)
1  3.863693   1.01x                y = np.array(x)
2  3.873194   1.01x                       y[:] = x
3  3.909018          np.copyto(y, x, casting='no')
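The code for this pre-allocated variant is not shown above. A minimal sketch of how it could be set up follows, assuming the only change is that y = np.empty_like(x) moves into the timeit setup string; the iteration count is reduced here to keep the run short.

```python
import timeit

# Assumed setup: y is allocated once here, so only the copy itself is timed.
setup = 'import numpy as np; x = np.random.random(n); y = np.empty_like(x)'
methods = (
    'np.copyto(y, x)',
    'y = np.array(x)',   # still allocates a new array on every call
    'y[:] = x',
    "np.copyto(y, x, casting='no')",
)

n = 1000
timings = {m: timeit.timeit(m, setup=setup, number=10 ** 3, globals={'n': n})
           for m in methods}

# Print fastest first; absolute numbers depend on the machine.
for method, seconds in sorted(timings.items(), key=lambda kv: kv[1]):
    print('{:.6f} s  {}'.format(seconds, method))
```

Note that y = np.array(x) rebinds y to a fresh array each time, so it does not reuse the pre-allocated memory even in this variant.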

4 Comments

Also x.copy() is as fast as np.array(x) and I like the syntax much more: $ python3 -m timeit -s "import numpy as np; x = np.random.random((100, 100))" "x.copy()" - 100000 loops, best of 3: 4.7 usec per loop. I have similar results for np.array(x). Tested on Linux with an i5-4210U and numpy 1.10.4
Yes Marco, it is rather a matter of personal taste. But note that np.copy is more forgiving: np.copy(False), np.copy(None) still work, while a = None; a.copy() throws AttributeError: 'NoneType' object has no attribute 'copy'. Also, we are more precise on declaring what we want to happen in this line of code using the function instead of the method syntax.
Well, the fact np.copy(None) does not throw an error is really unpythonic. One reason more to use a.copy() :)
I just ran these benchmarks with Python 2.7.12, NumPy 1.11.2 and find that y[:] = x is now marginally faster than copyto(y, x). Code and output at gist.github.com/bhawkins/7cdbd5b9372cb798e34e21f92279d2dc
10

You can simply use:

b = 1*a

This is the fastest way, but it also has some pitfalls. If you don't set the dtype of a explicitly and don't check the dtype of b, you can get into trouble. For example:

a = np.arange(10)        # dtype = int64
b = 1*a                  # dtype = int64

a = np.arange(10.)       # dtype = float64
b = 1*a                  # dtype = float64

a = np.arange(10)        # dtype = int64
b = 1. * a               # dtype = float64

I hope this makes the point clear: sometimes a single small operation is enough to change the data type.
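The sketch below checks both points raised here and in the comments: multiplying by a Python float promotes the dtype, and 1*a always produces a new buffer. It also shows, for contrast, that assigning into an existing array keeps that array's dtype. The names are illustrative.

```python
import numpy as np

a = np.arange(10)            # integer dtype (int64 or int32, platform dependent)
b = 1 * a                    # new array, same dtype as a
c = 1.0 * a                  # new array, promoted to float64

same_dtype = b.dtype == a.dtype
promoted = c.dtype == np.float64
new_buffer = not np.shares_memory(a, b)

# Writing into an existing array keeps its dtype and address instead.
dst = np.empty(10, dtype=np.float64)
dst[:] = a                   # the integers are cast to float64 on assignment
```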

4 Comments

No. Doing so creates a new array. It is equivalent to b = a.copy().
Sorry, but I don't follow. What do you mean by creating a new array? All the other methods presented here behave the same way. a = numpy.zeros(len(b)) or a = numpy.empty(n,dtype=complex) will also create a new array.
Suppose you have a = numpy.empty(1000) . Now, you need to fill a with data, without changing its address in memory. If you do a[0] = 1, you don't recreate an array, you just change the content of the array.
@CharlesBrunet the array will have to be created at some point. This clever one-liner just does it all in one operation.
10

There are many different things you can do:

a=np.copy(b)
a=np.array(b) # Does exactly the same as np.copy
a[:]=b # a needs to be preallocated
a=b[np.arange(b.shape[0])]
a=copy.deepcopy(b)

Things that don't work

a=b
a=b[:] # This has caused bugs in my code
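To illustrate why the last two do not copy anything, here is a small sketch using np.shares_memory (the variable names a1, a2, a3 are illustrative):

```python
import numpy as np

b = np.arange(5)

a1 = b            # same object, just a second name
a2 = b[:]         # new array object, but a view on b's data
a3 = b.copy()     # independent buffer

a2[0] = 99        # the write is visible through b as well

print(a1 is b, np.shares_memory(a2, b), np.shares_memory(a3, b))
print(b[0], a3[0])
```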


3

Why not use

a = 0 + b

I think it is similar to the multiplication suggested in an earlier answer, but might be simpler.


1

Assuming that the destination array a already exists, I can think of three options (two of which are already mentioned in other answers):

a[...] = b
a[:] = b
np.copyto(a, b)

I have tested them in the case of contiguous arrays. All of them are approximately equally fast for large arrays (because the time is dominated by the actual copy time, which is done equally efficiently in all three). For small arrays the first one appears to be slightly faster than the second one, which is slightly faster than the third one.

In terms of readability, for me the first two are roughly equivalent (with a slight preference for the first one, as it does not imply a slightly confusing iteration over the first dimension of the destination matrix). I find the last one less readable, as I tend to forget if it's Intel (mov dst, src) or AT&T (mov src, dst) syntax, unless one uses named arguments (np.copyto(dst=a, src=b)), which may be a bit verbose.

It doesn't help that copyto has the destination first while ufuncs have it last (e.g. np.sin(b, a) is equivalent to a[...] = np.sin(b) except for avoiding the creation of a temporary array).
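A minimal sketch of the three in-place forms plus the ufunc out argument mentioned above, checking that the destination buffer stays the same throughout. The arrays are illustrative.

```python
import numpy as np

b = np.linspace(0.0, 1.0, 5)
a = np.empty_like(b)
addr = a.ctypes.data           # address of a's data buffer

a[...] = b                     # ellipsis assignment
a[:] = b                       # slice assignment
np.copyto(dst=a, src=b)        # keyword form makes the direction explicit
copied = a.copy()              # snapshot of the copied values for comparison

np.sin(b, out=a)               # ufunc writes straight into a, no temporary

print(a.ctypes.data == addr)
```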
