how to create empty numpy array to store different kinds of data

Question

I am trying to start with an empty numpy array. As the code progresses, the first column should be filled with datetime.datetime, the second column with str, the third column with float, and the fourth column with int.

I tried the following:

import datetime
import numpy as np
A = np.empty([10, 4])
A[0][0] = datetime.datetime(2016, 10, 1, 1, 0)

I get the error:

TypeError: float() argument must be a string or a number

Structured arrays could be one option.

– Divakar, Commented Oct 15, 2016 at 13:29

hpaulj · Accepted Answer · 2016-10-15 19:05:50Z

A structured array approach:

Define a dtype according to your column specs:

In [460]: dt=np.dtype('O,U10,f,i')
In [461]: from datetime import datetime

Initialize an empty array with 3 elements (not 3x4):

In [462]: A = np.empty((3,), dtype=dt)
In [463]: A
Out[463]: 
array([(None, '', 0.0, 0), (None, '', 0.0, 0), (None, '', 0.0, 0)], 
      dtype=[('f0', 'O'), ('f1', '<U10'), ('f2', '<f4'), ('f3', '<i4')])

Fill in some values by field name (not column number):

In [464]: A['f1']=['one','two','three']
In [465]: A['f0'][0]=datetime(2016, 10, 1, 1, 0)    
In [467]: A['f2']=np.arange(3)
In [468]: A
Out[468]: 
array([(datetime.datetime(2016, 10, 1, 1, 0), 'one', 0.0, 0),
       (None, 'two', 1.0, 0), 
       (None, 'three', 2.0, 0)], 
      dtype=[('f0', 'O'), ('f1', '<U10'), ('f2', '<f4'), ('f3', '<i4')])

View one element of this array:

In [469]: A[0]
Out[469]: (datetime.datetime(2016, 10, 1, 1, 0), 'one', 0.0, 0)

I chose to make the 1st field object dtype, so it can hold a datetime object - which isn't a number or string.
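The same idea as a standalone script, in case the session above is hard to follow. This is only a sketch: the field names (when, label, value, count) are made up for illustration, and np.zeros is used instead of np.empty so the numeric and string fields start from known values.

```python
from datetime import datetime

import numpy as np

# Hypothetical field names; the session above uses the default f0..f3.
row_dtype = np.dtype([
    ("when", "O"),     # Python object, can hold a datetime.datetime
    ("label", "U10"),  # unicode string, at most 10 characters
    ("value", "f8"),   # 64-bit float
    ("count", "i8"),   # 64-bit integer
])

# Numeric fields start at 0 and string fields at ''.
table = np.zeros(3, dtype=row_dtype)

# Fill one whole row with a tuple, in field order.
table[0] = (datetime(2016, 10, 1, 1, 0), "one", 1.5, 7)

# Fill one whole "column" by field name.
table["count"] = [7, 8, 9]

print(table[0])
print(table["value"].sum())
```

Rows are assigned as tuples and columns are addressed by name; there is no table[0, 1] style indexing because the array is 1-dimensional.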

np.datetime64 stores a date as a 64-bit integer (a count of time units since the epoch), and provides a lot of functionality that datetime objects don't:

In [484]: dt1=np.dtype('datetime64[s],U10,f,i')
In [485]: A1 = np.empty((3,), dtype=dt1)
In [486]: A1['f0']=datetime(2016, 10, 1, 1, 0)
In [487]: A1['f3']=np.arange(3)
In [488]: A1
Out[488]: 
array([(datetime.datetime(2016, 10, 1, 1, 0), '', 0.0, 0),
       (datetime.datetime(2016, 10, 1, 1, 0), '', 0.0, 1),
       (datetime.datetime(2016, 10, 1, 1, 0), '', 0.0, 2)], 
      dtype=[('f0', '<M8[s]'), ('f1', '<U10'), ('f2', '<f4'), ('f3', '<i4')])
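A rough illustration of the kind of functionality meant here, using a plain datetime64[s] array rather than the structured one. The sample timestamps are invented for the example.

```python
from datetime import datetime

import numpy as np

# Made-up sample timestamps, stored with one-second resolution.
stamps = np.array(
    ["2016-10-01T01:00:00", "2016-10-01T03:30:00", "2016-10-02T00:00:00"],
    dtype="datetime64[s]",
)

# Differences between neighbours, computed for the whole array at once.
gaps = np.diff(stamps)  # dtype is timedelta64[s]

# Shift every timestamp by one hour.
later = stamps + np.timedelta64(1, "h")

# Vectorized comparison against a single date.
before_oct_2 = stamps < np.datetime64("2016-10-02")

# Convert back to datetime.datetime objects when needed.
as_objects = stamps.astype(object)

print(gaps)
print(before_oct_2)
print(as_objects[0])
```

None of this works element-wise for free with an object field holding datetime.datetime values; you would have to loop.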

A third approach is to make the whole array object dtype. That's effectively a glorified list. Many operations resort to plain iteration, or just aren't implemented. It's more general, but you lose a lot of the power of normal numeric arrays.
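A small sketch of what working with an object array can look like. The values are invented; the point is that a column pulled out of it is still object dtype until you convert it yourself.

```python
import datetime

import numpy as np

# Every cell of an object array created with np.empty starts out as None.
A = np.empty((2, 4), dtype=object)
A[0] = [datetime.datetime(2016, 10, 1, 1, 0), "one", 0.5, 1]
A[1] = [datetime.datetime(2016, 10, 1, 2, 0), "two", 1.5, 2]

# Slicing works as usual, but the result is still an array of Python objects.
values = A[:, 2]
print(values.dtype)

# Convert the column explicitly to get a normal numeric array back.
fast_values = values.astype(float)
print(fast_values.mean())
```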

This was super useful in combination with the docs: docs.scipy.org/doc/numpy-1.10.1/user/basics.rec.html

DerWeh · Accepted Answer · 2016-10-15 13:35:46Z

You can use dtype=object.

import datetime
import numpy as np
A = np.empty([10, 4], dtype=object)
A[0, 0] = datetime.datetime(2016, 10, 1, 1, 0)

It is also possible to use structured arrays, but then strings have a fixed maximum length. If you need arbitrarily large objects, you have to use dtype=object. But this often defeats the purpose of arrays.
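To make the fixed-length point concrete, here is a minimal sketch. The field name label and the sample text are made up; the 10-character width is just an example.

```python
import numpy as np

long_text = "a string longer than ten characters"

# A structured array whose only field holds at most 10 characters.
rec = np.zeros(1, dtype=[("label", "U10")])
rec["label"][0] = long_text  # silently truncated, no error is raised
print(repr(rec["label"][0]))

# An object array keeps the full Python string.
obj = np.empty(1, dtype=object)
obj[0] = long_text
print(repr(obj[0]))
```

So if string lengths are not known in advance, either pick a generous width or fall back to an object field for that column.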

2 Comments

Zanam Over a year ago

What do you mean "contradicts purpose of arrays"? Does it affect performance?

DerWeh Over a year ago

Never saw this comment, sorry. I don't have a reference, so please cross-check before relying on my answer: an array with dtype=object is not as efficient. The size of the objects is not known, so the objects themselves cannot be stored in one contiguous block of memory. Most likely, the array simply stores pointers to the objects' memory locations. So a dtype=object array gives you the convenience of NumPy (advanced slicing and similar things), but not all of the performance benefit.
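One way to check this yourself, as a rough sketch: compare a float64 array with the same values converted to object dtype. The pointer size mentioned in the comments assumes a typical 64-bit build.

```python
import numpy as np

floats = np.zeros(1000, dtype=np.float64)
objects = floats.astype(object)  # same values, now separate Python floats

# Raw data: 8 bytes per element, stored directly in the array buffer.
print(floats.itemsize)

# Object array: one reference per element (8 bytes on typical 64-bit builds);
# the float objects themselves live elsewhere in memory.
print(objects.itemsize)

print(objects.dtype.hasobject)  # the dtype reports that it holds Python objects
print(type(objects[0]))         # elements come back as plain Python floats
```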
