I am trying to start with an empty numpy array. As the code progresses, the first column should be filled with datetime.datetime objects, the second column with str, the third column with float, and the fourth column with int.

I tried the following:

A = np.empty([10, 4])
A[0][0] = datetime.datetime(2016, 10, 1, 1, 0)

I get the error:

TypeError: float() argument must be a string or a number
    Structured arrays could be one option. Commented Oct 15, 2016 at 13:29

2 Answers

A structured array approach:

define a dtype according to your column specs:

In [460]: dt=np.dtype('O,U10,f,i')
In [461]: from datetime import datetime

Initialize an empty array with 3 elements (a 1-D array of 3 records, not a 3x4 array):

In [462]: A = np.empty((3,), dtype=dt)
In [463]: A
Out[463]: 
array([(None, '', 0.0, 0), (None, '', 0.0, 0), (None, '', 0.0, 0)], 
      dtype=[('f0', 'O'), ('f1', '<U10'), ('f2', '<f4'), ('f3', '<i4')])

fill in some values - by field name (not column number)

In [464]: A['f1']=['one','two','three']
In [465]: A['f0'][0]=datetime(2016, 10, 1, 1, 0)    
In [467]: A['f2']=np.arange(3)
In [468]: A
Out[468]: 
array([(datetime.datetime(2016, 10, 1, 1, 0), 'one', 0.0, 0),
       (None, 'two', 1.0, 0), 
       (None, 'three', 2.0, 0)], 
      dtype=[('f0', 'O'), ('f1', '<U10'), ('f2', '<f4'), ('f3', '<i4')])

View one element of this array:

In [469]: A[0]
Out[469]: (datetime.datetime(2016, 10, 1, 1, 0), 'one', 0.0, 0)
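The default field names f0..f3 can be replaced with descriptive ones. The following is a minimal sketch of the same layout; the names when, label, value and count are illustrative assumptions, not part of the session above, and np.zeros is used instead of np.empty so the initial contents are defined.

```python
from datetime import datetime
import numpy as np

# Same column types as above, but with explicit (hypothetical) field names.
dt = np.dtype([("when", "O"), ("label", "U10"), ("value", "f4"), ("count", "i4")])
A = np.zeros(3, dtype=dt)  # zeros gives defined initial values, unlike np.empty

A["when"][0] = datetime(2016, 10, 1, 1, 0)  # set one element of a field
A["label"] = ["one", "two", "three"]        # set a whole field at once
A["value"] = np.arange(3)

first_record = A[0]    # one record (all four fields)
labels = A["label"]    # one field across all records
```

Indexing with an integer returns a record, while indexing with a field name returns a regular 1-D array of that field's dtype.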

I chose to make the 1st field object dtype, so it can hold a datetime object - which isn't a number or string.

np.datetime64 stores a date as a 64-bit integer (a count of time units since the epoch, here seconds), and provides a lot of functionality that datetime objects don't:

In [484]: dt1=np.dtype('datetime64[s],U10,f,i')
In [485]: A1 = np.empty((3,), dtype=dt1)
In [486]: A1['f0']=datetime(2016, 10, 1, 1, 0)
In [487]: A1['f3']=np.arange(3)
In [488]: A1
Out[488]: 
array([(datetime.datetime(2016, 10, 1, 1, 0), '', 0.0, 0),
       (datetime.datetime(2016, 10, 1, 1, 0), '', 0.0, 1),
       (datetime.datetime(2016, 10, 1, 1, 0), '', 0.0, 2)], 
      dtype=[('f0', '<M8[s]'), ('f1', '<U10'), ('f2', '<f4'), ('f3', '<i4')])
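As a rough illustration of that extra functionality, the sketch below fills the datetime64 field with hourly timestamps and then compares and subtracts them without any Python-level loop. The variable names (start, later, span, seconds) are made up for this example.

```python
import numpy as np

dt1 = np.dtype("datetime64[s],U10,f,i")
A1 = np.zeros(3, dtype=dt1)

# Vectorized date arithmetic: one timestamp per hour, starting at 01:00.
start = np.datetime64("2016-10-01T01:00:00", "s")
A1["f0"] = start + np.arange(3) * np.timedelta64(1, "h")

# Elementwise comparison against a single datetime64 value.
later = A1["f0"] > np.datetime64("2016-10-01T01:30")

# Subtracting two datetime64 values yields a timedelta64.
span = A1["f0"][-1] - A1["f0"][0]

# The underlying integer counts (seconds since the epoch for this unit).
seconds = A1["f0"].astype("int64")
```

With an object field holding datetime.datetime instances, the same operations would have to go through each Python object individually.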

A third approach is to make the whole array object dtype. That's effectively a glorified list. Many operations resort to plain iteration, or just aren't implemented. It's more general, but you lose a lot of the power of normal numeric arrays.
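A small sketch of the object-dtype approach, with made-up row values: rows can mix types freely, but a column sliced out of the array is still dtype=object, so one common workaround is to convert it to a numeric dtype before doing numeric work on it.

```python
import datetime
import numpy as np

A = np.empty((2, 4), dtype=object)
A[0] = [datetime.datetime(2016, 10, 1, 1, 0), "one", 0.5, 1]
A[1] = [datetime.datetime(2016, 10, 1, 2, 0), "two", 1.5, 2]

values = A[:, 2]                           # still dtype=object
mean_value = values.astype(float).mean()   # convert first, then compute
```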

1 Comment

This was super useful in combination with the docs: docs.scipy.org/doc/numpy-1.10.1/user/basics.rec.html

You can use dtype=object.

import datetime
import numpy as np

A = np.empty([10, 4], dtype=object)
A[0][0] = datetime.datetime(2016, 10, 1, 1, 0)

It is also possible to use structured arrays, but then string fields have a fixed maximum length. If you need arbitrarily large objects, you have to use dtype=object. But this often defeats the purpose of arrays.
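To make the fixed-length point concrete, here is a minimal sketch; the field name "name" and the sample text are assumptions for illustration. A U10 field silently truncates anything longer than 10 characters, while an object array keeps the whole string.

```python
import numpy as np

text = "a string longer than ten characters"

# Structured array with a 10-character unicode field.
fixed = np.zeros(1, dtype=[("name", "U10")])
fixed["name"][0] = text          # silently truncated to 10 characters

# Object array: stores a reference to the full Python string.
flexible = np.empty(1, dtype=object)
flexible[0] = text

stored_fixed = str(fixed["name"][0])
stored_flexible = flexible[0]
```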

2 Comments

What do you mean "contradicts purpose of arrays"? Does it affect performance?
Never saw this comment, sorry. I don't have a reference, so please cross-check before relying on my answer: an array with dtype=object is not as efficient. The size of the objects is not known in advance, so the objects themselves cannot be laid out in one contiguous block of memory. Most likely, the array simply stores pointers to the objects' memory locations. Therefore, dtype=object arrays give you the convenience of NumPy (advanced slicing and similar things), but not all of the performance benefits.
