68

I am trying to add one column to the array created from recfromcsv. In this case it's an array: [210,8] (rows, cols).

I want to add a ninth column. Empty or with zeroes doesn't matter.

from numpy import genfromtxt
from numpy import recfromcsv
import numpy as np
import time

if __name__ == '__main__':
 print("testing")
 my_data = recfromcsv('LIAB.ST.csv', delimiter='\t')
 array_size = my_data.size
 #my_data = np.append(my_data[:array_size],my_data[9:],0)

 new_col = np.sum(x,1).reshape((x.shape[0],1))
 np.append(x,new_col,1)
  • And what doesn't work about this? Commented Apr 4, 2013 at 15:52
  • The thing that is not working is that it doesn't give me the correct dimensions no matter what version I try. Commented Apr 4, 2013 at 18:13

8 Answers

103

I think your problem is that you expect np.append to add the column in place. Because of how NumPy data is stored, it instead creates a copy of the joined arrays, as its docstring says:

Returns
-------
append : ndarray
    A copy of `arr` with `values` appended to `axis`.  Note that `append`
    does not occur in-place: a new array is allocated and filled.  If
    `axis` is None, `out` is a flattened array.

So you need to save the output, for example all_data = np.append(...):

my_data = np.random.random((210,8)) #recfromcsv('LIAB.ST.csv', delimiter='\t')
new_col = my_data.sum(1)[...,None] # None keeps (n, 1) shape
new_col.shape
#(210,1)
all_data = np.append(my_data, new_col, 1)
all_data.shape
#(210,9)

Alternative ways:

all_data = np.hstack((my_data, new_col))
#or
all_data = np.concatenate((my_data, new_col), 1)

I believe that the only differences between these three functions (and np.vstack) are their default behaviors when axis is unspecified:

  • concatenate assumes axis = 0
  • hstack assumes axis = 1 unless the inputs are 1d, then axis = 0
  • vstack assumes axis = 0 after adding an axis if the inputs are 1d
  • append flattens both arrays
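The defaults listed above can be checked with a few small arrays. This is only a sketch; the names a, b and v are made up for illustration and are not from the question.

```python
import numpy as np

a = np.zeros((3, 2))   # 2-D array: 3 rows, 2 columns
b = np.ones((3, 1))    # one extra column, already shaped (3, 1)
v = np.arange(3)       # 1-D array

# append with no axis flattens both inputs first
appended_flat = np.append(a, b)               # shape (9,)
appended_cols = np.append(a, b, axis=1)       # shape (3, 3)

# concatenate defaults to axis=0, so adding a column needs axis=1
concat_rows = np.concatenate((a, a))          # shape (6, 2)
concat_cols = np.concatenate((a, b), axis=1)  # shape (3, 3)

# with the default axis the column counts (2 vs 1) do not match
try:
    np.concatenate((a, b))
    default_axis_failed = False
except ValueError:
    default_axis_failed = True

# hstack joins along axis 1 for 2-D inputs, along axis 0 for 1-D inputs
hstack_2d = np.hstack((a, b))                 # shape (3, 3)
hstack_1d = np.hstack((v, v))                 # shape (6,)

# vstack promotes 1-D inputs to rows before stacking along axis 0
vstack_1d = np.vstack((v, v))                 # shape (2, 3)
```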

Based on your comment, and looking more closely at your example code, I now believe that what you probably want is to add a field to a record array. You imported both genfromtxt, which returns a structured array, and recfromcsv, which returns the subtly different record array (recarray). You used recfromcsv, so right now my_data is actually a recarray. That most likely means my_data.shape == (210,), since recarrays are 1d arrays of records, where each record is a tuple with the given dtype.

So you could try this:

import numpy as np
from numpy.lib.recfunctions import append_fields
x = np.random.random(10)
y = np.random.random(10)
z = np.random.random(10)
data = np.array( list(zip(x,y,z)), dtype=[('x',float),('y',float),('z',float)])
data = np.recarray(data.shape, data.dtype, buf=data)
data.shape
#(10,)
tot = data['x'] + data['y'] + data['z'] # sum(axis=1) won't work on recarray
tot.shape
#(10,)
all_data = append_fields(data, 'total', tot, usemask=False)
all_data
#array([(0.4374783740738456 , 0.04307289878861764, 0.021176067323686598, 0.5017273401861498),
#       (0.07622262416466963, 0.3962146058689695 , 0.27912715826653534 , 0.7515643883001745),
#       (0.30878532523061153, 0.8553768789387086 , 0.9577415585116588  , 2.121903762680979 ),
#       (0.5288343561208022 , 0.17048864443625933, 0.07915689716226904 , 0.7784798977193306),
#       (0.8804269791375121 , 0.45517504750917714, 0.1601389248542675  , 1.4957409515009568),
#       (0.9556552723429782 , 0.8884504475901043 , 0.6412854758843308  , 2.4853911958174133),
#       (0.0227638618687922 , 0.9295332854783015 , 0.3234597575660103  , 1.275756904913104 ),
#       (0.684075052174589  , 0.6654774682866273 , 0.5246593820025259  , 1.8742119024637423),
#       (0.9841793718333871 , 0.5813955915551511 , 0.39577520705133684 , 1.961350170439875 ),
#       (0.9889343795296571 , 0.22830104497714432, 0.20011292764078448 , 1.4173483521475858)], 
#      dtype=[('x', '<f8'), ('y', '<f8'), ('z', '<f8'), ('total', '<f8')])
all_data.shape
#(10,)
all_data.dtype.names
#('x', 'y', 'z', 'total')
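If a plain 2-D array is preferred over a record array, one alternative is to convert the fields with structured_to_unstructured from numpy.lib.recfunctions. The sketch below assumes every field is numeric and that the installed NumPy provides this function (it was added in a later release than the one current when this was asked); a CSV with date or string columns would need those fields dropped first. The array rec is made-up sample data.

```python
import numpy as np
from numpy.lib.recfunctions import structured_to_unstructured

# Hypothetical structured array standing in for the CSV data
rec = np.array(
    [(1.0, 2.0, 3.0), (4.0, 5.0, 6.0)],
    dtype=[('x', float), ('y', float), ('z', float)],
)

plain = structured_to_unstructured(rec)    # ordinary 2-D array, shape (2, 3)
total = plain.sum(axis=1, keepdims=True)   # row sums kept as a column, shape (2, 1)
with_total = np.hstack((plain, total))     # shape (2, 4)
```

After the conversion, the usual np.append, np.hstack and np.concatenate calls work, because the data has a real second axis.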

3 Comments

Getting File "d:\python27\lib\site-packages\numpy\core\_methods.py", line 18, in _sum out=out, keepdims=keepdims) TypeError: cannot perform reduce with flexible type
@user2130951 What is the dtype of your array? my_data.dtype
@user2130951 Are you sure you don't want to add a field?
20

If you have an array a of, say, 210 rows by 8 columns:

import numpy

a = numpy.empty([210,8])

and want to add a ninth column of zeros you can do this:

b = numpy.append(a,numpy.zeros([len(a),1]),1)

5 Comments

This generates return concatenate((arr, values), axis=axis) ValueError: all the input arrays must have same number of dimensions
hmmmm. Just double checked. It works for me (using IDLE - python version 2.7)
Perhaps it is because, as suggested by @askewchan, you actually have a recarray? I think this would work if you import using numpy.genfromtxt or numpy.loadtxt.
If the shape of the column is (X, ) then you have to .reshape(X, 1) to apply append. This is the case after extracting the columns with data[:,1]
This works for me using python 3.5, but, YES, one must be careful with shapes to use it.
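The shape issue raised in these comments can be reproduced in a few lines. This is a sketch with made-up arrays: a column sliced out of a 2-D array is 1-D, so it needs a second axis before it can be appended along axis 1.

```python
import numpy as np

a = np.zeros((4, 3))
col = np.arange(4)   # shape (4,), e.g. what data[:, 1] would return

# A 1-D column cannot be joined to a 2-D array along axis 1
try:
    np.append(a, col, axis=1)
    raised = False
except ValueError:
    raised = True

# Either form gives the column a second axis, shape (4, 1)
fixed = np.append(a, col.reshape(-1, 1), axis=1)   # shape (4, 4)
also_fixed = np.append(a, col[:, None], axis=1)    # shape (4, 4)
```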
1

A simple solution is to use numpy.insert().

The advantage of np.insert() over np.append is that you can insert the new columns at custom indices.

import numpy as np

X = np.arange(20).reshape(10,2)

X = np.insert(X, [0,2], np.random.rand(X.shape[0]*2).reshape(-1,2)*10, axis=1)
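In the call above, np.random.rand(X.shape[0]*2) draws two random numbers per row, and .reshape(-1,2) arranges them as two columns, one for each insertion index in [0,2]. The smaller sketch below uses fixed values so the placement is easier to see; the array names are made up. Note that np.insert casts the inserted values to the dtype of the original array, so floats inserted into an integer array are truncated.

```python
import numpy as np

X = np.arange(6).reshape(3, 2)        # [[0, 1], [2, 3], [4, 5]]
new_cols = np.array([[10, 20],
                     [10, 20],
                     [10, 20]])       # one column per insertion index

# Indices refer to positions in the ORIGINAL array:
# 0 -> before the first column, 2 -> after the last column
out = np.insert(X, [0, 2], new_cols, axis=1)

# A single column of zeros at the end: the index equals the column count
with_zeros = np.insert(X, X.shape[1], 0, axis=1)
```

Here out has shape (3, 4) with the 10s in the first column and the 20s in the last, and with_zeros has shape (3, 3).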

1 Comment

What's happening in the reshape part at the end?
1

np.append or np.hstack expects the appended column to be the proper shape, that is N x 1. We can use np.zeros to create this zeros column (or np.ones to create a ones column) and append it to our original matrix (2D array).

import numpy as np

def append_zeros(x):
    zeros = np.zeros((len(x), 1))  # zeros column as 2D array
    return np.hstack((x, zeros))   # append column

1

Here's a shorter one-liner:

import numpy as np

data = np.random.rand(210, 8)
data = np.c_[data, np.zeros(len(data))]

I often use the same pattern with np.ones instead, to convert points to homogeneous coordinates.
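As a rough illustration of the homogeneous-coordinates use, the sketch below appends a column of ones to some 2-D points and applies a translation matrix. The points and the offsets are made-up values, not from the question.

```python
import numpy as np

points = np.array([[1.0, 2.0],
                   [3.0, 4.0],
                   [5.0, 6.0]])

# np.c_ treats the 1-D array of ones as a column and joins it on the right
homogeneous = np.c_[points, np.ones(len(points))]   # shape (3, 3)

# Hypothetical translation by (+10, -1) written as a 3x3 matrix
T = np.array([[1.0, 0.0, 10.0],
              [0.0, 1.0, -1.0],
              [0.0, 0.0,  1.0]])

moved = homogeneous @ T.T   # each row is one translated point, still homogeneous
```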

0

I add a new column of ones to the front of a matrix Z in this way (append here is numpy.append):

Z = append([[1 for _ in range(0,len(Z))]], Z.T,0).T

Maybe it is not that efficient?

1 Comment

Don't use that list comprehension, use np.ones: append([np.ones(len(Z))], Z.T, 0).T
0

It can be done like this:

import numpy as np

# create a random matrix:
A = np.random.normal(size=(5,2))

# add a column of zeros to it:
print(np.hstack((A,np.zeros((A.shape[0],1)))))

In general, if A is an m*n matrix and you need to add a column, you have to create an m*1 matrix of zeros, then use hstack to add that matrix of zeros to the right of A.

0

Similar to some of the other answers suggesting using numpy.hstack, but more readable:

import numpy as np

# declare 10 rows x 3 cols integer array of all 1s
arr = np.ones((10, 3), dtype=np.int64)

# get the number of rows in the original array (as if we didn't know it was 10 or it could be different in other cases)
numRows = arr.shape[0]
# declare the new array which will be the new column, integer array of all 0s so it's visually distinct from the original array
additionalColumn = np.zeros((numRows, 1), dtype=np.int64)

# use hstack to tack on the additional column
result = np.hstack((arr, additionalColumn))

print(result)

result:

$ python3 scratchpad.py 
[[1 1 1 0]
 [1 1 1 0]
 [1 1 1 0]
 [1 1 1 0]
 [1 1 1 0]
 [1 1 1 0]
 [1 1 1 0]
 [1 1 1 0]
 [1 1 1 0]
 [1 1 1 0]]
