How can I loop through a file of strings and load the strings into a numpy ndarray?
1 Answer
This will create a one-dimensional NumPy array of strings, with one line of the file per element:
import numpy as np
with open("file.ext") as f:
    a = np.array(f.readlines())
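One detail worth knowing is that readlines() keeps the trailing newline on each line. The following is a minimal sketch of stripping those newlines before building the array. The file name and its contents are made up purely so the example runs on its own.

```python
import os
import tempfile

import numpy as np

# Hypothetical sample file, created only so the example is self-contained.
path = os.path.join(tempfile.mkdtemp(), "words.txt")
with open(path, "w") as f:
    f.write("alpha\nbeta\ngamma\n")

# Same approach as above: each element still ends with "\n".
with open(path) as f:
    raw = np.array(f.readlines())

# Iterating over the file and stripping the newline from each line.
with open(path) as f:
    stripped = np.array([line.rstrip("\n") for line in f])

print(raw.tolist())
print(stripped.tolist())
```

Whether to strip is a judgment call: if the strings are only going to be parsed further (for example with split()), the trailing newline usually does no harm.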
This could be modified for, say, a file of space-separated floating-point values:
import numpy as np
with open("file.ext") as f:
    a = np.array([list(map(float, line.split())) for line in f])
Let's break down the argument to array() to clarify what's going on here.
- [line for line in f] would be equivalent to f.readlines(): it creates a list of strings, one per line in f.
- [line.split() for line in f] makes a 2D list of strings. Each line in f gets split at whitespace into a list of strings.
- NumPy is designed mainly to deal with numeric values, not strings, so we need to turn each sublist of strings into a list of floats. map applies the same function (float in this case) to every element of a list. In Python 3 it returns a lazy iterator rather than a list, so list(map(float, line.split())) is what turns a list of strings into a list of floats.
- So [list(map(float, line.split())) for line in f] creates a list of lists of floats, one list per line, with the values split up at the whitespace. This then gets passed to array(), which knows how to deal with a list of lists.
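To make the steps above concrete, here is a small worked sketch. The file name and the numbers in it are invented for illustration, and a nested list comprehension is used in place of map, which does the same conversion.

```python
import os
import tempfile

import numpy as np

# Hypothetical sample data: two lines of three space-separated floats.
path = os.path.join(tempfile.mkdtemp(), "values.txt")
with open(path, "w") as f:
    f.write("1.0 2.5 3.0\n4.0 5.5 6.0\n")

with open(path) as f:
    # One inner list of floats per line of the file.
    rows = [[float(x) for x in line.split()] for line in f]

a = np.array(rows)

print(rows)      # plain Python list of lists
print(a.shape)   # (number of lines, values per line)
print(a.dtype)
```

This assumes every line holds the same number of values. Lines of unequal length will not form a regular 2D array.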
Also look into NumPy's genfromtxt and loadtxt functions, which read text files like this directly.
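As a rough sketch of that route, both functions can read a whitespace-separated file of floats without any manual splitting. The sample file below is hypothetical and exists only to make the example runnable.

```python
import os
import tempfile

import numpy as np

# Hypothetical sample data, same layout as a space-separated float file.
path = os.path.join(tempfile.mkdtemp(), "values.txt")
with open(path, "w") as f:
    f.write("1.0 2.5 3.0\n4.0 5.5 6.0\n")

# With default arguments, both split on whitespace and parse as float.
a = np.loadtxt(path)
b = np.genfromtxt(path)

print(a.shape)
print(np.array_equal(a, b))
```

loadtxt is the simpler of the two, while genfromtxt adds options for handling missing values, so which one fits depends on how clean the file is.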
2 Comments
Superdooperhero
Why map(float and not map(string?
Benjamin Hodgson
@Superdooperhero - line.split() turns a string (line) into a list of strings. We need to turn this list of strings into a list of floats (using map) so that NumPy can deal with them. NumPy's not really designed to handle arrays of strings.