0

I have a CSV file that looks like this:

[0.037621960043907166, 0.04622473940253258, 0.9161532521247864]
[0.030109738931059837, 0.03261643648147583, 0.9372738003730774]
[0.030109738931059837, 0.03261643648147583, 0.9372738003730774]

I need to convert this to a NumPy array. If I use the code below:

data = genfromtxt(file, delimiter=',', encoding="utf8")

I get nan in the output.

If I do this instead:

np.genfromtxt(file, encoding=None, dtype=None)

it fails to remove the opening and closing brackets of each list, and the output looks like this:

array = ([['[0.037621960043907166,', '0.04622473940253258,',
        '0.9161532521247864]'],
       ['[0.030109738931059837,', '0.03261643648147583,',
        '0.9372738003730774]'],
       ['[0.030109738931059837,', '0.03261643648147583,',
        '0.9372738003730774]']], dtype='<U22')

The expected output is:

array = ([['0.037621960043907166,', '0.04622473940253258,',
            '0.9161532521247864'],
           ['0.030109738931059837,', '0.03261643648147583,',
            '0.9372738003730774'],
           ['0.030109738931059837,', '0.03261643648147583,',
            '0.9372738003730774']], dtype='<U22')

How can I get the expected output? It seems I need to remove the brackets first, before applying the NumPy operations. Any suggestions?

3 Answers

1

As long as you know the format of the content, simple slicing will do:

import numpy as np

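with open('tmp', 'r') as f:
    # Strip the newline first, then slice off the surrounding '[' and ']'.
    # Stripping first keeps the last digit intact when the final line has no newline.
    tmp = np.array([[float(num) for num in item.strip()[1:-1].split(',')]
                    for item in f if item.strip()])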
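
A slightly more defensive variant of the same idea, as a minimal sketch: it strips whitespace and the brackets by character rather than by position, and skips blank lines. The `io.StringIO` object named `sample` is only a stand-in for the real file so the snippet runs on its own; with a real file you would iterate over the open file handle instead.

```python
import io
import numpy as np

# Stand-in for the file from the question (assumption: one bracketed list per line).
sample = io.StringIO(
    "[0.037621960043907166, 0.04622473940253258, 0.9161532521247864]\n"
    "[0.030109738931059837, 0.03261643648147583, 0.9372738003730774]\n"
    "[0.030109738931059837, 0.03261643648147583, 0.9372738003730774]"
)

rows = []
for line in sample:
    line = line.strip()              # drop the newline and surrounding whitespace
    if not line:
        continue                     # ignore blank lines
    # Remove the leading '[' and trailing ']' and split on commas;
    # float() tolerates the space left after each comma.
    rows.append([float(tok) for tok in line.strip("[]").split(",")])

data = np.array(rows)                # 2-D float64 array, one row per line
```

Because the result is a float array rather than an array of strings, it can be used directly in numeric NumPy operations.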

0

What you need is eval():

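from numpy import array

# The with block closes the file automatically, so no explicit close() is needed.
with open('your file name', 'r') as f:
    str_lines = f.readlines()
    # Each line is text such as "[0.03, 0.04, 0.91]", which eval() turns into a list.
    lines = [eval(x) for x in str_lines]
    ary = array(lines)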
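
Since eval() runs whatever Python code the file contains, a safer variant for files you do not fully trust is `ast.literal_eval` from the standard library, which only accepts literals such as lists and numbers. The sketch below swaps that in; `sample` is an in-memory stand-in for the file, used only to keep the example self-contained.

```python
import ast
import io
import numpy as np

# Stand-in for the file from the question (assumption: one list literal per line).
sample = io.StringIO(
    "[0.037621960043907166, 0.04622473940253258, 0.9161532521247864]\n"
    "[0.030109738931059837, 0.03261643648147583, 0.9372738003730774]\n"
    "[0.030109738931059837, 0.03261643648147583, 0.9372738003730774]\n"
)

# literal_eval parses each line as a Python literal and raises an exception
# on anything else (function calls, imports, and so on).
lines = [ast.literal_eval(line.strip()) for line in sample if line.strip()]
ary = np.array(lines)
```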

0

When you have a text file like:

[0.037621960043907166, 0.04622473940253258, 0.9161532521247864]
[0.030109738931059837, 0.03261643648147583, 0.9372738003730774]
[0.030109738931059837, 0.03261643648147583, 0.9372738003730774]

You can try this:

np.genfromtxt(filename, dtype=str, encoding=None, converters={0: lambda s: s.strip('['), 2: lambda s: s.strip(']')}, delimiter=',')

Output:

array([['0.037621960043907166', ' 0.04622473940253258',
        ' 0.9161532521247864'],
       ['0.030109738931059837', ' 0.03261643648147583',
        ' 0.9372738003730774'],
       ['0.030109738931059837', ' 0.03261643648147583',
        ' 0.9372738003730774']], dtype='<U20')
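
The output above is still an array of strings, and the second and third columns keep a leading space. If numbers are what you need, one possible follow-up step is sketched below; `str_arr` simply reproduces the output shown above, and `float_arr` is an illustrative name.

```python
import numpy as np

# The string array produced by the genfromtxt call above.
str_arr = np.array([
    ['0.037621960043907166', ' 0.04622473940253258', ' 0.9161532521247864'],
    ['0.030109738931059837', ' 0.03261643648147583', ' 0.9372738003730774'],
    ['0.030109738931059837', ' 0.03261643648147583', ' 0.9372738003730774'],
])

# np.char.strip removes the leading spaces element-wise;
# astype(float) then converts every string to float64.
float_arr = np.char.strip(str_arr).astype(float)
```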
