I have a function that calculates the average vector for each name, where a name is made up of several words. The function returns a numpy.ndarray with shape (100,). The resulting vector looks like this:

[ 0.00127441  0.0002633   0.00039622  0.00055501  0.00070984 -0.00089766
 -0.00073814 -0.00224919  0.00233035 -0.00037628  0.00125402 -0.00052623
  0.00114087 -0.00070441 -0.00419099  0.00031204 -0.0002703  -0.00290918
  ...(13 lines)
0.00260704 -0.00000406 -0.00160876  0.00134342]

After receiving the numpy array, I try to remove the line breaks as follows:

temp = ["%.8f" % number for number in name_avg_vector]
temp=re.sub('\s+', ' ', temp)
name_avg_vector= np.array(list(temp))

but I am getting the following error:

---> 79     temp=re.sub('\s+', ' ', name_avg_vector)
TypeError: cannot use a string pattern on a bytes-like object

I also tried changing the print options, but the line breaks are still present in the file where the numpy array values are stored:

import sys
np.set_printoptions(threshold=sys.maxsize)
np.set_printoptions(threshold=np.inf)

After that, I tried array_repr to remove the line breaks:

name_avg_vector = np.array_repr(name_avg_vector).replace('\n', '')

but it saves as:

['array([-0.00849786,  0.00113221, -0.00643946,  0.00437448, -0.00740928,        0.00381133,  0.00178376, -0.00065115, -0.00050142,       -0.0001178 ,  0.00029183,  0.00015484, -0.00001569,  0.0006973 ,        0.00051486,  0.00006652, -0.00099618, -0.00049231,  0.0003479 ,        0.00135821,  0.00078396,  0.00038927,  0.00040825, -0.00093267,        0.00025755, -0.00012063, -0.00074733,  0.00120466,  0.00041425,       -0.00062592,  0.00098112,  0.00101578, -0.00048335,  0.00079251,       -0.00112981, 
...
-0.00050014,  0.00133685, -0.00020537, -0.00082505])']  

As stated by Anoyz in another answer, converting the array to a list with name_avg_vector.tolist() gets rid of the line breaks.

Thanks

  • What line breaks are you removing? Where do you see these? Your numpy array doesn't actually contain any line breaks. Numpy only generates the line breaks when you display the array. Commented Oct 10, 2019 at 21:36
  • For instance, the first array posted includes: 0.00127441 0.0002633 0.00039622 0.00055501 0.00070984 -0.00089766, and after -0.00089766 there is a \n that splits the line. The array breaks onto the next line every 6 floats. I read that linewidth=75 by default. The shape of this array is (100,). Commented Oct 10, 2019 at 21:41
  • "where after -0.00089766 there is a \n to split the line" So there are linebreaks when you display the array with something like print(name_avg_vector). This isn't data stored in the array. Commented Oct 10, 2019 at 21:42
  • I thought it was the data itself because it was stored with line breaks in the file. Later, when I applied np.array_repr(), the line breaks were gone, but the 'array(..' prefix was added. Commented Oct 10, 2019 at 21:46
  • How are you 'receiving' and processing this 'array'? It sounds like you are trying to work with the string representation of the array rather than the array itself. It is hard to recreate an array from its print string, with those line breaks, spaces and ellipses. You should try to work with the array object itself. If you need to save it to a file, use np.save, and np.load to retrieve it. Or use savetxt if it is 2d and you want a text, csv-style file. Commented Oct 11, 2019 at 4:00
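
The last comment's suggestion could look roughly like the sketch below. It is only an illustration under assumptions: the linspace data is a stand-in for the real (100,) vector, the file names are made up, and a temporary directory is used so nothing is left on disk. np.save and np.load round-trip the array object itself, while np.savetxt on a (1, 100) reshape writes all values as one text row.

```python
import os
import tempfile

import numpy as np

# Stand-in for the (100,) average vector from the question (hypothetical data).
name_avg_vector = np.linspace(-0.004, 0.004, 100)

with tempfile.TemporaryDirectory() as tmp_dir:
    # Binary round trip: the array itself is stored, so display formatting never applies.
    npy_path = os.path.join(tmp_dir, "name_avg_vector.npy")
    np.save(npy_path, name_avg_vector)
    restored = np.load(npy_path)

    # Text alternative: reshape to a single row so all 100 values go on one line.
    txt_path = os.path.join(tmp_dir, "name_avg_vector.txt")
    np.savetxt(txt_path, name_avg_vector.reshape(1, -1), fmt="%.8f")
    with open(txt_path) as handle:
        text_lines = handle.read().splitlines()
```

The binary file is the safer choice when the values only need to be read back by numpy; the text file is only worth it if something else has to read the numbers.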

1 Answer

Your numpy array appears to have dtype float, so it doesn't actually contain any newlines. I assume what you are seeing are the line breaks numpy inserts when you do something like print(name_avg_vector). One way to solve the problem is to write your own loop that prints the values in the format you want.
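
A minimal sketch of that idea, assuming the goal is one line of space-separated values with 8 decimals. The linspace data and the helper name vector_to_line are made up for illustration. The second part shows np.array2string with a large max_line_width as an alternative: wrapping is controlled by the line width, not by threshold, which only controls when long arrays are summarized with an ellipsis.

```python
import numpy as np

# Stand-in for the (100,) average vector from the question (hypothetical data).
name_avg_vector = np.linspace(-0.004, 0.004, 100)


def vector_to_line(vector, fmt="%.8f"):
    """Format every element with fmt and join them on one line (hypothetical helper)."""
    return " ".join(fmt % value for value in vector)


# One string, no newlines, regardless of numpy's print options.
line = vector_to_line(name_avg_vector)

# Alternative: keep numpy's own formatting but make the line width large enough
# that the 100 values never wrap.
unwrapped = np.array2string(
    name_avg_vector,
    max_line_width=1_000_000,
    precision=8,
    suppress_small=True,
)
```

The string in line can be written to a file as-is and parsed back with np.array(line.split(), dtype=float).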
