I have a dataset which I am querying with SQL. My query returns a long string which simply contains the column names and then the data, with rows separated by newline characters. I then use numpy.genfromtxt to turn this long string into a numpy array.
However, there are a few columns that should be read as strings. So, I am explicitly passing a dtype array to genfromtxt so that it saves the column values correctly. However, when I inspect the output, all column entries that should be a string simply appear as '', an empty string.
I am declaring the data type of these columns as str. As an example, one such entry that is turning into an empty string is, in the original dataset, the word GALAXY. However, on the official docs for the dataset, it is listed that the data type of this column is varchar. I assumed str would be the correct type for this, but I guess not.
Edit: Ignore that this has anything to do with SQL. Basically, I have a string that is the result of a query, and I need to pack it into a numpy array using np.genfromtxt. I avoided posting the explicit strings because they are brutal to look at, but here is one:
b'bestObjID,ra,dec,z,zErr,zWarning,class,subClass,rChi2,DOF,rChi2Diff,z_noqso,zErr_noqso,zWarning_noqso,class_noqso,subClass_noqso,rChi2Diff_noqso,velDisp,velDispErr,velDispZ,velDispZErr,velDispChi2\n1237662340012638224,239.58334,27.233419,0.09080672,2.924875E-05,0,GALAXY,,1.104714,3735,1.411605,0,0,0,,,0,272.6187,13.61222,0,0,1815.653\n'
As you can see, it is a bytes object with rows separated by \n and the first row being the column labels.
The result of passing this to np.genfromtxt is
array((1237662340012638224, 239.58334, 27.233419, 0.09080672264099121, 2.9248749342514202e-05, 0, '', '', 1.104714035987854, 3735.0, 1.4116050004959106, 0.0, 0.0, 0, '', '', 0.0, 272.61871337890625, 13.61221981048584, 0.0, 0.0, 1815.6529541015625),
dtype=[('bestObjID', '<i8'), ('ra', '<f8'), ('dec', '<f8'), ('z', '<f4'), ('zErr', '<f4'), ('zWarning', '<i8'), ('class', '<c16'), ('subClass', '<c16'), ('rChi2', '<f4'), ('DOF', '<f4'), ('rChi2Diff', '<f4'), ('z_noqso', '<f4'), ('zErr_noqso', '<f4'), ('zWarning_noqso', '<i8'), ('class_noqso', '<c16'), ('subClass_noqso', '<c16'), ('rChi2Diff_noqso', '<f4'), ('velDisp', '<f4'), ('velDispErr', '<f4'), ('velDispZ', '<f4'), ('velDispZErr', '<f4'), ('velDispChi2', '<f4')])
You can see how what should say 'GALAXY' turns into '' when I specify that the data type of this entry is str. If I instead use the c datatype, I can recover the G of GALAXY, but nothing more. If I try to use c8 or c16, I get (nan+0j).
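For reference, here is a minimal sketch of what seems to be going on, using a trimmed copy of the row above. It assumes the call uses delimiter=",", since the exact np.genfromtxt call is not shown in the post, and the string width of 16 is an arbitrary upper bound picked for illustration. A bare str dtype has zero width, 'c' is a single one-byte character, and 'c8'/'c16' are complex-number types, which would explain the '', the lone G, and the (nan+0j) respectively. Giving the string columns an explicit width, or letting genfromtxt infer types with dtype=None, appears to keep the full value.

```python
import io
import numpy as np

# Trimmed copy of the query result above: same format, fewer columns.
raw = (b"bestObjID,ra,class,subClass,rChi2\n"
       b"1237662340012638224,239.58334,GALAXY,,1.104714\n")

# Why the dtypes tried in the question do not hold the text:
print(np.dtype(str).itemsize)   # width of a bare str dtype
print(np.dtype("c").itemsize)   # 'c' is a one-byte character type
print(np.dtype("c16").kind)     # 'c16' is a complex type, not a string

# Option 1: explicit dtype with fixed-width strings.
# "U16" (up to 16 characters) is an assumed width, not taken from the dataset docs.
dtype = [("bestObjID", "i8"), ("ra", "f8"),
         ("class", "U16"), ("subClass", "U16"), ("rChi2", "f4")]
explicit = np.atleast_1d(
    np.genfromtxt(io.BytesIO(raw), delimiter=",", skip_header=1, dtype=dtype)
)
print(explicit["class"][0])

# Option 2: take the names from the header row and let genfromtxt infer each type.
inferred = np.atleast_1d(
    np.genfromtxt(io.BytesIO(raw), delimiter=",", names=True,
                  dtype=None, encoding="utf-8")
)
print(inferred["class"][0])
```

np.atleast_1d is only there because a single data row comes back as a 0-d array, as in the output shown earlier.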
numpy library is not meant to be used as a DBAPI. If you're manipulating/reading data from a normal SQL db, can you also clarify how you ended up with trying to parse the results with numpy? It sounds like that's probably where your real problem is. Also you may want to read up on how to write an issue with a Minimal, Complete, and Verifiable example.

[...] numpy array via genfromtxt. I'll update the post with some more info.

[...] dtype=None get you to what you want? And an MCVE is a minimal, complete, verifiable example, which this is not, hence the downvote someone probably gave you. I voted it up to get it to 0 because this is a legitimate question that just needs some massaging, but you should probably still post your exact np.genfromtxt call.