I was given a 20GB HDF5 file created using pandas, but unfortunately written in the fixed format (rather than table) with each column written as a separate key. This works nicely for quickly loading one feature, but it doesn't allow handy table-oriented procedures (e.g., statistical analysis or plotting).

Trying to load the file as a whole gives the following error:

f = pd.read_hdf('file_path')

ValueError                             Traceback (most recent call last)

    384             for group_to_check in groups[1:]:
    385                 if not _is_metadata_of(group_to_check, candidate_only_group):
--> 386                     raise ValueError('key must be provided when HDF5 file '
    387                                      'contains multiple datasets.')
    388             key = candidate_only_group._v_pathname

ValueError: key must be provided when HDF5 file contains multiple datasets.

Unfortunately, 'key' doesn't accept a Python list, so I can't simply load everything at once. Is there a way to convert the h5 file from 'fixed' to 'table', or to load the whole file into a DataFrame in one go? At the moment my solution is to load each column separately and append it to an empty DataFrame.

1 Answer

I don't know of any way other than loading the DataFrame column by column, but you can largely automate this by using HDFStore instead of read_hdf:

with pd.HDFStore(filename) as h5:
    df = pd.concat(map(h5.get, h5.keys()), axis=1)

Example:

import pandas as pd

# save df as multiple datasets, one fixed-format key per column
df = pd.DataFrame({'a': [1, 2], 'b': [10, 20]})
df.a.to_hdf('/tmp/df.h5', key='a', mode='w', format='fixed')
df.b.to_hdf('/tmp/df.h5', key='b', mode='a', format='fixed')

#read columns and concat to dataframe    
with pd.HDFStore('/tmp/df.h5') as h5:
    df1 = pd.concat(map(h5.get, h5.keys()), axis=1)

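# verify: all(df1 == df) would only iterate over the (truthy) column labels,
# so reduce the element-wise comparison explicitly
assert (df1 == df).all().all()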
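
The question also asks about converting from 'fixed' to 'table'. A minimal sketch of one way to do that is to combine the columns as above and write the result back under a single key with format='table'. The helper name fixed_columns_to_table, the key name "df", and the temporary paths are assumptions made for illustration, and the sketch assumes the combined DataFrame fits in memory, which may not hold for a 20GB file.

```python
import os
import tempfile

import pandas as pd


def fixed_columns_to_table(src_path, dst_path, key="df"):
    """Combine per-column fixed-format keys into one table-format key.

    Hypothetical helper: the name and the default key are illustrative only.
    """
    with pd.HDFStore(src_path, mode="r") as h5:
        # Each key holds one Series; its stored name becomes the column label.
        columns = [h5.get(k) for k in h5.keys()]
    combined = pd.concat(columns, axis=1)
    # format="table" writes a queryable table instead of a fixed array layout.
    combined.to_hdf(dst_path, key=key, mode="w", format="table")
    return combined


# Small demonstration on a throwaway file with two fixed-format columns.
tmp_dir = tempfile.mkdtemp()
src = os.path.join(tmp_dir, "fixed.h5")
dst = os.path.join(tmp_dir, "table.h5")

df = pd.DataFrame({"a": [1, 2], "b": [10, 20]})
df["a"].to_hdf(src, key="a", mode="w", format="fixed")
df["b"].to_hdf(src, key="b", mode="a", format="fixed")

fixed_columns_to_table(src, dst)

# The whole frame now loads through a single key ...
restored = pd.read_hdf(dst, key="df")
# ... and the table format also allows reading a subset of columns.
only_a = pd.read_hdf(dst, key="df", columns=["a"])
```

Once the data is stored this way, a single read_hdf call returns the full DataFrame, while the columns argument still allows loading one feature at a time.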

1 Comment

You may also find my answer here helpful as I've seen you commented on the other answer there.
