
I have a dataframe that contains 3 channels of measured data recorded at various depths.

        5       5       5      10      10      10
        x       y       z       x       y       z
1   -22.2     0.9   -88.6  -124.8   -76.7    83.2
2   -94.7   -67.9  -162.6  -200.8  -159.0     2.2
3  -128.7   -99.7  -196.4  -248.5  -219.8   -46.8
4  -127.8   -98.4  -195.1  -256.4  -239.1   -55.7
5  -141.0  -110.9  -208.8  -275.2  -265.7   -76.9
6  -142.1  -111.5  -209.6  -280.7  -276.3   -83.3
7  -147.1  -116.0  -214.6  -287.8  -286.0   -91.6
8  -149.2  -117.8  -216.7  -291.5  -290.9   -96.0

The dataframe is multi indexed using a repeating sequence of X, Y and Z (for each of the 3 components) and a floating point depth, as follows:

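import itertools
from natsort import natsorted

# Component labels repeated once per depth: x, y, z, x, y, z, ...
c = list(itertools.repeat(['x', 'y', 'z'], n))
col_a = list(itertools.chain(*c))

# Each depth repeated three times, in sorted order: 5, 5, 5, 10, 10, 10, ...
col_b = natsorted(depths * 3)

df.columns = [col_a, col_b]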

Where n is the number of depths and depths is a user defined list of floats describing the depth of each measurement (5 and 10 in the example table above).

I would like to be able to create subsets of the data (to write to csv or to plot on the screen) from either of the column index levels. Selecting the component (X, Y or Z) isn't an issue.

x1 = df['x']
x1.to_csv('x_out.csv')

However, selecting all columns from a particular depth doesn't work:

x1 = df['10']

I have tried various forms of .ix and .loc, but I think the problem may lie in the float data type of the "depth" column key.

My question is: is there a way to select the subset based on a column key of floating point values, or would I be better off using a different method here?

  • Usually one would make a depth column and only have one each of x, y, and z. Then you could select df[df.depth == 10]. Commented Jan 10, 2014 at 11:50
  • The data is made up of 7120 samples (rows) from 3 sensors. At each depth, 3 columns of data are produced (X,Y,Z @5 m; X,Y,Z @10 m etc.). A depth column wouldn't allow for all of the samples from each measured channel to be labeled at the depth it was collected, would it? Commented Jan 10, 2014 at 12:08
  • It would. Your depth column would look like [5,5,5,5,5,10,10,10,10,10,15,15,15,15,15]. Commented Jan 10, 2014 at 18:26

2 Answers

Try this:

import numpy as np
import pandas as pd

# One depth label per column, paired with its component name
depths = [5.0, 5.0, 5.0, 10.0, 10.0, 10.0, 20.0, 20.0, 20.0]
names = list("xyzxyzxyz")

df = pd.DataFrame(np.random.rand(8, 9))
# Depth is the outer column level, component the inner one
df.columns = pd.MultiIndex.from_arrays((depths, names))
print(df[10])  # all three components measured at depth 10

output:

          x         y         z
0  0.767859  0.274721  0.986447
1  0.166864  0.143640  0.896246
2  0.029581  0.951677  0.626415
3  0.822003  0.358323  0.061943
4  0.764663  0.955426  0.831934
5  0.192194  0.001171  0.181386
6  0.649342  0.186907  0.109016
7  0.360859  0.163483  0.597824

To select "x" at every depth, take a cross-section on the inner column level:

df.xs("x", axis=1, level=1)

output:

         5         10        20
0  0.075749  0.767859  0.691237
1  0.305108  0.166864  0.595809
2  0.432526  0.029581  0.317391
3  0.410563  0.822003  0.884315
4  0.865121  0.764663  0.808828
5  0.590033  0.192194  0.657932
6  0.658829  0.649342  0.006082
7  0.677408  0.360859  0.320102
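
If the component should stay on the outer column level, as in the question, a depth can still be selected with a cross-section on the inner level. The following is only a minimal sketch under assumptions: the small frame, the level names "component" and "depth", and the variable names are all made up for illustration. The key point is that the depth key is the float 10.0, not the string '10'.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in data: 3 components x 2 depths, component on the outer level.
components = ["x", "y", "z"]
depths = [5.0, 10.0]
columns = pd.MultiIndex.from_product(
    [components, depths], names=["component", "depth"]
)
df = pd.DataFrame(np.arange(24, dtype=float).reshape(4, 6), columns=columns)

# Every component at one depth: cross-section on the inner level with a float key.
at_10 = df.xs(10.0, axis=1, level="depth")

# One component at every depth: plain indexing on the outer level.
x_all = df["x"]

# Swapping the levels puts depth on the outside, so df_by_depth[10.0] works directly.
df_by_depth = df.swaplevel(axis=1).sort_index(axis=1)
```

Either subset can then be written out with to_csv in the same way as the component selection in the question.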

2 Comments

Thanks for replying. This works great with the depth key but raises a KeyError with print df['x']. Is this because the column keys are given in the opposite order to my original code above?
I modified the answer, please check it.

I agree with @U2EF1. For example, let's take the first row from your data above and make it two rows based on the depth value:

       x     y        z     depth
1   -22.2    0.9    -88.6   5
2   -124.8  -76.7    83.2   10

You can then use many pandas commands to organize the data based on depth:

df[df.depth == x]  # x is the depth to select (as U2EF1 suggested)
df.groupby('depth')  # This + unstack() can be great for plotting
df['depth'].value_counts()   # I always use this for sanity checks
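
As a rough sketch of how the wide layout could be turned into this long layout, the per-depth blocks can be stacked on top of each other with a depth label added to each. The frame below is made-up stand-in data, the names wide, long_df, at_10 and mean_by_depth are hypothetical, and it assumes depth is the outer column level.

```python
import numpy as np
import pandas as pd

# Hypothetical wide frame: depth on the outer column level, component on the inner one.
depths = [5.0, 10.0]
columns = pd.MultiIndex.from_product([depths, ["x", "y", "z"]])
wide = pd.DataFrame(np.arange(18, dtype=float).reshape(3, 6), columns=columns)

# Take the x, y, z block for each depth, tag it with its depth, and stack the blocks.
long_df = pd.concat(
    [wide[d].assign(depth=d) for d in depths], ignore_index=True
)

# Boolean selection on the depth column.
at_10 = long_df[long_df["depth"] == 10.0]

# Per-depth summary of each component.
mean_by_depth = long_df.groupby("depth")[["x", "y", "z"]].mean()
```

With 7120 samples per channel, every depth contributes 7120 rows to the long frame, so each sample keeps the depth it was recorded at.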
