I have the following pandas DataFrame that I have built:

      dark  Mystery  adult  crime  action  comedy  cartoon  winter  snow  skiing
0001  0.00    0.000  0.000   0.00    0.00   0.000     0.00    0.56  0.65   0.789
0004  0.89    0.678 -0.423   0.12    0.00   0.000     0.00    0.00  0.00   0.000
0005  0.00    0.000  0.000   0.00    0.12   0.678    -0.89    0.00  0.00   0.000

I also have a list that contains some of the DataFrame's row index values. After filtering, I want a new DataFrame containing only the rows whose index matches a value in the list.

l = [001,005]

This is a large DataFrame, so I am trying to do this without iterating in a loop.

[df.index[idx] for idx in l]

This is wrong, but I feel I am close to the answer, or maybe not.

Result should be:

      dark  Mystery  adult  crime  action  comedy  cartoon  winter  snow  skiing
0001  0.00    0.000  0.000   0.00    0.00   0.000     0.00    0.56  0.65   0.789
0005  0.00    0.000  0.000   0.00    0.12   0.678    -0.89    0.00  0.00   0.000
  • df.ix[l] will select the rows whose labels are in l, where l is your list (.ix has since been removed from pandas; df.loc[l] is the label-based equivalent). Note that idx may be a more readable name than l. Commented Mar 18, 2015 at 23:34

2 Answers

How about using .loc:

df.loc[l]

Note, in your actual example, your indices are probably strings rather than integers. When you declare l = [0001, 0005] it's going to be evaluated as [1,5]. So you might want to use l = ["0001", "0005"] or use string formatting to convert the integers (as Jonathan Eunice shows in his answer).
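As a minimal sketch of the point above, assuming the index really holds zero-padded strings: the frame below is a cut-down, hypothetical version of the one in the question (only three of its columns), and the name labels is used in place of l. Note also that in Python 3 a literal such as 0001 is a syntax error rather than the integer 1, so the string form is the only one that works there.

```python
import pandas as pd

# Hypothetical, reduced version of the question's frame.
# The index holds zero-padded strings, not integers.
df = pd.DataFrame(
    {
        "dark": [0.00, 0.89, 0.00],
        "comedy": [0.000, 0.000, 0.678],
        "winter": [0.56, 0.00, 0.00],
    },
    index=["0001", "0004", "0005"],
)

# The selection list must use the same type and formatting as the index.
labels = ["0001", "0005"]

# Label-based selection: returns the rows in the order given in `labels`.
result = df.loc[labels]
print(result)
```

If the labels are integers while the index holds strings (or the other way round), the lookup will not find them, so it is worth checking df.index.dtype before selecting.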

As an aside, you should also avoid using lowercase l as a variable name, since it looks similar to 1 in many monospace typefaces.

2 Comments

The values in my list are in the following format: u'0001'
@Null-Hypothesis, I see. Your question seemed to indicate differently. Did it work for you?
If your DataFrame is in df:

newdf = df[df.index.isin(l)]

Of course, you have to be careful here. None of your items in l are truly in the index. l = [001,005] is the same as l = [1,5], whereas your index is really strings a la ['0001', '0002', ...]. Given that, you may want to "upgrade" your selection list l to be parallel to your index first:

l = ["{:04d}".format(i) for i in l ]
newdf = df[df.index.isin(l)]
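The sketch below puts the conversion and the isin filter together on a hypothetical, cut-down frame. The names wanted_ids and wanted_labels, and the extra id 9 that is not in the index, are assumptions added for illustration. It also shows one practical difference from .loc: isin silently skips labels that are missing and keeps the frame's own row order, whereas in recent pandas versions (1.0 and later, as far as I know) .loc with a list raises KeyError if any label is absent.

```python
import pandas as pd

# Hypothetical, reduced version of the question's frame with a string index.
df = pd.DataFrame(
    {"dark": [0.00, 0.89, 0.00], "snow": [0.65, 0.00, 0.00]},
    index=["0001", "0004", "0005"],
)

# Integer ids in arbitrary order; 9 is deliberately not in the index.
wanted_ids = [5, 1, 9]

# Zero-pad to four digits so the values match the index labels.
wanted_labels = ["{:04d}".format(i) for i in wanted_ids]

# Boolean-mask selection: missing labels are ignored, frame order is kept.
newdf = df[df.index.isin(wanted_labels)]
print(newdf)

# Label-based selection with a missing label raises in recent pandas.
try:
    df.loc[wanted_labels]
    loc_raised = False
except KeyError:
    loc_raised = True
print(loc_raised)
```

Which behaviour is preferable depends on the data: isin is forgiving when the list may contain ids that were filtered out earlier, while .loc makes such mismatches visible immediately.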

2 Comments

The values in my list are in the following format: u'0001'. Based on that, I don't think the formatting step is required, but that's just my guess.
@Null-Hypothesis If your values are in that format, I agree. But in your question, you state l = [001,005] which would make them integers. Even if you assume they're strings, they have the wrong number of leading zeros. So, if the selection list is already more on-target than the question suggests, great! If not, you will need to homogenize the selection list with the DataFrame index.
