Say I have the following DataFrame

Letters   Numbers
A          1
B          2
C          3
D          4

Which can be obtained through the following code

import pandas as pd

letters = pd.Series(('A', 'B', 'C', 'D'))
numbers = pd.Series((1, 2, 3, 4))
keys = ('Letters', 'Numbers')
df = pd.concat((letters, numbers), axis=1, keys=keys)

Now I want to get the value C from the column Letters.

The expression

df[df.Letters=='C'].Letters

will return

2    C
Name: Letters, dtype: object

How can I get only the value C and not the whole two-line output?

On an unrelated note, there's a nicer way to construct your DataFrame: pd.DataFrame({'Letters': letters, 'Numbers': numbers}) Commented Jun 11, 2015 at 19:15

5 Answers

255
df[df.Letters=='C'].Letters.item()

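This returns the single element of the Series produced by that selection. Note that item() requires the Series to hold exactly one element and raises a ValueError otherwise. In this case exactly one row matches, so the result is 'C'.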
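A minimal sketch of how item() behaves, assuming the DataFrame from the question (rebuilt here with the dict constructor). It shows the scalar result for a single match and the error for an empty selection; the letter 'Z' is only an illustrative value that is absent from the data.

```python
import pandas as pd

df = pd.DataFrame({"Letters": ["A", "B", "C", "D"], "Numbers": [1, 2, 3, 4]})

# Exactly one row matches, so .item() returns the bare value.
value = df.loc[df["Letters"] == "C", "Letters"].item()
print(value)  # C

# No row matches 'Z': the selection is empty and .item() raises ValueError.
try:
    df.loc[df["Letters"] == "Z", "Letters"].item()
except ValueError as exc:
    error_message = str(exc)
    print("no single match:", error_message)
```

The same ValueError is raised when more than one row matches, which makes item() a reasonable choice when a unique match is expected and anything else should fail loudly.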

EDIT:

Or you can select with .loc[] and access the first element that way. This was shorter and is the way I have implemented it in the past.


4 Comments

I love this method, however I'm getting the warning: FutureWarning: "item" has been deprecated and will be removed in a future version
@AlexG: you can use this instead: df[df.Letters=='C'].Letters.iloc[0]. It returns the first element of the resulting Series (which here is also the only one).
using loc[:1] still shows index next to the value :(
@AlexG and @Sonic Soul : try using df[df.Letters=='C'].Letters.squeeze() instead. This works the same way. :)
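The two alternatives suggested in these comments can be compared side by side. This is a sketch assuming the question's DataFrame; the `Numbers > 2` filter is only an illustrative way to get more than one match.

```python
import pandas as pd

df = pd.DataFrame({"Letters": ["A", "B", "C", "D"], "Numbers": [1, 2, 3, 4]})
match = df[df["Letters"] == "C"]["Letters"]

first = match.iloc[0]       # positional: first element of the selection
squeezed = match.squeeze()  # a one-element Series collapses to a scalar

# With more than one match, squeeze() leaves the Series as a Series,
# whereas .iloc[0] would still return just the first element.
many = df[df["Numbers"] > 2]["Letters"].squeeze()
print(first, squeezed, type(many).__name__)
```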
99

Use the values attribute to return the values as a NumPy array and then use [0] to get the first value:

In [4]:
df.loc[df.Letters=='C','Letters'].values[0]

Out[4]:
'C'

EDIT

I personally prefer to access the columns using subscript operators:

df.loc[df['Letters'] == 'C', 'Letters'].values[0]

This avoids issues when column names contain spaces or dashes (-), which cannot be accessed using dot (.) notation.
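To illustrate the point about subscript access, here is a small sketch with a column whose name contains a space. The column name "Letter Name" is hypothetical and not part of the question's data.

```python
import pandas as pd

# "Letter Name" is a made-up column name containing a space.
df = pd.DataFrame({"Letter Name": ["A", "B", "C", "D"], "Numbers": [1, 2, 3, 4]})

# Writing df.Letter Name would be a syntax error; subscripting works for any label.
value = df.loc[df["Letter Name"] == "C", "Letter Name"].values[0]
print(value)  # C
```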

3 Comments

It's really inconsequential, but in your selection you access the column 'Letters' using the dot notation; df.loc[df.Letters=='C']. If there are spaces in your column names, you should probably be using converters to strip those out, like you would if importing from a CSV or Excel file.
@thomas-ato I'll update my answer but I disagree with modding the columns as an additional step unless that is necessary, in this case I agree it makes no difference
@EdChum: In this scenario, how can we handle the error "IndexError: index 0 is out of bounds for axis 0 with size 0"?
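One possible way to avoid the IndexError raised in the last comment is to check whether the selection is empty before indexing. The helper `first_match` and its `default` parameter are hypothetical names introduced for this sketch, not part of pandas.

```python
import pandas as pd

def first_match(frame, column, wanted, default=None):
    """Return the first value in `column` equal to `wanted`, else `default`."""
    matches = frame.loc[frame[column] == wanted, column]
    if matches.empty:
        # Nothing matched, so indexing with [0] would raise IndexError.
        return default
    return matches.iloc[0]

df = pd.DataFrame({"Letters": ["A", "B", "C", "D"], "Numbers": [1, 2, 3, 4]})

found = first_match(df, "Letters", "C")
missing = first_match(df, "Letters", "Z", default="not found")
print(found, missing)
```

Wrapping the lookup in try/except IndexError would also work; the explicit emptiness check is used here because it keeps the no-match case visible.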
5

You can use loc with the index and column labels.

df.loc[2, 'Letters']
# 'C'

If you prefer the "Numbers" column as reference, you can set it as index.

df.set_index('Numbers').loc[3, 'Letters']

I find this cleaner as it does not need the [0] or .item().
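A short sketch of the set_index approach, assuming the question's DataFrame. It also shows a caveat: when the index label is not unique, the same .loc call returns a Series rather than a scalar. The two-row frame with a repeated number is made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({"Letters": ["A", "B", "C", "D"], "Numbers": [1, 2, 3, 4]})

# Label 3 is unique in the new index, so .loc returns a scalar.
value = df.set_index("Numbers").loc[3, "Letters"]
print(value)  # C

# Illustrative data where the label 3 appears twice in the index.
dup = pd.DataFrame({"Letters": ["C", "X"], "Numbers": [3, 3]}).set_index("Numbers")
result = dup.loc[3, "Letters"]  # a Series holding both matching rows
print(type(result).__name__, len(result))
```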

2 Comments

This doesn't address the particular issue. If the index is unknown, your code doesn't help.
The second version (setting one column to index) does apply in that case. :)
4
import pandas as pd

dataset = pd.read_csv("data.csv")
# Convert the column to a plain Python list
values = dataset["column name"].tolist()

>>> values[0]
'item_0'

edit:

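Actually, you can index the column directly. Note that [0] here looks up the index label 0, which is the first row only when the DataFrame has the default integer index; .iloc[0] is the position-based equivalent.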

import pandas as pd

dataset = pd.read_csv("data.csv")
first_value = dataset["column name"][0]

>>> print(first_value)
item_0
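The difference between label-based and position-based access matters once the index is not the default 0, 1, 2, ... range. The sketch below uses a small in-memory frame as a stand-in for data.csv; its contents and the index labels 10, 11, 12 are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical stand-in for data.csv, with a non-default integer index.
dataset = pd.DataFrame(
    {"column name": ["item_0", "item_1", "item_2"]},
    index=[10, 11, 12],
)

first_value = dataset["column name"].iloc[0]  # positional: always the first row
by_label = dataset["column name"][10]         # label lookup: the row labelled 10

# There is no row labelled 0, so plain [0] fails here.
try:
    dataset["column name"][0]
    raised = False
except KeyError:
    raised = True

print(first_value, by_label, raised)
```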

1

I think a good option is to turn your single-row DataFrame into a Series first, then index that:

df[df.Letters=='C'].squeeze()['Letters']
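A sketch of what squeeze() does here, assuming the question's DataFrame: a one-row selection collapses to a Series indexed by the column names, so any column of that row can then be read by name.

```python
import pandas as pd

df = pd.DataFrame({"Letters": ["A", "B", "C", "D"], "Numbers": [1, 2, 3, 4]})

# The filter keeps a single row; squeeze() turns it into a Series
# whose index holds the column names.
row = df[df["Letters"] == "C"].squeeze()

value = row["Letters"]
number = row["Numbers"]
print(value, number)
```

If the filter matched several rows, squeeze() would leave the DataFrame unchanged and row["Letters"] would be a Series again, so this form relies on the match being unique.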
