>>> df = pd.DataFrame({'num_legs': [4, 2], 'num_wings': [0, 2]},
...                   index=['dog', 'hawk'])
>>> df
      num_legs  num_wings
dog          4          0
hawk         2          2
>>> for row in df.itertuples():
...     print(row)
...
Pandas(Index='dog', num_legs=4, num_wings=0)
Pandas(Index='hawk', num_legs=2, num_wings=2)

I am parsing an Excel sheet and iterating over it with pandas.DataFrame.itertuples, which gives me one row object per iteration. Consider the rows returned in each iteration, as shown above.

Now, from each row, such as Pandas(Index='dog', num_legs=4, num_wings=0), I would like to access the values by the column name num_legs, but when I index the row with that name I get the exception below.

TypeError: tuple indices must be integers, not str

Could someone explain how to retrieve the data from each row using the column headers directly?

3 Answers

4

I faced the same error when using a variable.

v = 'num_legs'
for row in df.itertuples():
    print(row[v])

TypeError: tuple indices must be integers or slices, not str

To keep using df.itertuples() when the attribute name is held in a variable, use getattr:

v = 'num_legs'
for row in df.itertuples():
    print(getattr(row, v))
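As a minimal sketch of the options for looking up a field by a name stored in a variable, the snippet below compares getattr, the namedtuple method _asdict(), and plain positional indexing. It reuses the DataFrame from the question; the variable names (column, legs_by_getattr and so on) are only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'num_legs': [4, 2], 'num_wings': [0, 2]},
                  index=['dog', 'hawk'])

column = 'num_legs'  # column name held in a variable

# 1. getattr: look up the namedtuple field by its name.
legs_by_getattr = [getattr(row, column) for row in df.itertuples()]

# 2. _asdict(): convert the namedtuple to a dict, then index with the string.
legs_by_dict = [row._asdict()[column] for row in df.itertuples()]

# 3. Positional index: the tuple only accepts integers. The +1 skips the
#    Index field that itertuples() puts first by default.
position = df.columns.get_loc(column) + 1
legs_by_position = [row[position] for row in df.itertuples()]

print(legs_by_getattr, legs_by_dict, legs_by_position)
```

All three give the same values here; getattr is probably the most direct when the column names are valid Python identifiers.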

In the end, df.itertuples() is generally faster than df.iterrows().
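One way to check the speed claim on your own machine is a rough timeit comparison like the sketch below. The DataFrame size, the repeat count, and the helper names sum_with_itertuples and sum_with_iterrows are arbitrary choices for illustration, and the absolute timings will vary with hardware and pandas version.

```python
import timeit

import pandas as pd

# Synthetic frame; 2000 rows is an arbitrary size chosen to keep the run short.
df = pd.DataFrame({'num_legs': range(2000), 'num_wings': range(2000)})


def sum_with_itertuples(frame):
    # Each row is a lightweight namedtuple.
    return sum(row.num_legs for row in frame.itertuples())


def sum_with_iterrows(frame):
    # Each row is built as a full Series, which costs more per iteration.
    return sum(row['num_legs'] for _, row in frame.iterrows())


t_tuples = timeit.timeit(lambda: sum_with_itertuples(df), number=3)
t_rows = timeit.timeit(lambda: sum_with_iterrows(df), number=3)
print(f"itertuples: {t_tuples:.4f}s  iterrows: {t_rows:.4f}s")
```

Both helpers compute the same total, so the only thing being compared is the iteration overhead.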


3 Comments

How did you evaluate this to conclude that itertuples is faster than iterrows?
You can check this link; last week I also tested this with large DataFrames.
And also this
1

Here:

for row in df.itertuples():
    print(row.num_legs)
    # print(row.num_wings)  # other column values

# Output
4
2
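One caveat with attribute access: it relies on the column names being valid Python identifiers. When a name is not (for example, it contains a space), itertuples() renames that field to a positional name. The sketch below uses a hypothetical column called 'num legs' to show this.

```python
import pandas as pd

# 'num legs' (with a space) is a made-up column name for illustration;
# it is not a valid Python identifier.
df = pd.DataFrame({'num legs': [4, 2], 'num_wings': [0, 2]},
                  index=['dog', 'hawk'])

first = next(df.itertuples())

# The invalid name is replaced by a positional one (_1), because the
# column sits at position 1 after the Index field.
print(first._fields)

legs = [row._1 for row in df.itertuples()]
wings = [row.num_wings for row in df.itertuples()]
print(legs, wings)
```

If the headers come from a spreadsheet, renaming the columns to identifier-friendly names before iterating keeps row.column_name usable.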

5 Comments

Accepting this, since I was using itertuples to iterate over the DataFrame.
I tried the same approach when reading a CSV with read_csv; however, the first row after the comments in my CSV is not being treated as column names, and I get an exception when using row["columnHeader"].
That is a separate question, which you should raise on its own, but as a hint: play with the header argument.
I tried the header argument; unfortunately, the CSV has extra column data beyond the column headers, so parsing fails when the header argument is used.
@darth_coder Then I suggest you ask a separate question that covers only this problem, with a proper explanation.
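For the simple case mentioned in the comments, where a CSV starts with comment lines followed by a header row, a sketch along these lines may help. The file contents and the '#' comment marker are assumptions made up for illustration; it does not cover the commenter's case of rows with extra columns, whose layout is not described.

```python
import io

import pandas as pd

# Hypothetical CSV text: two comment lines, then the header row, then data.
raw = (
    "# exported by some tool\n"
    "# second comment line\n"
    "name,num_legs,num_wings\n"
    "dog,4,0\n"
    "hawk,2,2\n"
)

# comment='#' makes read_csv ignore fully commented lines, so header=0
# refers to the first remaining line, which holds the column names.
df = pd.read_csv(io.StringIO(raw), comment='#', header=0, index_col='name')

legs = [row.num_legs for row in df.itertuples()]
print(df.columns.tolist(), legs)
```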
1

You could use iterrows():

for u, row in df.iterrows():
    print(u)                # index label of the row
    print(row)              # the row as a Series
    print(row['num_legs'])  # access a value by column name

Output:

dog
num_legs     4
num_wings    0
Name: dog, dtype: int64
4
hawk
num_legs     2
num_wings    2
Name: hawk, dtype: int64
2
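One thing to keep in mind with iterrows(): each row is returned as a single Series, so columns of different dtypes are upcast to a common dtype. The sketch below adds a hypothetical float column, weight_kg, to show the integer column coming back as a float, while itertuples() keeps it an integer.

```python
import pandas as pd

# weight_kg is a made-up float column added only to get mixed dtypes.
df = pd.DataFrame({'num_legs': [4, 2], 'weight_kg': [30.5, 1.1]},
                  index=['dog', 'hawk'])

# iterrows(): the row is one Series, so int and float share a float dtype.
_, first_row = next(df.iterrows())
print(first_row.dtype, first_row['num_legs'])

# itertuples(): each field keeps the type it had in its own column.
tuple_row = next(df.itertuples())
print(type(tuple_row.num_legs), tuple_row.num_legs)
```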

1 Comment

This answer is also correct. I would now use iterrows rather than itertuples when coding, since the way the data is accessed mimics the array index operator.
