I want to calculate the frequency of a word in a sentence. My DataFrame has a "Title" column that contains one sentence (a string) per row. This is my current approach:

# num times queryWord is in sentence / num words in sentence
list = df['Title'].str.count(queryWord) / len(df['Title'].str.split())

However, len(df['Title'].str.split()) returns the length of the "Title" column rather than the length of the array that is generated by split() in each row. How do I fix this?

1 Answer

This should do the trick:

list = df['Title'].str.count(queryWord) / df['Title'].str.split().str.len()

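df['Title'].str.split() returns a pd.Series of list objects, so calling len() on it gives the number of rows in the Series, not the number of words in each row. Chaining .str.len() instead returns the length of each list, row by row. That's why this question was marked as a duplicate.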
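To make the difference concrete, here is a minimal sketch on a small made-up DataFrame (the sample titles and the value of queryWord are assumptions for illustration, not from the question). Note that .str.count treats its argument as a regular expression and counts every match, so a short word may also match inside longer words.

```python
import pandas as pd

# Hypothetical sample data; the real "Title" column comes from the asker's DataFrame.
df = pd.DataFrame({"Title": ["the cat sat on the mat", "a dog", "the end"]})
queryWord = "the"  # assumed example query

# len() on the Series gives the number of rows, not words per row.
rows_total = len(df["Title"].str.split())

# .str.len() gives the length of each list, i.e. words per row.
words_per_row = df["Title"].str.split().str.len()

# Per-row frequency: matches of queryWord divided by words in that row.
freq = df["Title"].str.count(queryWord) / words_per_row
print(freq)
```

If only whole-word matches should count, one option is to pass a pattern with word boundaries, such as r"\bthe\b", to .str.count.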

2 Comments

Thanks, that did it. What is the meaning of .str?
Glad it worked, please accept the answer. In pandas, a Series exposes its string methods through the .str accessor, as .str.method_name. They have to be accessed that way because some of them share a name with a Series method that does something different. Examples are pd.Series.str.replace, which does not work the same way as pd.Series.replace, and pd.Series.str.get, which does not work the same way as pd.Series.get.
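As a rough illustration of the name clash described in that comment, the sketch below compares the two pairs of methods on a made-up Series (the sample values are assumptions for illustration only).

```python
import pandas as pd

s = pd.Series(["cat", "concat", "dog"])  # hypothetical sample values

# Series.replace swaps whole values that equal "cat".
whole = s.replace("cat", "bird")

# Series.str.replace swaps the substring "cat" inside every string.
sub = s.str.replace("cat", "bird", regex=False)

# Series.str.get(0) takes the first character of each string.
first_chars = s.str.get(0)

# Series.get(0) looks up the element stored under index label 0.
first_item = s.get(0)

print(whole.tolist(), sub.tolist(), first_chars.tolist(), first_item)
```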
