I have a (very large) table stored in a pandas.DataFrame. It contains word counts from texts; the index is the word list:
          one.txt  third.txt  two.txt
a               1          1        0
i               0          0        1
is              1          1        1
no              0          0        1
not             0          1        0
really          1          0        0
sentence        1          1        1
short           2          0        0
think           0          0        1
I want to sort the word list by the total frequency of each word across all texts. I can easily create a Series that contains the summed frequency for each word (using the words as index). But how can I sort the DataFrame by this Series?
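To make the setup concrete, here is a minimal sketch that rebuilds the small example table and the frequency Series. The variable names `df` and `freq` are my own choices for illustration, and the real table is of course much larger.

```python
import pandas as pd

# Small stand-in for the real (very large) word-count table.
df = pd.DataFrame(
    {
        "one.txt":   [1, 0, 1, 0, 0, 1, 1, 2, 0],
        "third.txt": [1, 0, 1, 0, 1, 0, 1, 0, 0],
        "two.txt":   [0, 1, 1, 1, 0, 0, 1, 0, 1],
    },
    index=["a", "i", "is", "no", "not", "really", "sentence", "short", "think"],
)

# Sum across the text columns: one total per word, indexed by the word.
freq = df.sum(axis=1)
```

`freq` shares its index with `df`, so each total can still be looked up by word.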
One easy way would be to add the Series to the DataFrame as a column, sort on it, and then delete it. For performance reasons I would like to avoid this.
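For reference, the workaround I mean looks roughly like the sketch below, shown on three rows of the example table. The column name `total` is a hypothetical temporary name, and I am assuming a pandas version that has `DataFrame.sort_values`.

```python
import pandas as pd

# Three rows taken from the example table above.
df = pd.DataFrame(
    {"one.txt": [1, 0, 1], "third.txt": [1, 0, 1], "two.txt": [0, 1, 1]},
    index=["a", "i", "is"],
)

# 1. Add the per-word totals as a temporary column ("total" is a made-up name).
df["total"] = df.sum(axis=1)

# 2. Sort the rows on that column, most frequent word first.
df = df.sort_values("total", ascending=False)

# 3. Remove the helper column again.
df = df.drop(columns="total")
```

This keeps the words as the index, but it writes an extra column into the large table and then removes it again, which is the overhead I would like to avoid.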
Two other ways are described here, but one of them duplicates the DataFrame, which is a problem because of its size, and the other creates a new index, whereas I need the words themselves as the index further down the line.