How to re-number strings after sorting a dataframe

Question

Description: I have a GUI that lets the user add variables, which are displayed in a dataframe. As variables are added they are numbered automatically (e.g. 'FIELD_0', 'FIELD_1'), and each variable has a value associated with it. The data is row-based rather than column-based: the 'FIELD' ids are in column 0, running downwards, and the corresponding value is in column 1 of the same row, as shown below:

     0          1
0    FIELD_0    HH_5_MILES
1    FIELD_1    POP_5_MILES

The user is able to reorder these values and move them up/down a row. However, it's important that the number ordering remains sequential. So, if the user positions 'FIELD_1' above 'FIELD_0' then it gets re-numbered appropriately. Example:

     0          1
0    FIELD_0    POP_5_MILES
1    FIELD_1    HH_5_MILES

Currently, I'm using the below code to perform this adjustment - this same re-numbering occurs with other variable names within the same dataframe.

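import pandas

df = pandas.DataFrame({0: ['FIELD_1', 'FIELD_0']})
variable_list = ['FIELD', 'OPERATOR', 'RESULT']

for var in variable_list:
    # Candidate names VAR_0 .. VAR_(n-1), one per row of the dataframe
    field_list = ['%s_%s' % (var, i) for i in range(len(df))]
    field_count = 0

    # Walk the rows top to bottom (assumes the default 0..n-1 index)
    for idx, field_name in enumerate(df.loc[:, 0]):
        if var in field_name:
            # Hand out the next unused number for this variable name
            df.loc[idx, 0] = field_list[field_count]
            field_count += 1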

This gets me the result I want, but it seems a bit inelegant. If there is a better way, I'd love to know what it is.

Maybe I'm not understanding the constraints here, but your output is just the same as df.sort_values(0).reset_index(drop=True)
– Randy, Commented Jul 30, 2021 at 19:39
I updated the description to hopefully be a bit clearer about the data within the dataframe. Unfortunately it's not simply a matter of sorting the dataframe: each variable name has to be numbered by its position relative to other variables of the same name, while intermixed with other variable names that are also ordered.
– spareTimeCoder, Commented Jul 31, 2021 at 16:00

Henry Ecker · Accepted Answer · 2021-08-05 16:59:00Z

It appears you're looking to overwrite the FIELD values so that they always appear in order, starting from 0.

We can build a boolean mask of only the rows where str.contains matches the word FIELD, then overwrite those rows with a list comprehension like field_list.

import pandas as pd

# Modified DF
df = pd.DataFrame({0: ['FIELD_1', 'OTHER_1', 'FIELD_0', 'OTHER_0']})

# Select Where Values are Field
m = df[0].str.contains('FIELD')
# Overwrite field with new values by iterating over the total matches
df.loc[m, 0] = [f'FIELD_{n}' for n in range(m.sum())]

print(df)

df:

         0
0  FIELD_0
1  OTHER_1
2  FIELD_1
3  OTHER_0
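
One caveat with the mask above: str.contains treats its argument as a regular expression by default and matches anywhere in the string, so 'FIELD' would also match a value such as 'SUBFIELD_0'. If that matters for your data, a stricter mask could look like the sketch below. The helper name renumber and the 'SUBFIELD_0' row are made up for illustration, and str.fullmatch needs pandas 1.1 or later.

```python
import re
import pandas as pd

def renumber(df, prefix, col=0):
    """Hypothetical helper: renumber '<prefix>_<digits>' values in row order."""
    # fullmatch requires the whole string to be '<prefix>_<digits>',
    # so values like 'SUBFIELD_0' are left untouched
    mask = df[col].str.fullmatch(rf'{re.escape(prefix)}_\d+')
    df.loc[mask, col] = [f'{prefix}_{n}' for n in range(mask.sum())]
    return df

# Illustrative data, not from the question
df = pd.DataFrame({0: ['FIELD_1', 'SUBFIELD_0', 'FIELD_0']})
renumber(df, 'FIELD')
print(df[0].tolist())  # ['FIELD_0', 'SUBFIELD_0', 'FIELD_1']
```

With plain str.contains('FIELD'), all three rows would match and 'SUBFIELD_0' would be overwritten as well.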

For multiple variables:

import pandas as pd

# Modified DF
df = pd.DataFrame({0: ['FIELD_1', 'OTHER_1', 'FIELD_0', 'OTHER_0']})

variable_list = ['FIELD', 'OTHER']

for v in variable_list:
    # Select the rows whose value contains the current variable name
    m = df[0].str.contains(v)
    # Overwrite those rows with v_0, v_1, ... in top-to-bottom order
    df.loc[m, 0] = [f'{v}_{n}' for n in range(m.sum())]

print(df)

df:

         0
0  FIELD_0
1  OTHER_0
2  FIELD_1
3  OTHER_1
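
If you would rather not maintain variable_list at all, the same renumbering can probably be done without a loop by grouping on the name part and numbering the rows within each group. This is only a sketch, and it assumes every value in column 0 has the form '<name>_<number>':

```python
import pandas as pd

df = pd.DataFrame({0: ['FIELD_1', 'OTHER_1', 'FIELD_0', 'OTHER_0']})

# Assumption: every value looks like '<name>_<number>'.
# Split once from the right and keep the name part.
prefix = df[0].str.rsplit('_', n=1).str[0]

# cumcount numbers the rows inside each name group 0, 1, 2, ... in row order
df[0] = prefix + '_' + prefix.groupby(prefix).cumcount().astype(str)

print(df[0].tolist())  # ['FIELD_0', 'OTHER_0', 'FIELD_1', 'OTHER_1']
```

This gives the same result as the loop for this data, and it picks up any new variable name automatically.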

IoaTzimas · Accepted Answer · 2021-07-30 20:03:45Z

You can use sort_values with a key function (the key argument needs pandas 1.1 or later), as below:

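def f(x):
    # Take the number after the first underscore, e.g. 'FIELD_1' -> 1
    suffix = x.split('_')[1]
    return int(suffix)

# Sort column 0 by that number, then rebuild the index as 0..n-1
df.sort_values(0, key=lambda col: [f(k) for k in col]).reset_index(drop=True)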

         0
0  FIELD_0
1  FIELD_1
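
The reason for sorting on the integer rather than on the string itself shows up once the numbers reach two digits. A small sketch with made-up data, using a vectorized key in place of f:

```python
import pandas as pd

# Illustrative data, not from the question
df = pd.DataFrame({0: ['FIELD_10', 'FIELD_2', 'FIELD_1']})

# Plain string sort compares character by character, so '10' lands before '2'
lexicographic = df.sort_values(0)[0].tolist()

# Sorting on the integer suffix gives numeric order instead
numeric = df.sort_values(
    0, key=lambda col: col.str.rsplit('_', n=1).str[-1].astype(int)
)[0].tolist()

print(lexicographic)  # ['FIELD_1', 'FIELD_10', 'FIELD_2']
print(numeric)        # ['FIELD_1', 'FIELD_2', 'FIELD_10']
```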

edited Jul 30, 2021 at 20:03

answered Jul 30, 2021 at 19:49
