Description: I have a GUI that lets the user add variables, which are displayed in a dataframe. As variables are added they are numbered automatically (e.g. 'FIELD_0', 'FIELD_1'), and each variable has a value associated with it. The data is row-based rather than column-based: the 'FIELD' ids are in column 0, running downwards, and each corresponding value is in column 1 of the same row, as shown below:

     0          1
0    FIELD_0    HH_5_MILES
1    FIELD_1    POP_5_MILES

The user is able to reorder these values and move them up/down a row. However, it's important that the number ordering remains sequential. So, if the user positions 'FIELD_1' above 'FIELD_0' then it gets re-numbered appropriately. Example:

     0          1
0    FIELD_0    POP_5_MILES
1    FIELD_1    HH_5_MILES

Currently, I'm using the code below to perform this adjustment. The same re-numbering is applied to other variable names within the same dataframe.

import pandas

df = pandas.DataFrame({0: ['FIELD_1', 'FIELD_0']})
variable_list = ['FIELD', 'OPERATOR', 'RESULT']

for var in variable_list:
    field_list = ['%s_%s' % (var, _) for _, field_name in enumerate(df[0].isin([var]))]
    field_count = 0

    for _, field_name in enumerate(df.loc[:, 0]):
        if var in field_name:
            df.loc[_, 0] = field_list[field_count]
            field_count += 1

This gets me the result I want, but it seems a bit inelegant. If there is a better way, I'd love to know what it is.

  • Maybe I'm not understanding the constraints here, but your output is just the same as df.sort_values(0).reset_index(drop=True). Commented Jul 30, 2021 at 19:39
  • I updated the description to hopefully be a bit clearer about the data within the dataframe. Unfortunately it's not simply a matter of sorting the dataframe: each variable name has to be renumbered based on its position relative to other variables of the same name, while being intermixed with other variable names that are also ordered. Commented Jul 31, 2021 at 16:00

2 Answers

It appears you're looking to overwrite the FIELD values so that they are always numbered in row order, starting from 0.

We can use str.contains to build a boolean mask of the rows whose value contains the word FIELD, then assign to those rows a list of newly numbered names built with a list comprehension (like your field_list).

import pandas as pd

# Modified DF
df = pd.DataFrame({0: ['FIELD_1', 'OTHER_1', 'FIELD_0', 'OTHER_0']})

# Select Where Values are Field
m = df[0].str.contains('FIELD')
# Overwrite field with new values by iterating over the total matches
df.loc[m, 0] = [f'FIELD_{n}' for n in range(m.sum())]

print(df)

df:

         0
0  FIELD_0
1  OTHER_1
2  FIELD_1
3  OTHER_0
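
One caveat worth keeping in mind: str.contains does substring (regex) matching, so the mask for 'FIELD' would also pick up a name such as 'SUBFIELD_0'. The sketch below is one way to guard against that, assuming every name has the form PREFIX_<digits>; the 'SUBFIELD_0' row is a made-up example, not from the question.

```python
import pandas as pd

# 'SUBFIELD_0' is a hypothetical name used only to show the difference
df = pd.DataFrame({0: ['FIELD_1', 'SUBFIELD_0', 'FIELD_0']})

# fullmatch requires the whole string to match the pattern,
# so only names of the exact form FIELD_<digits> are selected
m = df[0].str.fullmatch(r'FIELD_\d+')

# Renumber the selected rows in their current row order
df.loc[m, 0] = [f'FIELD_{n}' for n in range(m.sum())]

print(df)
```

With str.contains('FIELD') the middle row would have been renamed as well; with the anchored pattern it is left untouched.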

For multiple variables:

import pandas as pd

# Modified DF
df = pd.DataFrame({0: ['FIELD_1', 'OTHER_1', 'FIELD_0', 'OTHER_0']})

variable_list = ['FIELD', 'OTHER']

for v in variable_list:
    # Select the rows whose value contains the current variable name
    m = df[0].str.contains(v)
    # Overwrite those rows with new names, numbered in row order
    df.loc[m, 0] = [f'{v}_{n}' for n in range(m.sum())]

print(df)

df:

         0
0  FIELD_0
1  OTHER_0
2  FIELD_1
3  OTHER_1
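
If the list of variable names is not known in advance, the loop can possibly be avoided altogether. The following is a sketch, not part of the approach above, and it assumes every value in column 0 ends in an underscore followed by a number: the prefix is taken as everything before the last underscore, and groupby with cumcount numbers the rows within each prefix in row order.

```python
import pandas as pd

df = pd.DataFrame({0: ['FIELD_1', 'OTHER_1', 'FIELD_0', 'OTHER_0']})

# Prefix = everything before the last underscore (assumes NAME_<number>)
prefix = df[0].str.rsplit('_', n=1).str[0]

# cumcount gives 0, 1, 2, ... within each prefix group, in current row order
new_number = prefix.groupby(prefix).cumcount()

# Rebuild the names from prefix and new number
df[0] = prefix + '_' + new_number.astype(str)

print(df)
```

This renumbers every prefix found in the column, so it is only appropriate when all rows are meant to be renumbered.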

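You can use sort_values with a key that extracts the numeric suffix. Note that this reorders whole rows rather than renaming the values in place:

import pandas as pd

df = pd.DataFrame({0: ['FIELD_1', 'FIELD_0']})

def f(x):
    # Integer after the last underscore, e.g. 'FIELD_1' -> 1
    return int(x.rsplit('_', 1)[1])

df.sort_values(0, key=lambda col: col.map(f)).reset_index(drop=True)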

         0
0  FIELD_0
1  FIELD_1
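
To make the difference between sorting and renumbering concrete, here is a small sketch using the two-column example from the question. The helper name renumber and its col parameter are hypothetical, introduced only for illustration; the sketch assumes names have the form PREFIX_<number>.

```python
import pandas as pd

def renumber(df, variables, col=0):
    """Return a copy with names in `col` renumbered per variable, in row order."""
    out = df.copy()
    for v in variables:
        # Rows whose name starts with e.g. 'FIELD_'
        m = out[col].str.startswith(f'{v}_')
        if not m.any():
            continue  # nothing to renumber for this variable
        out.loc[m, col] = [f'{v}_{n}' for n in range(m.sum())]
    return out

# State after the user has moved FIELD_1 above FIELD_0
df = pd.DataFrame({0: ['FIELD_1', 'FIELD_0'],
                   1: ['POP_5_MILES', 'HH_5_MILES']})

# Renumbering keeps the user's row order and only rewrites column 0
renumbered = renumber(df, ['FIELD', 'OPERATOR', 'RESULT'])

# Sorting moves whole rows, which puts column 1 back in its old order
sorted_df = df.sort_values(0).reset_index(drop=True)

print(renumbered)
print(sorted_df)
```

In the renumbered frame POP_5_MILES stays in the first row, which matches the result asked for in the question; in the sorted frame HH_5_MILES returns to the first row.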
