Date        Train Number  Station 1  Station 2  Equipment Available?
2022-06-16  1111          North      Central    Y
2022-06-20  1111          North      Central    Y
2022-06-01  2222          North      South      Y
2022-06-02  2222          North      South      Y
2022-06-03  2222          North      South      Y
2022-06-04  2222          North      South      Y
2022-06-05  2222          North      South      Y
2022-06-06  2222          North      South      Y
2022-06-07  2222          North      South      Y
2022-06-08  2222          North      South      Y

I have a Pandas DataFrame like the one above, sorted by Train Number and then by Date. I would like to print a simple statement from Python that says:

"For Train Number 1111 North to Central, we have equipment available on June 16th and June 20th.

For Train Number 2222 North to South, we have equipment available from June 1st to June 8th."

How can I do this?

  • What have you tried so far? stackoverflow.com/help/how-to-ask Commented Jun 2, 2022 at 14:34
  • I have tried putting the Train Numbers as keys in a dictionary with the dates as values, and then printing what I want by iterating through the keys and values, but I feel this is not the best way to do it. Commented Jun 2, 2022 at 14:44
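
For reference, the dictionary approach described in the comment above might look roughly like the sketch below. The sample frame is abbreviated, its column names are assumed from the table header in the question, and `dates_by_train` is a hypothetical name.

```python
import pandas as pd

# Abbreviated sample mirroring the question's table (column names assumed from its header).
df = pd.DataFrame({
    "Date": ["2022-06-16", "2022-06-20", "2022-06-01", "2022-06-02"],
    "Train Number": [1111, 1111, 2222, 2222],
    "Station 1": ["North", "North", "North", "North"],
    "Station 2": ["Central", "Central", "South", "South"],
    "Equipment Available?": ["Y", "Y", "Y", "Y"],
})

# Train Number -> list of dates, built in one pass with groupby.
dates_by_train = df.groupby("Train Number")["Date"].apply(list).to_dict()

# Iterate over keys and values, as the comment describes.
for train, dates in dates_by_train.items():
    print(f"Train {train}: {', '.join(dates)}")
```

This works, but the station names have to be looked up separately, which is one reason a groupby over all the identifying columns tends to be tidier.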

3 Answers

1

I've made a little function which you can call on whatever df you want.

I find this solution more readable and flexible for further requests.

def equip_avail(df):
  # Print one sentence per train: earliest and latest date plus the two stations.
  for i in df['Train Number'].unique():
    rows = df.loc[df['Train Number'] == i]
    date_start = rows['Date'].min()
    date_end = rows['Date'].max()
    # The column names contain a space, so attribute access (df.Station1) would fail.
    from_start = rows['Station 1'].values[0]
    to_end = rows['Station 2'].values[0]
    print(f'For Train Number {i} {from_start} to {to_end}, we have equipment available from {date_start} to {date_end}.')

Then you call it like this:

equip_avail(df)

Result:

For Train Number 1111 North to Central, we have equipment available from 2022-06-16 to 2022-06-20.
For Train Number 2222 North to South, we have equipment available from 2022-06-01 to 2022-06-08.
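
Note that the function above prints "from ... to ..." for every train, whereas the question asks for "on ... and ..." when the dates are not consecutive (train 1111). A minimal sketch of one way to tell the two cases apart follows. It assumes the column names from the question's table header, and `describe_availability` and `join_dates` are hypothetical names, not part of pandas.

```python
import pandas as pd

def join_dates(parts):
    """Join ['A', 'B', 'C'] as 'A, B and C'."""
    if len(parts) == 1:
        return parts[0]
    return ", ".join(parts[:-1]) + " and " + parts[-1]

def describe_availability(df):
    """One sentence per train; 'from A to B' only when the dates are consecutive days."""
    out = df.copy()
    out["Date"] = pd.to_datetime(out["Date"])
    sentences = []
    for (train, start, end), grp in out.groupby(["Train Number", "Station 1", "Station 2"]):
        dates = grp["Date"].sort_values().drop_duplicates()
        # Consecutive means every gap between neighbouring dates is exactly one day.
        gaps = dates.diff().dropna()
        consecutive = len(dates) > 1 and bool((gaps == pd.Timedelta(days=1)).all())
        # Month names come from strftime and therefore depend on the locale.
        labels = [f"{d.strftime('%B')} {d.day}" for d in dates]
        if consecutive:
            when = f"from {labels[0]} to {labels[-1]}"
        else:
            when = "on " + join_dates(labels)
        sentences.append(
            f"For Train Number {train} {start} to {end}, we have equipment available {when}."
        )
    return sentences

# Sample data mirroring the table in the question.
df = pd.DataFrame({
    "Date": ["2022-06-16", "2022-06-20"] + [f"2022-06-{d:02d}" for d in range(1, 9)],
    "Train Number": [1111] * 2 + [2222] * 8,
    "Station 1": ["North"] * 10,
    "Station 2": ["Central"] * 2 + ["South"] * 8,
})

sentences = describe_availability(df)
for s in sentences:
    print(s)
```

A train whose dates contain several separate runs would simply get every date listed here; splitting into multiple ranges would need extra grouping on the gaps.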

0

You could get the min and max dates for each train with a groupby, dedupe the DataFrame to get the other columns (as they are repeated for every row of a train), and then print the results with some datetime formatting:

import pandas as pd

# Parse the dates so that strftime is available when printing.
df['Date'] = pd.to_datetime(df['Date'])

# Earliest and latest date per train.
g = df.groupby(['Train Number']).agg(date_min=pd.NamedAgg(column='Date', aggfunc='min'), date_max=pd.NamedAgg(column='Date', aggfunc='max'))

# The remaining columns repeat per train, so keep one row each; this must exist before the join.
df_deduped = df.loc[:, 'Train Number':].drop_duplicates().set_index('Train Number')
g = g.join(df_deduped, how='inner')

for index, values in g.reset_index().iterrows():
    print(f'For Train Number {values["Train Number"]}, {values["Station 1"]} to {values["Station 2"]}, we have equipment available from {values["date_min"].strftime("%b %d")} to {values["date_max"].strftime("%b %d")}')

The output is:

For Train Number 1111, North to Central, we have equipment available from Jun 16 to Jun 20
For Train Number 2222, North to South, we have equipment available from Jun 01 to Jun 08
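
The question asks for dates like "June 16th", while `%b %d` yields "Jun 16". As a minimal sketch, a small helper can append the ordinal suffix. `ordinal_day` is a hypothetical name, and the month name comes from `strftime('%B')`, so an English locale is assumed.

```python
from datetime import date

def ordinal_day(d):
    """Format a date (or pandas Timestamp) as e.g. 'June 1st' or 'June 16th'."""
    day = d.day
    # 11, 12 and 13 take 'th' even though they end in 1, 2 and 3.
    if 11 <= day % 100 <= 13:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(day % 10, "th")
    return f"{d.strftime('%B')} {day}{suffix}"

print(ordinal_day(date(2022, 6, 1)))
print(ordinal_day(date(2022, 6, 16)))
```

In the loop above, `values["date_min"].strftime("%b %d")` could then be swapped for `ordinal_day(values["date_min"])`.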


0

Here is one way to do it: group by Train Number, Station 1 and Station 2, taking both the min and the max of the dates, and then print each row of the DataFrame that the groupby returns.

# Column names must match the DataFrame exactly, including the spaces.
df2 = df.groupby(['Train Number', 'Station 1', 'Station 2'])['Date'].agg(['min', 'max']).reset_index()
for idx, row in df2.iterrows():
    # Access by label; positional row[0] on a labelled Series is deprecated in recent pandas.
    print("For Train Number {0} {1} to {2}, we have equipment available on {3} and {4}".format(
          row['Train Number'], row['Station 1'], row['Station 2'], row['min'], row['max']))

Output:

For Train Number 1111 North to Central, we have equipment available on 2022-06-16 and 2022-06-20
For Train Number 2222 North to South, we have equipment available on 2022-06-01 and 2022-06-08
