How to iterate over the values in a list within a loop

Question

I'm very new to Python, so apologies in advance if my question has already been asked.

I have a large dataset, k_cc, that contains degree sequences for different years. The length of the degree sequence varies from year to year. I am trying to generate a series of configuration models from these degree sequences for every year present in the data, so that I can extract a couple of measures I need for my analyses. I know how to run the code for one year, but I don't know how to loop over the years, since their sequences have different lengths.

Below is a reproducible example of my problem, shown for one year.

import networkx as nx
import pandas as pd

# Data
k_cc = {'degree':  [4,4,6,3,7,8,6,3,5,1,4,2,8,9,4],
        'Year': [1990, 1990, 1990, 1991, 1991, 1991, 1992, 1992, 1992, 1992, 1992, 1993, 1993, 1993, 1994]}
k_cc = pd.DataFrame(k_cc)

k_cc
Out[13]: 
    degree  Year
0        4  1990
1        4  1990
2        6  1990
3        3  1991
4        7  1991
5        8  1991
6        6  1992
7        3  1992
8        5  1992
9        1  1992
10       4  1992
11       2  1993
12       8  1993
13       9  1993
14       4  1994

# Analyses for one year
k_cc_1990 = k_cc[k_cc['Year']==1990]
k_cc_1990 = k_cc_1990["degree"]
k_cc_1990 = k_cc_1990.values.tolist()

# Generate a configuration model
net_meas_random = pd.DataFrame(columns = ['cluscoef','avlen'])

for i in range(10):                                   
    cm = nx.configuration_model(k_cc_1990)                      
    cm = nx.Graph(cm)                              
    cm.remove_edges_from( nx.selfloop_edges(cm) )    
    net_meas_random.loc[i,'cluscoef'] = nx.average_clustering(cm)
    Gcc_cm = sorted(nx.connected_components(cm), key=len, reverse=True )   
    H_cm = cm.subgraph(Gcc_cm[0]).copy()
    net_meas_random.loc[i,'avlen'] = nx.average_shortest_path_length(H_cm)

results = {'Mean_Clus_Coeff':  [net_meas_random['cluscoef'].mean()],
        'StdDev_Clus_Coeff': [net_meas_random['cluscoef'].std()],
        'Mean_ave_short_path_leng':  [net_meas_random['avlen'].mean()],
        'StdDev_ave_short_path_leng': [net_meas_random['avlen'].std()],
        'Year': [1990]}
results = pd.DataFrame(results)

Many thanks in advance for any tips!

Did you provide the actual data? nx.configuration_model() needs a degree sequence whose sum is even.
– AveragePythonEnjoyer, Commented Jul 29, 2022 at 9:05
In your example there is a typo in the degree values. In addition, when I run your code I get the following error: Invalid degree sequence: sum of degrees must be even, not odd. Which part of your code has a problem with different lengths? I did not see any hard-coded values at first glance.
– Jacob, Commented Jul 29, 2022 at 9:07
Sorry, I had created a toy dataset that looked like mine, and I hadn't checked whether it ran. I edited the question, and this new sample dataset runs.
– Djoustaine, Commented Jul 29, 2022 at 9:13

Jacob · Accepted Answer · 2022-08-01 09:50:46Z


If your second code example works for every given year, you could do the following:

1. Define a function that does your analyses:

def eval_seq(data, year):
    k_cc=data
    #Put the second code here. 
    return results

2. Call your function in a loop:

results={} # dict for storing all results
for year in sorted(list(set(k_cc['Year']))): #get a List of all years in your dataset
    results[year]=eval_seq(k_cc, year)
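
As an alternative sketch of the same loop, pandas can split the frame by year with groupby, so each year's degree sequence arrives as its own list whatever its length. The data below is the question's toy data, and degree_seqs is a name introduced here purely for illustration:

```python
import pandas as pd

# Toy data copied from the question (illustrative values only).
k_cc = pd.DataFrame({
    'degree': [4, 4, 6, 3, 7, 8, 6, 3, 5, 1, 4, 2, 8, 9, 4],
    'Year': [1990, 1990, 1990, 1991, 1991, 1991, 1992, 1992,
             1992, 1992, 1992, 1993, 1993, 1993, 1994],
})

# groupby yields one (year, sub-frame) pair per distinct year, sorted by year.
degree_seqs = {}  # hypothetical name: year -> list of degrees
for year, group in k_cc.groupby('Year'):
    degree_seqs[int(year)] = group['degree'].tolist()

for year, seq in degree_seqs.items():
    print(year, len(seq), seq)
```

Each list could then be passed to the per-year analysis in place of filtering the frame inside the function.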

EDIT

I was not able to reproduce your error. However, the example data was still wrong. Please note the modifications below:

import networkx as nx
import pandas as pd

# Data
data = {'degree':  [4,4,6,3,7,8,6,3,5,1,5,2,8,8,4],
        'Year': [1990, 1990, 1990, 1991, 1991, 1991, 1992, 1992, 1992, 1992, 1992, 1993, 1993, 1993, 1994]}
k_cc = pd.DataFrame(data)

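Two values were changed (the last 1992 degree from 4 to 5, and the last 1993 degree from 9 to 8) so that each year's degrees sum to an even number, which nx.configuration_model() requires.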
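
A quick way to spot which years would be rejected is to sum the degrees per year and test the parity. This is a minimal sketch using the question's original toy data; degree_sums and odd_years are names introduced here for illustration:

```python
import pandas as pd

# The question's original toy data, before the two values were corrected.
k_cc = pd.DataFrame({
    'degree': [4, 4, 6, 3, 7, 8, 6, 3, 5, 1, 4, 2, 8, 9, 4],
    'Year': [1990, 1990, 1990, 1991, 1991, 1991, 1992, 1992,
             1992, 1992, 1992, 1993, 1993, 1993, 1994],
})

# Sum of degrees per year; a configuration model needs an even sum.
degree_sums = k_cc.groupby('Year')['degree'].sum()

# Years whose degree sum is odd and would raise "Invalid degree sequence".
odd_years = degree_sums[degree_sums % 2 == 1].index.tolist()
print(odd_years)  # [1992, 1993]
```

Running a check like this on the real dataset before the loop shows up front which years need attention.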

def eval_seq(data, year):
    # Same analysis as in the question, with the hard-coded 1990 replaced by year

    # Degree sequence for the requested year
    degree_seq = data.loc[data['Year'] == year, 'degree'].tolist()

    # Generate 10 configuration models and record two measures for each
    net_meas_random = pd.DataFrame(columns=['cluscoef', 'avlen'])

    for i in range(10):
        cm = nx.configuration_model(degree_seq)
        cm = nx.Graph(cm)                            # collapse parallel edges
        cm.remove_edges_from(nx.selfloop_edges(cm))  # drop self-loops
        net_meas_random.loc[i, 'cluscoef'] = nx.average_clustering(cm)
        # Path length is only defined on a connected graph, so use the largest component
        Gcc_cm = sorted(nx.connected_components(cm), key=len, reverse=True)
        H_cm = cm.subgraph(Gcc_cm[0]).copy()
        net_meas_random.loc[i, 'avlen'] = nx.average_shortest_path_length(H_cm)

    # Results are scalars instead of one-element lists
    results = {'Mean_Clus_Coeff': net_meas_random['cluscoef'].mean(),
               'StdDev_Clus_Coeff': net_meas_random['cluscoef'].std(),
               'Mean_ave_short_path_leng': net_meas_random['avlen'].mean(),
               'StdDev_ave_short_path_leng': net_meas_random['avlen'].std(),
               'Year': year}
    return results

I removed the [] around the values in your results to simplify things; there is no need for one-element lists.

results={} # dict for storing all results
for year in sorted(list(set(k_cc['Year'].values.tolist()))): #get a List of all years in your dataset
    results[year]=eval_seq(k_cc, year)
print(results)
df=pd.DataFrame(results)
df.head()

This will run without error and the result is also converted into a DataFrame.

                                1990      1991      1992      1993      1994
Mean_Clus_Coeff                  0.5       0.7  0.383333       0.4         0
StdDev_Clus_Coeff           0.527046  0.483046  0.279881  0.516398         0
Mean_ave_short_path_leng     1.16667       1.1       1.5       1.2         0
StdDev_ave_short_path_leng  0.175682  0.161015  0.105409  0.172133         0
Year                            1990      1991      1992      1993      1994
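
If one row per year is more convenient than one column per year, the dictionary of per-year results can be loaded with orient='index'. This is a sketch, not part of the original answer: the two entries below are copied from the 1990 and 1991 columns of the output above (the values come from random graphs and will differ between runs), and df_by_year is a hypothetical name:

```python
import pandas as pd

# Stand-in for the dict built by the loop: year -> dict of scalar measures.
# Values copied from the 1990 and 1991 columns of the output shown above.
results = {
    1990: {'Mean_Clus_Coeff': 0.5, 'StdDev_Clus_Coeff': 0.527046,
           'Mean_ave_short_path_leng': 1.16667,
           'StdDev_ave_short_path_leng': 0.175682, 'Year': 1990},
    1991: {'Mean_Clus_Coeff': 0.7, 'StdDev_Clus_Coeff': 0.483046,
           'Mean_ave_short_path_leng': 1.1,
           'StdDev_ave_short_path_leng': 0.161015, 'Year': 1991},
}

# orient='index' turns the outer keys (years) into rows
# and the inner keys (measure names) into columns.
df_by_year = pd.DataFrame.from_dict(results, orient='index')
print(df_by_year)
```

Calling df.T on the frame built in the answer gives the same layout.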

edited Aug 1, 2022 at 9:50

answered Jul 29, 2022 at 9:14


4 Comments

Djoustaine Over a year ago

I ran it, but I get this error: Unsupported operand type(s) for +: 'int' and 'str'.

Jacob Over a year ago

In which part of your code did you get this error?

Djoustaine Over a year ago

In the last part, when I call the function in a loop.

Djoustaine Over a year ago

Thank you so much! I've run it on the original code and it works perfectly (and fast) :)
