I'm very new to Python, so apologies in advance if my question has already been asked.
I have a large dataset, k_cc, that contains degree sequences for different years. Sometimes, the length of the degree sequences for each year vary. I am trying to generate a series of configuration models using these degree sequences over all years present in the data, so that I can extract a couple of measures I need for my analyses. I know how to run the code for one year, but I don't know how to loop over the years, since their lengths vary.
Below is a reproducible example of my problem, shown for one year.
import networkx as nx
import pandas as pd
# Data
k_cc = {'degree': [4,4,6,3,7,8,6,3,5,1,4,2,8,9,4],
'Year': [1990, 1990, 1990, 1991, 1991, 1991, 1992, 1992, 1992, 1992, 1992, 1993, 1993, 1993, 1994]}
k_cc = pd.DataFrame(k_cc)
k_cc
Out[13]:
degree Year
0 3 1990
1 4 1990
2 6 1990
3 3 1991
4 7 1991
5 8 1991
6 6 1992
7 3 1992
8 5 1992
9 1 1992
10 4 1992
11 2 1993
12 8 1993
13 9 1993
14 4 1994
# Analyses for one year
k_cc_1990 = k_cc[k_cc['Year']==1990]
k_cc_1990 = k_cc_1990["degree"]
k_cc_1990 = k_cc_1990.values.tolist()
# Generate a configuration model
net_meas_random = pd.DataFrame(columns = ['cluscoef','avlen'])
for i in range(10):
cm = nx.configuration_model(k_cc_1990)
cm = nx.Graph(cm)
cm.remove_edges_from( nx.selfloop_edges(cm) )
net_meas_random.loc[i,'cluscoef'] = nx.average_clustering(cm)
Gcc_cm = sorted(nx.connected_components(cm), key=len, reverse=True )
H_cm = cm.subgraph(Gcc_cm[0]).copy()
net_meas_random.loc[i,'avlen'] = nx.average_shortest_path_length(H_cm)
results = {'Mean_Clus_Coeff': [net_meas_random['cluscoef'].mean()],
'StdDev_Clus_Coeff': [net_meas_random['cluscoef'].std()],
'Mean_ave_short_path_leng': [net_meas_random['avlen'].mean()],
'StdDev_ave_short_path_leng': [net_meas_random['avlen'].std()],
'Year': [1990]}
results = pd.DataFrame(results)
Many thanks in advance for any tips!
nx.configuration_model()which has an even sum.degree. In addition, running your code I get the following error:Invalid degree sequence: sum of degrees must be even, not odd. At which part of your code there is a problem when using different lengths? I have not seen any hard-coded values on first view.