We have the following dtypes in our pandas dataframe:
>>> results_df.dtypes
_id               int64
playerId          int64
leagueId          int64
firstName         object
lastName          object
fullName          object
shortName         object
gender            object
nickName          object
height            float64
jerseyNum         object
position          object
teamId            int64
updated           datetime64[ns, UTC]
teamMarket        object
conferenceId      int64
teamName          object
updatedDate       object
competitionIds    object
dtype: object
The object dtype is not helpful in the .dtypes output here, since some object columns hold ordinary strings (e.g. firstName, lastName), whereas others hold more complex values (each value in competitionIds is a numpy.ndarray of int64s).
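To see what an object column actually holds, one option is to collect the Python types of its elements. The snippet below is only a rough sketch: the small results_df is made-up stand-in data, and element_types is a hypothetical helper name, not part of pandas.

```python
import numpy as np
import pandas as pd

# Made-up stand-in for results_df; the real table has many more columns.
results_df = pd.DataFrame({
    "playerId": [1, 2],
    "firstName": pd.Series(["Ann", "Bob"], dtype=object),
    "competitionIds": pd.Series([np.array([10, 20]), np.array([30])], dtype=object),
})

def element_types(df):
    """Map each object-dtype column to the set of type names found in it."""
    object_cols = [col for col in df.columns if df[col].dtype == object]
    return {col: {type(v).__name__ for v in df[col]} for col in object_cols}

types_by_column = element_types(results_df)
print(types_by_column)
```

Columns with a non-object dtype (playerId here) are skipped, so only the ambiguous columns are reported.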
We'd like to convert competitionIds, and any other columns that hold numpy.ndarray values, into list columns without naming them explicitly, since it isn't always known in advance which columns hold arrays. This works: results_df['competitionIds'] = results_df['competitionIds'].apply(list), but it doesn't fully solve the problem because competitionIds is passed explicitly, whereas we need to detect the numpy.ndarray columns automatically.
all(isinstance(x, np.ndarray) for x in column_that's_object) or so?
column_that's_object here would be a list of column names?
np.ndarray. what does type(...) for the first element of the column give?
should be consistent but there is some missing data in these tables. competitionIds in particular has empty / missing values.
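Building on the isinstance idea from the comments, here is a minimal sketch of one way to detect the array columns automatically while tolerating missing values. It assumes missing entries show up as None or NaN; the sample data and the helper names (is_missing, find_ndarray_columns, ndarrays_to_lists) are hypothetical.

```python
import numpy as np
import pandas as pd

# Made-up stand-in for results_df, including a missing competitionIds value.
results_df = pd.DataFrame({
    "playerId": [1, 2, 3],
    "firstName": pd.Series(["Ann", "Bob", None], dtype=object),
    "competitionIds": pd.Series(
        [np.array([10, 20]), None, np.array([30])], dtype=object
    ),
})

def is_missing(value):
    """Assumption: missing entries are None or a float NaN."""
    return value is None or (isinstance(value, float) and np.isnan(value))

def find_ndarray_columns(df):
    """Return object columns whose non-missing values are all np.ndarray."""
    found = []
    for col in df.columns:
        if df[col].dtype != object:
            continue
        present = [v for v in df[col] if not is_missing(v)]
        # Require at least one real value so all-missing columns are skipped.
        if present and all(isinstance(v, np.ndarray) for v in present):
            found.append(col)
    return found

def ndarrays_to_lists(df):
    """Return a copy with ndarray values converted to lists; missing values stay as they are."""
    out = df.copy()
    for col in find_ndarray_columns(out):
        out[col] = out[col].apply(
            lambda v: v.tolist() if isinstance(v, np.ndarray) else v
        )
    return out

ndarray_cols = find_ndarray_columns(results_df)
converted = ndarrays_to_lists(results_df)
print(ndarray_cols)
```

Note that .tolist() yields plain Python ints, whereas .apply(list) keeps numpy scalars inside the list; either may be acceptable depending on what consumes the data.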