I have a dataframe containing a column 'metrics' whose rows look like this:

[{id=1,name=XYZ,value=3}, {id=2,name=KJH,value=2}]
[{id=4,name=ABC,value=7}, {id=8,name=HGS,value=9}]
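For reference, the shape of the data can be reproduced with something like this (the column name and values come from the sample above; the construction itself is just illustrative):

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Two rows, each holding one of the strings shown above
df = spark.createDataFrame(
    [
        ("[{id=1,name=XYZ,value=3}, {id=2,name=KJH,value=2}]",),
        ("[{id=4,name=ABC,value=7}, {id=8,name=HGS,value=9}]",),
    ],
    ["metrics"],
)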
The column is of string type, and I am trying to explode it using:
from pyspark.sql import functions as F
from pyspark.sql.types import ArrayType

# Infer the schema of a single array element by reading the strings as JSON
array_item_schema = spark.read.json(df.rdd.map(lambda row: row['metrics'])).schema
# Each row holds a list of objects, so wrap the element schema in an ArrayType
json_array_schema = ArrayType(array_item_schema, True)
# Parse the string column into an array of structs, then explode into one row per struct
arrays_df = df.select(F.from_json('metrics', json_array_schema).alias('json_arrays'))
objects_df = arrays_df.select(F.explode('json_arrays').alias('objects'))
However, I get null values back when I run:
objects_df.show()
The output I am looking for is one row per element of the 'metrics' arrays, with separate columns for id, name, and value, all in the same dataframe. I don't know where to start to decode this. Thanks for the help!
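Concretely, the result I am after would look something like this (values taken from the sample rows above):

+---+----+-----+
| id|name|value|
+---+----+-----+
|  1| XYZ|    3|
|  2| KJH|    2|
|  4| ABC|    7|
|  8| HGS|    9|
+---+----+-----+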