I have a dataframe containing a column 'metrics' whose rows look like this:

[{id=1,name=XYZ,value=3}, {id=2,name=KJH,value=2}]
[{id=4,name=ABC,value=7}, {id=8,name=HGS,value=9}]
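For reference, the shape of the data can be reproduced with something like this (the column name and values come from the sample above; the construction itself is just illustrative):

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Two rows, each holding one of the strings shown above
df = spark.createDataFrame(
    [
        ("[{id=1,name=XYZ,value=3}, {id=2,name=KJH,value=2}]",),
        ("[{id=4,name=ABC,value=7}, {id=8,name=HGS,value=9}]",),
    ],
    ["metrics"],
)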
The column is of string type, and I am trying to explode it using:
from pyspark.sql import functions as F
from pyspark.sql.types import ArrayType

# Infer the schema of a single array element by reading the strings as JSON
array_item_schema = spark.read.json(df.rdd.map(lambda row: row['metrics'])).schema
# Each row holds a list of objects, so wrap the element schema in an ArrayType
json_array_schema = ArrayType(array_item_schema, True)
# Parse the string column into an array of structs, then explode into one row per struct
arrays_df = df.select(F.from_json('metrics', json_array_schema).alias('json_arrays'))
objects_df = arrays_df.select(F.explode('json_arrays').alias('objects'))
However, I get null values back when I run:
objects_df.show()
The output I am looking for is one row per element of the 'metrics' arrays, with separate columns for id, name, and value, all in the same dataframe. I don't know where to start to decode this. Thanks for the help!
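Concretely, the result I am after would look something like this (values taken from the sample rows above):

+---+----+-----+
| id|name|value|
+---+----+-----+
|  1| XYZ|    3|
|  2| KJH|    2|
|  4| ABC|    7|
|  8| HGS|    9|
+---+----+-----+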