I have a dataframe containing a column like:

df['metrics'] = [{id=1,name=XYZ,value=3}, {id=2,name=KJH,value=2}] [{id=4,name=ABC,value=7}, {id=8,name=HGS,value=9}]

The column is of string type, and I am trying to explode it using:

    from pyspark.sql import functions as F
    from pyspark.sql.types import ArrayType

    array_item_schema = spark.read.json(df.rdd.map(lambda row: row['metrics'])).schema

    json_array_schema = ArrayType(array_item_schema, True)

    arrays_df = df.select(F.from_json('metrics', json_array_schema).alias('json_arrays'))

    objects_df = arrays_df.select(F.explode('json_arrays').alias('objects'))

However, I get null values back when I run:

    objects_df.show()

The output I am looking for is one row per element of the 'metrics' column, with separate columns named id, name, and value, in the same dataframe. I don't know where to start decoding it. Thanks for the help!

  • Check this answer - stackoverflow.com/a/74770833/8773309 Commented Dec 12, 2022 at 14:14
  • @MohanaBC the code shown here throws an 'invalid syntax' in pyspark... Commented Dec 12, 2022 at 16:57
  • That's Scala code; convert it to Python syntax. The method names are the same in PySpark and Spark Scala. Commented Dec 12, 2022 at 17:19
  • I have very little exposure to spark scala, and am lost here. Any help would be appreciated in converting that code! Commented Dec 12, 2022 at 17:56

1 Answer

You can use the schema_of_json function to infer the schema from a JSON string, then pass that schema to from_json to get a struct type.

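    from pyspark.sql.functions import from_json, schema_of_json
    # Infer the schema from the first row's string, then parse every row with it
    json_array_schema = schema_of_json(str(df.select("metrics").first()[0]))
    arrays_df = df.select(from_json('metrics', json_array_schema).alias('json_arrays'))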
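
One caveat, going only by the sample in the question: the strings use `key=value` pairs with unquoted names, which is not valid JSON, so from_json may well return null whatever schema it is given. If that is the case, the text has to be parsed some other way. Below is a minimal plain-Python sketch of such a parser. The helper name `parse_metrics` is hypothetical, and it assumes values never contain commas, braces, or `=` signs.

```python
import re

# Matches one "{key=value,key=value,...}" group.
# Assumption: values contain no braces, so groups are never nested.
OBJECT_RE = re.compile(r"\{([^{}]*)\}")


def parse_metrics(text):
    """Turn '[{id=1,name=XYZ,value=3}, ...]' into a list of dicts.

    Hypothetical helper; all values are kept as strings.
    """
    rows = []
    for body in OBJECT_RE.findall(text or ""):
        row = {}
        for pair in body.split(","):
            key, sep, value = pair.partition("=")
            if sep:  # ignore fragments that have no '='
                row[key.strip()] = value.strip()
        rows.append(row)
    return rows


sample = "[{id=1,name=XYZ,value=3}, {id=2,name=KJH,value=2}]"
parsed = parse_metrics(sample)
print(parsed)
```

A function like this could be wrapped in a PySpark UDF that returns an array of structs, and the result passed to explode as in the question. That wiring is left out here because it depends on the column types you want for id and value.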