I have a file that contains a single line:
[[1],[2,3]]
I think this is valid JSON and I want to read it in Spark, so I tried:
df = spark.read.json('file:/home/spark/testSparkJson.json')
df.head()
Row(_corrupt_record=u'[[1],[2,3]]')
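For reference, Python's standard json module parses the same content without complaint, which is why I believe the file itself is valid JSON (a minimal check, assuming the same file path as above):

import json

# sanity check: the standard library parses the single line of the file just fine
with open('/home/spark/testSparkJson.json') as f:
    data = json.load(f)

print(data)  # [[1], [2, 3]]

So the problem seems to be in how spark.read.json interprets the file, not in the JSON itself.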
It seems that Spark failed to parse this file. I want Spark to read it as an array of arrays of longs in a single column, so that I can get:
df.head()
Row(sequence=[[1], [2, 3]])
df.printSchema()
root
 |-- sequence: array (nullable = true)
 |    |-- element: array (containsNull = true)
 |    |    |-- element: long (containsNull = true)
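To be explicit about the target type, I believe the schema I am after would be written like this with pyspark.sql.types (this is just the desired type spelled out, not a working way to read the file):

from pyspark.sql.types import StructType, StructField, ArrayType, LongType

# one column "sequence" holding an array of arrays of longs
schema = StructType([
    StructField("sequence", ArrayType(ArrayType(LongType())), True)
])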
How can I do this?
I'm using PySpark on Spark 2.1.0; solutions based on other languages or earlier versions are also welcome.