
I am running a Jupyter notebook on a Spark cluster (with YARN). I use the "findspark" package to set up the notebook, and it works fine (I connect to the cluster master through an SSH tunnel). When I write a "self-contained" notebook, everything works; e.g. the following code runs with no problem:

import findspark
findspark.init()

import pyspark

sc = pyspark.SparkContext(appName='myApp')
a = sc.range(1000,numSlices=10)
a.take(10)
sc.stop()

The Spark job is correctly distributed across the workers. However, when I want to use a Python package that I wrote, its files are missing on the workers.

When I don't use the Jupyter notebook and instead run spark-submit --master yarn --py-files myPackageSrcFiles.zip, the Spark job works fine; e.g. the following code runs correctly:

main.py

import pyspark
from myPackage import myFunc

sc = pyspark.SparkContext(appName='myApp')
a = sc.range(1000,numSlices=10)
b = a.map(lambda x: myFunc(x)) 
b.take(10)
sc.stop()

Then

spark-submit --master yarn --py-files myPackageSrcFiles.zip main.py

The question is: how do I run main.py from a Jupyter notebook? I tried specifying the .zip package in the SparkContext with the pyfiles keyword, but I got an error...

1 Answer


I tried specifying the .zip package in the SparkContext with the pyfiles keyword but I got an error

The keyword argument is camel case (pyFiles):

sc = pyspark.SparkContext(appName='myApp', pyFiles=["myPackageSrcFiles.zip"])

Or you can use addPyFile:

sc.addPyFile("myPackageSrcFiles.zip")
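Putting the two together, a minimal notebook cell might look like the sketch below (untested here; it reuses myPackage, myFunc and myPackageSrcFiles.zip from the question and assumes the zip sits in the notebook's working directory):

import findspark
findspark.init()

import pyspark

# ship the package zip to the executors (addPyFile is the equivalent after creation)
sc = pyspark.SparkContext(appName='myApp', pyFiles=['myPackageSrcFiles.zip'])
# sc.addPyFile('myPackageSrcFiles.zip')

from myPackage import myFunc  # the zip is also added to the driver's sys.path

a = sc.range(1000, numSlices=10)
b = a.map(lambda x: myFunc(x))
b.take(10)
sc.stop()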

1 Comment

Is it possible to do this with spark session builder?
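For the comment above, a hedged sketch of the SparkSession route (untested; the spark.submit.pyFiles config key may only be honored when set before the context is launched, so calling addPyFile on the session's underlying SparkContext is the more reliable option):

from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName('myApp')
         .config('spark.submit.pyFiles', 'myPackageSrcFiles.zip')  # may not take effect for an in-process context
         .getOrCreate())

# reliable fallback once the session exists
spark.sparkContext.addPyFile('myPackageSrcFiles.zip')

from myPackage import myFunc

rdd = spark.sparkContext.range(1000, numSlices=10).map(lambda x: myFunc(x))
rdd.take(10)
spark.stop()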
