
I am running a Jupyter notebook public server, set up following this tutorial: http://jupyter-notebook.readthedocs.io/en/stable/public_server.html

I want to use pyspark-2.2.1 with this server. I pip-installed py4j and downloaded spark-2.2.1 from the repository.

Locally, I added the following lines to my .bashrc:

export SPARK_HOME='/home/ubuntu/spark-2.2.1-bin-hadoop2.7'  
export PATH=$SPARK_HOME:$PATH  
export PYTHONPATH=$SPARK_HOME/python:$PYTHONPATH

and everything works fine when I run Python locally.

However, when using the notebook server, I cannot import pyspark, because the above commands are not executed when the Jupyter notebook starts up.
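
For illustration, a quick check from a notebook cell (roughly like this) shows that the variables set in .bashrc are not visible to the server process:

import os
print(os.environ.get('SPARK_HOME'))   # None on the notebook server
print(os.environ.get('PYTHONPATH'))   # the $SPARK_HOME/python entry is missing here too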

I partly (and inelegantly) solved the issue by typing

import sys
sys.path.append("/home/ubuntu/spark-2.2.1-bin-hadoop2.7/python")

in the first cell of my notebook. But

from pyspark import SparkContext
sc = SparkContext()
myrdd = sc.textFile('exemple.txt')
myrdd.collect()  # Everything works fine until here
words = myrdd.map(lambda x:x.split())
words.collect()

returns the error

Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.collectAndServe.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost, executor driver): java.io.IOException: Cannot run program "python": error=2, No such file or directory

Any idea how I can set the correct paths (either manually or at startup)?
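
For example, I wonder whether something like the following in the first cell would be the right way to do it. This is only a sketch of what I have in mind; the PYSPARK_PYTHON line and the /usr/bin/python3 path are just guesses on my part about what the worker processes might need:

import os
os.environ['SPARK_HOME'] = '/home/ubuntu/spark-2.2.1-bin-hadoop2.7'
os.environ['PYSPARK_PYTHON'] = '/usr/bin/python3'  # guess: maybe the executors need an explicit Python executable?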

Thanks
