From the course: Big Data Analytics with Hadoop and Apache Spark

The combined power of Spark and Hadoop Distributed File System (HDFS)

- [Instructor] Using a combination of Hadoop for storage and Spark for compute provides unparalleled scalability and performance for analytics pipelines. To get these benefits, it's important to understand how Hadoop and Spark work with each other and to use the levers available. In this course, we will focus only on using Hadoop and Spark together. We will use PySpark and Jupyter Notebooks for the examples. My name is Kumaran Ponnambalam. In this course, I will show you how to build scalable, high-performance analytics pipelines. Let's explore how to maximize the combined power of Hadoop and Spark.
