From the course: Big Data Analytics with Hadoop and Apache Spark
The combined power of Spark and Hadoop Distributed File System (HDFS)
- [Instructor] Using a combination of Hadoop for storage and Spark for compute provides unparalleled scalability and performance for analytics pipelines. To get these benefits, it's important to understand how Hadoop and Spark work with each other and to use the levers available. In this course, we will focus only on using Hadoop and Spark together. We will use PySpark and Jupyter Notebooks for the examples. My name is Kumaran Ponnambalam. In this course, I will show you how to build scalable, high-performance analytics pipelines. Let's explore how to maximize the combined power of Hadoop and Spark.