From the course: Microsoft Fabric Analytics Engineer Associate (DP-600) Cert Prep by Microsoft Press
Lab: Prepare the data using PySpark - Microsoft Fabric Tutorial
- [Instructor] Now that we have discussed PySpark in detail, let's move to our Fabric environment for some hands-on work. Before I start, let me walk you through the GitHub link that we will use to download the data for this lab. So this is the link that you can go to. Once the data is ready, we will open the notebook and create a markdown cell. This is the text that we will write in the markdown cell. Once the markdown cell is created, we will run some code. The first piece of code loads the Quarter1.csv data. Once the data is loaded, we will use this code to add headers such as registration number, quarter, brand, name, address, and telephone number to the data columns. Once the headers are added, we will run a few more queries and look at their performance. Once we have seen how PySpark code…
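The load-and-add-headers steps described above might look roughly like the sketch below. It is not the lab's actual code, which is not shown in the transcript. The file location `Files/Quarter1.csv`, the snake_case column identifiers, and the helper name `load_quarter` are all assumptions made for illustration; only the file name and the header labels come from the transcript. The sketch relies on the `spark` session that a Fabric notebook normally provides.

```python
# Hypothetical column identifiers, derived from the header labels
# mentioned in the transcript; the lab's exact spelling is not known.
COLUMN_NAMES = [
    "registration_number",
    "quarter",
    "brand",
    "name",
    "address",
    "telephone_number",
]


def load_quarter(spark, path):
    """Read a header-less CSV file and assign column names to it.

    `spark` is an existing SparkSession (a Fabric notebook usually
    exposes one as `spark`); `path` is an assumed lakehouse file path.
    """
    # header=False: treat the first row as data, not as column names.
    # inferSchema=True: let Spark guess column types from the data.
    raw = spark.read.csv(path, header=False, inferSchema=True)
    # toDF renames the default columns (_c0, _c1, ...) in order.
    return raw.toDF(*COLUMN_NAMES)


# Example use inside a notebook cell (path is an assumption):
# df = load_quarter(spark, "Files/Quarter1.csv")
# df.show(5)
```

Passing an explicit schema to the reader instead of using `inferSchema` would avoid an extra pass over the file, which may matter when comparing query performance later in the lab.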
Contents
- Introduction to lakehouse architecture in Microsoft Fabric (7m 3s)
- Store and manage semi-structured data in lakehouses (1m 26s)
- Work with Delta Lake tables for efficient data management (2m 28s)
- Use PySpark for data transformation and analysis (3m 31s)
- Lab: Prepare the data using PySpark (17m 15s)