From the course: Microsoft Fabric Analytics Engineer Associate (DP-600) Cert Prep by Microsoft Press

Lab: Prepare the data using PySpark

- [Instructor] Now that we have had a detailed discussion about PySpark, let's move on to our Fabric environment and get some hands-on practice. Before I start, let me give you a walkthrough of the GitHub link that we will use to download the data for this lab. So this is the link that you can go to. Once the data is ready, we will first open the notebook and create a markdown cell. This is the text that we will write in the markdown cell. Once the markdown cell is created, we will run some code. The first code that we will run loads the Quarter1.csv data. Once the data is loaded, we will use this code to add headers like registration number, quarter, brand, name, address, and telephone number to the data columns. Once the headers are added to the columns, we will go ahead and run a few more queries and see how they perform. Once we have seen how PySpark code…
