From the course: Microsoft Azure Data Scientist Associate (DP-100) Cert Prep

Load and transform data

- [Instructor] Here's a diagram of a typical operation in the cloud for loading and transforming data. You can see on the left here, we have a cloud database. We have cloud storage, and maybe other data sources as well. You load that data into a cluster and then you do some kind of operation on that cluster. So in record set A, we have columns one, two, and three. In record set B, we have columns four, five, and six. I could combine those columns so that I could operate on them using a tool like, for example, Jupyter Notebook, if I wanted to do exploratory data analysis or machine learning. And then finally, once I've got that operation done, I could put the artifact of the operation, let's say a model or a new CSV file, into a new location. So this is a process that is repeatable and works with multiple types of data sources, including storage, databases, and potentially a managed service like Databricks. All of them are part of the load and transform data operation.
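
The load, combine, and store steps described here can be sketched in a few lines of Python. This is only an illustrative sketch, not the course's own code: the two in-memory CSV strings stand in for the cloud database and cloud storage, the column names are made up, and the shared `id` key used to line up the two record sets is an assumption that the transcript does not state.

```python
import csv
import io

# Hypothetical stand-ins for the two sources. In practice these would be
# read from a cloud database and from cloud storage.
RECORD_SET_A = "id,col1,col2,col3\n1,a,b,c\n2,d,e,f\n"
RECORD_SET_B = "id,col4,col5,col6\n1,10,20,30\n2,40,50,60\n"


def load(text):
    """Load: parse CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def combine(rows_a, rows_b, key="id"):
    """Transform: join the two record sets on a shared key column.

    The shared key is an assumption made for this sketch.
    """
    b_by_key = {row[key]: row for row in rows_b}
    combined = []
    for row in rows_a:
        match = b_by_key.get(row[key])
        if match is not None:
            # Merge the columns of both record sets into one row.
            combined.append({**row, **match})
    return combined


def save(rows, handle):
    """Store: write the combined rows out as a new CSV artifact."""
    writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)


combined = combine(load(RECORD_SET_A), load(RECORD_SET_B))

# An in-memory buffer stands in for the new storage location.
output = io.StringIO()
save(combined, output)
print(output.getvalue())
```

In a real workflow the same three stages would likely run on a cluster with a dataframe library rather than the standard library, but the shape of the process stays the same: read from each source, combine the columns, and write the result to a new location.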
