From the course: Advanced NoSQL for Data Science
Tips for using document databases for data science - NoSQL Tutorial
Tips for using document databases for data science
- [Instructor] As we conclude our discussion of document databases, I want to share a few tips for working with these systems in data science projects. Embedded documents and hierarchical structures are quite useful for denormalizing data. They keep related information together in a logical structure, which helps when you are trying to understand the meaning of the data within a document. We use embedded documents to avoid having to join data when we query collections. Unfortunately, embedded documents are not a good fit for the tabular structures commonly used in machine learning and statistics. For this reason, it's helpful to flatten your structures when you load data for analysis. Feel free to add new features to your collections. We can sometimes improve the quality of our machine learning models through feature engineering. For example, we can create features based on a combination of attributes, such as age and location. Another common practice is normalizing numeric values in a…
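The flattening and feature engineering tips above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the instructor's code: the `customers` documents, the field names, and the `flatten` and `age_band` helpers are all hypothetical, and plain dictionaries stand in for documents read from a collection.

```python
def flatten(doc, parent_key="", sep="."):
    """Flatten nested dicts into one level, joining keys with `sep`."""
    flat = {}
    for key, value in doc.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse into the embedded document and merge its fields.
            flat.update(flatten(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


def age_band(age):
    """Hypothetical bucketing rule, chosen only for illustration."""
    return "under_30" if age < 30 else "30_plus"


# Hypothetical documents with an embedded "address" document.
customers = [
    {"name": "Ana", "age": 34, "address": {"city": "Lisbon", "country": "PT"}},
    {"name": "Ben", "age": 27, "address": {"city": "Leeds", "country": "GB"}},
]

# One flat row per document, ready for a tabular structure.
rows = [flatten(c) for c in customers]

# Engineered feature combining age and location.
for row in rows:
    row["age_band_city"] = f"{age_band(row['age'])}_{row['address.city']}"
```

Each entry in `rows` is now a single-level dictionary with dotted keys such as `address.city`, so the list can be passed directly to a data frame constructor. If pandas is available, `pandas.json_normalize(customers)` performs a similar flattening and also uses dotted column names by default.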
Contents
- Document data models (1m 35s)
- JSON structures (1m 53s)
- Prepare data with document databases (3m 43s)
- Install Anaconda (1m 34s)
- Install MongoDB (2m 38s)
- Working with Jupyter (2m 43s)
- Explore data with document databases (5m 4s)
- Extract data with document databases (5m 50s)
- Perform quality checks (5m 43s)
- Index data with document databases (2m 20s)
- Data frames in MongoDB (4m 48s)
- Tips for using document databases for data science (2m 6s)