From the course: Advanced NoSQL for Data Science

Tips for using document databases for data science

- [Instructor] As we conclude our discussion of document databases, I just want to share a few tips for working with these systems in data science projects. Embedded documents and hierarchical structures are quite useful for denormalizing data. They keep related information together in a logical structure, which helps when you're trying to understand the meaning of the data within a document. We also use embedded documents to avoid having to join data when we query collections. Unfortunately, embedded documents are not a good fit for the tabular structures commonly used in machine learning and statistics. For this reason, it's helpful to flatten your structures when you're loading data for analysis. Feel free to add new features to your collections. We can sometimes improve the quality of our machine learning models through feature engineering. For example, we can create features based on a combination of attributes, such as age and location. Another common practice is normalizing numeric values in a…
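To make the flattening and feature engineering tips concrete, here is a minimal sketch in plain Python. The sample documents, the field names (`age`, `address.country`), the `flatten` and `age_band` helpers, and the age cutoff of 40 are all assumptions made up for illustration; they do not come from the course.

```python
def flatten(doc, parent_key="", sep="."):
    """Turn a nested document into a single-level dict with dotted keys."""
    flat = {}
    for key, value in doc.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse into embedded documents.
            flat.update(flatten(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


def age_band(age):
    """Hypothetical bucketing rule; the cutoff of 40 is arbitrary."""
    return "under_40" if age < 40 else "40_plus"


# Hypothetical documents, shaped like what a document database might return.
docs = [
    {"_id": 1, "name": "Ana", "age": 34,
     "address": {"city": "Lisbon", "country": "PT"}},
    {"_id": 2, "name": "Ben", "age": 58,
     "address": {"city": "Austin", "country": "US"}},
]

# Flatten each document into one tabular row.
rows = [flatten(d) for d in docs]

# Engineer a new feature that combines age and location.
for row in rows:
    row["age_band_country"] = f"{age_band(row['age'])}_{row['address.country']}"
```

Each entry in `rows` is now a flat dictionary with the same set of keys, so the list can be loaded into a tabular structure for analysis. This sketch only handles embedded documents; arrays inside a document would need their own treatment, such as one row per array element.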
