I have a Python script that inserts a CSV file into a MongoDB collection:

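import pymongo
import pandas as pd
import json

# Connect to the local MongoDB server
client = pymongo.MongoClient("mongodb://localhost:27017")

# Load the CSV into a DataFrame
df = pd.read_csv("iris.csv")

# Convert each row to a dict (one document per row)
data = df.to_dict(orient="records")

db = client["Database name"]

db.CollectionName.insert_many(data)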

This inserts every column of the CSV file into the Mongo collection. How can I insert only specific columns of the CSV file instead?

What changes should I make to the existing code?

Also, suppose the database already exists in my MongoDB instance. Will db = client["Database name"] still work in that case?

  • In Pandas, it's easy to create a new dataframe while selecting the columns you want from another dataframe ... or drop columns from a dataframe. Would that satisfy your needs? Commented May 10, 2022 at 6:04
  • FYI, the rust app xsv is awesome at processing, selecting, formatting, etc., CSV files - and it's fast! Commented May 12, 2022 at 16:11

1 Answer

Have you checked out pymongoarrow? The latest release has write support, so you can import a CSV file into MongoDB. Here are the release notes and documentation. You can also use mongoimport to import a CSV file (documentation is here), but I can't see any way to exclude fields the way you can with pymongoarrow.
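
For the plain pandas route suggested in the comments, a minimal sketch might look like the following. It assumes the column names you want are known up front; the helper names load_selected_columns and insert_selected_columns, and the iris column names in the usage comment, are hypothetical and not part of the original script. The helpers accept any pymongo collection, so the connection code from the question can stay as it is.

```python
import pandas as pd


def load_selected_columns(csv_source, columns):
    """Read only `columns` from a CSV and return one dict per row."""
    # usecols tells pandas to parse just the listed columns and skip the rest
    df = pd.read_csv(csv_source, usecols=columns)
    # orient="records" produces [{column: value, ...}, ...], the shape insert_many expects
    return df.to_dict(orient="records")


def insert_selected_columns(collection, csv_source, columns):
    """Insert only the chosen CSV columns into the given collection."""
    records = load_selected_columns(csv_source, columns)
    if records:  # pymongo's insert_many rejects an empty list
        collection.insert_many(records)
    return len(records)


# Usage (assumes a local MongoDB server and hypothetical iris.csv column names):
# import pymongo
# client = pymongo.MongoClient("mongodb://localhost:27017")
# db = client["Database name"]  # a handle to the existing database; nothing is overwritten
# insert_selected_columns(db.CollectionName, "iris.csv", ["sepal_length", "species"])
```

On the second question: client["Database name"] only returns a handle, so it works the same whether or not the database already exists. MongoDB creates a missing database lazily on the first write.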
