
I wrote a Python program that processes a large amount of text and puts the data into a multi-level dictionary. As a result, the dictionary gets really big (2 GB or more), eats up memory, and leads to slowness or a MemoryError.

So I'm looking to use sqlite3 instead of putting the data in a Python dict.

But come to think of it, the entire sqlite3 database will have to be accessible throughout the program's run. So in the end, wouldn't that lead to the same result, with memory eaten up by the large database?

Sorry, my understanding of memory is a little shaky. I want to make things clear before I go to the trouble of porting my program to use a database.

Thank you in advance.

5 Comments
  • You're the only one who could possibly know, since it depends on the inherent logic of your application. But in most cases, keeping just the index(es) in memory should suffice. Commented May 18, 2013 at 11:12
  • @DekDekku Is that handled automatically by the db module or do I need to specify to keep just the indexes? Commented May 18, 2013 at 11:14
  • No idea for SQLite; you should check the docs, but it should be automatic: that's the way it usually works. Indexes are useless if they're on disk, while data is usually just too big to be kept in memory in its entirety. Commented May 18, 2013 at 11:22
  • SQLite manages memory fine; it'll store data in temporary files as needed before committing. Commented May 18, 2013 at 11:28
  • The amount of memory used by Python to store small integers or strings is huge compared to the actual information content. Instead of moving to a DB, which will probably store data with less overhead, you could also try to do that in memory yourself, using e.g. the array module (see the sketch below these comments). Commented May 18, 2013 at 12:52
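The last comment's point about per-object overhead is easy to check. Below is a minimal sketch (not from the thread; exact numbers vary by platform and Python version) comparing a plain list of ints with the array module:

```python
import array
import sys

n = 1_000_000
as_list = list(range(n))
as_array = array.array("l", range(n))  # raw C signed longs, no per-item objects

# The list stores a pointer per element plus a full int object for each value;
# the array stores the machine words inline.
list_bytes = sys.getsizeof(as_list) + sum(sys.getsizeof(i) for i in as_list)
array_bytes = sys.getsizeof(as_array)

print(f"list : {list_bytes / 1e6:.1f} MB")
print(f"array: {array_bytes / 1e6:.1f} MB")
```

On a typical 64-bit CPython, the list side works out to several times the array's footprint, which is the overhead the comment is referring to.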

1 Answer


SQLite creates temporary files as needed to store data not yet committed to a database. It will do so even for in-memory databases.

As such, 2 GB of data will not eat up all your memory when stored in a SQLite database.
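As a minimal sketch of what such a port could look like, assuming a two-level dict of the form d[key1][key2] = value (the file name, table, and column names below are illustrative guesses, not taken from your program):

```python
import sqlite3

conn = sqlite3.connect("textstats.db")  # data lives in this file, not in RAM
conn.execute(
    "CREATE TABLE IF NOT EXISTS stats ("
    "key1 TEXT NOT NULL, key2 TEXT NOT NULL, value INTEGER, "
    "PRIMARY KEY (key1, key2))"
)

def put(key1, key2, value):
    # Roughly the equivalent of d.setdefault(key1, {})[key2] = value
    conn.execute(
        "INSERT OR REPLACE INTO stats (key1, key2, value) VALUES (?, ?, ?)",
        (key1, key2, value),
    )

def get(key1, key2):
    # Roughly the equivalent of d[key1][key2], returning None when missing
    row = conn.execute(
        "SELECT value FROM stats WHERE key1 = ? AND key2 = ?",
        (key1, key2),
    ).fetchone()
    return row[0] if row else None

put("spam", "eggs", 42)
print(get("spam", "eggs"))  # 42
conn.commit()
```

Because the data sits in a file on disk, memory use is bounded by SQLite's page cache rather than growing with the dataset, which is exactly the difference from the in-memory dict.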


2 Comments

  • @MartijnPieters thank you. Will it be okay for, say, tens of gigabytes though?
  • @CosmicRabbitMediaInc: As another comment remarked, Python objects have more overhead than data stored in C structures, so I'd expect data stored in SQLite to be more compact. SQLite can handle up to 140 terabytes of data.
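On the memory point in these comments: if you want an explicit ceiling on what SQLite keeps in RAM, its page cache is tunable per connection. A small sketch (the 64 MB figure is an arbitrary choice, not a recommendation):

```python
import sqlite3

conn = sqlite3.connect("textstats.db")  # same illustrative file as above
# A negative cache_size is interpreted as a limit in KiB rather than in
# pages, so this caps this connection's page cache at roughly 64 MB.
conn.execute("PRAGMA cache_size = -65536")
```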
