2

I have a program that counts word occurrences in a text file and stores them in an array. So far I'm using a fixed array and everything works fine but now I'd like to change that to a dynamic array so there is never any memory wasted/required. I understand that malloc and realloc must be used to accomplish this but I don't really understand how to go about doing it.

My first idea was to simply count the words in the text file then malloc enough space for all of them but this will leave wasted space as duplicate words will have a counter increased, but not be added to the array again.

Does this approach sound like it makes sense and would be the best way to go about accomplishing it? If I first malloc a small array just enough to find one word and its counter. Then each time I find a new word that needs adding to the array just realloc enough to fit another word and counter in. If it's a duplicate no realloc will be needed as an existing counter will just be incremented.

1
  • Another solution to your problem, which again has nothing to do with dynamically extending an array, is given in K&R's "The C Programming Language" (full implementation, explained in detail). They use a binary tree to record words and their frequency. This not only has an optimal memory use for the word counting problem, but is also efficient and is a very clean solution. Anyway, if you are programming in C reading this book will help you out a lot. Commented Apr 11, 2013 at 6:52

1 Answer 1

5

It's generally best (in terms of trading speed vs memory usage) to not aim for 100% memory utilization; especially if your program only runs for a limited amount of time, using a bit more memory than needed really doesn't "cost" a lot, overall.

One typical approach is to make the dynamic array have an initial size, say 8 or 128 or whatever, then double it whenever it fills up.

This reduces the number of re-allocations (which are costly) compared to just incrementing the size by 1 when it fills up. Of course, it wastes some memory.

Sign up to request clarification or add additional context in comments.

1 Comment

It's for a project we've been given and one of the aims is to "utilize efficient memory allocation". I have read about quite a few people recommending the doubling up approach. So before I "add a new word" I'd have to check if the array is full, if it is realloc if not carry on? It's an array of structs so I'm not really sure how to go about the memory allocation in the first place.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.