
I've been dabbling in Haskell - so still very much a beginner.

I've been thinking about counting the frequency of items in a list. In languages with mutable data structures, this is typically solved with a hash table - a dict in Python or a HashMap in Java, for example. The complexity of such a solution is O(n), assuming the hash table fits entirely in memory.

In Haskell, there seem to be two (mainstream) choices: sort the data, then group and count it, or use a Data.Map. If a sort is used, it dominates the run time of the solution, so the complexity is O(n log n). Likewise, Data.Map uses a balanced tree, so inserting n elements into it also has complexity O(n log n).
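For concreteness, here is a minimal sketch of both approaches (assuming only base and containers; freqSort and freqMap are names I've made up):

```haskell
import Data.List (group, sort)
import qualified Data.Map.Strict as Map

-- Sort, group equal neighbours, count each run: the sort is O(n log n).
freqSort :: Ord a => [a] -> [(a, Int)]
freqSort = map (\g -> (head g, length g)) . group . sort

-- Fold the list into a balanced tree: n inserts at O(log n) each.
freqMap :: Ord a => [a] -> Map.Map a Int
freqMap = Map.fromListWith (+) . map (\x -> (x, 1))
```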

If my analysis is correct, then I assume that this particular problem is most efficiently solved by resorting to a mutable data structure. Are there other types of problems where this is also true? How in general do people using Haskell approach something like this?
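For comparison, a mutable-hash-table version is possible in Haskell too; here is a rough sketch assuming the hashtables package (Data.HashTable.IO), at the cost of working in IO:

```haskell
import qualified Data.HashTable.IO as H

-- A mutable hash table in IO: one lookup and one insert per element.
freqHT :: [String] -> IO [(String, Int)]
freqHT xs = do
  ht <- H.new :: IO (H.BasicHashTable String Int)
  mapM_ (\x -> do
           mv <- H.lookup ht x
           H.insert ht x (maybe 1 (+ 1) mv))
        xs
  H.toList ht
```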

  • Your benchmark for efficiency is Python? Commented Feb 27, 2014 at 17:26
  • For all n, log n < 30. Commented Feb 27, 2014 at 17:32
  • See Can I always convert mutable-only algorithms to single-assignment and still be efficient? for a question from someone who was convinced immutability would be a problem and felt he'd found an example that would necessitate exponential time; I wrote a linear Haskell solution. Commented Feb 27, 2014 at 17:37
  • He means that unless you have more than 2^30 pieces of data (a billion), a factor of log n is essentially a constant. Commented Feb 27, 2014 at 17:59
  • (It's a humourous way of pointing out that constant factors can outweigh logarithmic factors for real problems.) Commented Feb 27, 2014 at 18:05

1 Answer


Whether we can implement any algorithm with optimal complexity in a pure language is currently an open question. Nicholas Pippenger has proven that there is a problem which must necessarily incur a log(n) penalty in a pure strict language compared to the optimal algorithm. However, there is a follow-up paper showing that this problem has an optimal solution in a lazy language. So at the end of the day we really don't know, though most people seem to think there is an inherent log(n) penalty for some problems, even in lazy languages.


5 Comments

I seem to recall that someone proved it's possible to implement any mutable data structure using laziness and immutability with at most a constant-factor penalty. You wouldn't happen to be familiar with this result? (I hope I'm not making it up)
I've also heard a similar argument, but I don't have a reference and I don't think I've ever seen a paper claiming such a result. But intuitively it seems that any imperative algorithm mutating random-access memory could be simulated with a state monad using a binary tree to model the memory (see the sketch after these comments). That ought to give the log(n) penalty.
Hmm. I suppose that's close enough for many purposes, thanks.
What I took from the comments on my post is that even if there is a log n penalty to be paid sometimes, I probably shouldn't spend too much time worrying about it.
Indeed. For most practical applications Haskell should be fast enough for you.
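
To make the simulation described in the comments concrete, here is a minimal sketch (assuming the containers and mtl packages; readCell, writeCell, and bump are hypothetical names of mine) that models "mutable" memory as a balanced tree inside a State monad, so every read or write pays the O(log n) tree cost:

```haskell
import qualified Data.Map.Strict as Map
import Control.Monad.State (State, gets, modify, execState)

-- Memory as a balanced tree from addresses to values.
type Mem = Map.Map Int Int

readCell :: Int -> State Mem Int
readCell addr = gets (Map.findWithDefault 0 addr)

writeCell :: Int -> Int -> State Mem ()
writeCell addr val = modify (Map.insert addr val)

-- An "imperative" loop: increment cell 0 n times.
bump :: Int -> State Mem ()
bump n = mapM_ (\_ -> readCell 0 >>= writeCell 0 . (+ 1)) [1 .. n]

-- execState (bump 5) Map.empty == Map.fromList [(0, 5)]
```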

