0

What is a good way of hashing a hierarchy (similar to a file structure) in python?

I could convert the whole hierarchy into a dotted string and then hash that, but is there a better (or more efficient) way of doing this without going back and forth all the time?

An example of a structure I might want to hash is:

a -> b1 -> c -> 1 -> d
a -> b2 -> c -> 2 -> d
a -> c -> 1 -> d
4
  • By hierarchy, do you mean a single file's list of path components, i.e. ["usr", "local", "test", "myfile"]? Commented Mar 17, 2009 at 13:15
  • Downmodding, question is very unclear and confusing. Commented Mar 17, 2009 at 13:44
  • added an example to make it a little clearer... Commented Mar 17, 2009 at 13:59
  • What does the "->" symbol mean in the example? What actual Python data structure contains the source data that creates your hierarchy? Commented Mar 17, 2009 at 14:52

3 Answers 3

8

If you have access to your hierarchy components as a tuple, just hash it - tuples are hashable. You may not gain a lot over conversion to and from a delimited string, but it's a start.

If this doesn't help, perhaps you could provide more information about how you store the hierarchy/path information.

Sign up to request clarification or add additional context in comments.

1 Comment

+1. python isn't javascript, dictionary keys can be more than just strings. unfortunately, it's not Lua either, where keys can be any value
4

How do you want to access your hierarchy?

If you're always going to be checking for a full path, then as suggested, use a tuple: eg:

>>> d["a","b1","c",1,"d"] = value

However, if you're going to be doing things like "quickly find all the items below "a -> b1", it may make more sense to store it as a nested hashtable (otherwise you must iterate through all items to find those you're intereted in).

For this, a defaultdict is probably the easiest way to store. For example:

from collections import defaultdict

def new_dict(): return defaultdict(new_dict)
d = defaultdict(new_dict)

d["a"]["b1"]["c"][1]["d"] = "test"
d["a"]["b2"]["c"][2]["d"] = "test2"
d["a"]["c"][1]["d"] = "test3"

print d["a"]["c"][1]["d"]  # Prints test3
print d["a"].keys()        # Prints ["c", "b1", "b2"]

Comments

1

You can make any object hashable by implementing the __hash__() method

So you can simply add a suitable __hash__() method to the objects storing your hierarchy, e.g. compute the hash recursively, etc.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.