Is there a message digest algorithm whose digests support set operations, so that the result of applying a set operation to a digest still makes sense? In other words, is there a hash function that preserves the concept of a "set" across hashing?
I'm looking for a hash function that:
- hashes a set of data into a fixed-length (or bounded-length) string
- produces an identical hash whenever the input data set is the same
- commutes with taking subsets: hashing a subset of the raw data directly gives the same result as applying the corresponding subset operation to the hash of the original data set, i.e. both routes yield the same subset hash.
As an example, in the following picture set A has several data points (red diamonds), and B is a subset of A. Is there a hash function such that:
data in A ---- hash function ----> _hashA ---- set operation ----> _hashB
data in B ---- hash function ----> _hashB
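
To make the property in the diagram concrete, here is a minimal Python sketch of one toy candidate: XOR-ing the SHA-256 digests of the individual elements. This is only an illustration under stated assumptions, not a recommended construction. The names `element_hash`, `set_hash` and `remove_from_hash` are hypothetical, the "set operation" here needs the removed elements (A minus B) in addition to `_hashA`, and a plain XOR combination should not be assumed to be collision resistant against an adversary.

```python
import hashlib


def element_hash(item: bytes) -> int:
    """Hash one element to a 256-bit integer using SHA-256."""
    return int.from_bytes(hashlib.sha256(item).digest(), "big")


def set_hash(items) -> int:
    """Order-independent, fixed-length digest: XOR of the element hashes.

    Duplicates are collapsed first so the input is treated as a set.
    """
    acc = 0
    for item in set(items):
        acc ^= element_hash(item)
    return acc


def remove_from_hash(digest: int, removed) -> int:
    """Derive the digest of a subset by XOR-ing out the removed elements.

    Assumption: every element in `removed` was part of the hashed set.
    """
    return digest ^ set_hash(removed)


# Hypothetical data points standing in for the picture's set A and subset B.
A = {b"p1", b"p2", b"p3", b"p4"}
B = {b"p1", b"p3"}

hash_a = set_hash(A)
hash_b_direct = set_hash(B)                       # data in B -> _hashB
hash_b_derived = remove_from_hash(hash_a, A - B)  # _hashA -> _hashB

assert hash_b_direct == hash_b_derived
```

Both routes give the same value because XOR is commutative and each element hash cancels itself out. Whether a scheme exists where the subset can be derived from `_hashA` alone, without access to the removed elements, is the open part of the question.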
