0

I have two numpy arrays a and b. I have a definition that construct an array c whose elements are all the possible sums of different elements of a.

import numpy as np

def Sumarray(a):
    n = len(a)

    sumarray = np.array([0]) # Add a default zero element
    for k in range(2,n+1):
        full = np.mgrid[k*(slice(n),)]
        nd_triu_idx = full[:,(np.diff(full,axis=0)>0).all(axis=0)]
        sumarray = np.append(sumarray, a[nd_triu_idx].sum(axis=0))

    return sumarray

a = np.array([1,2,6,8])
c = Sumarray(a)
print(d)

I then perform a subsetsum between an element of c and b: isSubsetSum returns the elements of b that when summed gives c[1]. Let's say that I get

c[0] = b[2] + b[3]

Then I want to remove:

  1. the elements b[2], b[3] (easy bit), and
  2. the elements of a that when summed gave c[0]

As you can see from the definition, Sumarray, the order of sums of different elements of a are preserved, so I need to realise some mapping.

The function isSubsetSum is given by

def _isSubsetSum(numbers, n, x, indices):
    if (x == 0):
        return True
    if (n == 0 and x != 0):
        return False
    # If last element is greater than x, then ignore it
    if (numbers[n - 1] > x):
        return _isSubsetSum(numbers, n - 1, x, indices)
    # else, check if x can be obtained by any of the following
    found = _isSubsetSum(numbers, n - 1, x, indices)
    if found: return True
    indices.insert(0, n - 1)
    found = _isSubsetSum(numbers, n - 1, x - numbers[n - 1], indices)
    if not found: indices.pop(0)
    return found

def isSubsetSum(numbers, x):
    indices = []
    found = _isSubsetSum(numbers, len(numbers), x, indices)
    return indices if found else None
3
  • This would actually be much easier using the order from my last comment here stackoverflow.com/q/58404635/7207392 then np.unpackbits(<the_index>.view('u1'),count=len(a),bitorder='little') would give you a mask that is 1 at the positions in a that went into the sum. Commented Oct 16, 2019 at 21:53
  • @PaulPanzer, could you explain that a little more. I don't still get the mapping Commented Oct 16, 2019 at 21:55
  • @PaulPanzer, Could you write it as a solution, since i've been trying to piece together your suggestions from your other post and getting this. Commented Oct 16, 2019 at 22:12

1 Answer 1

1

As you are iterating over all possible numbers of terms, you could as well directly generate all possible subsets.

These can be conveniently encoded as numbers 0,1,2,... by means of their binary representations: O means no terms at all, 1 means only the first term, 2 means only the second, 3 means the first and the second and so on.

Using this scheme it becomes very easy to recover the terms from the sum index because all we need to do is obtain the binary representation:

UPDATE: we can suppress 1-term-sums with a small amount of extra code:

import numpy as np

def find_all_subsums(a,drop_singletons=False):
    n = len(a)
    assert n<=32 # this gives 4G subsets, and we have to cut somewhere
    # compute the smallest integer type with enough bits
    dt = f"<u{1<<((n-1)>>3).bit_length()}"
    # the numbers 0 to 2^n encode all possible subsets of an n
    # element set by means of their binary representation
    # each bit corresponds to one element number k represents the 
    # subset consisting of all elements whose bit is set in k
    rng = np.arange(1<<n,dtype=dt)
    if drop_singletons:
        # one element subsets correspond to powers of two
        rng = np.delete(rng,1<<np.arange(n))
    # np.unpackbits transforms bytes to their binary representation
    # given the a bitvector b we can compute the corresponding subsum
    # as b dot a, to do it in bulk we can mutliply the matrix of 
    # binary rows with a
    return np.unpackbits(rng[...,None].view('u1'),
                         axis=1,count=n,bitorder='little') @ a

def show_terms(a,idx,drop_singletons=False):
    n = len(a)
    if drop_singletons:
        # we must undo the dropping of powers of two to get an index
        # that is easy to translate. One can check that the following
        # formula does the trick
        idx += (idx+idx.bit_length()).bit_length()
        # now we can simply use the binary representation
    return a[np.unpackbits(np.asarray(idx,dtype='<u8')[None].view('u1'),
                           count=n,bitorder='little').view('?')]


example = np.logspace(1,7,7,base=3)
ss = find_all_subsums(example,True)
# check every single sum
for i,s in enumerate(ss):
    assert show_terms(example,i,True).sum() == s
# print one example
idx = 77
print(ss[idx],"="," + ".join(show_terms(example.astype('U'),idx,True)))

Sample run:

2457.0 = 27.0 + 243.0 + 2187.0
Sign up to request clarification or add additional context in comments.

9 Comments

@SId Ah, sorry, I missed that you are starting from 2, not 1. But you do include 0. If it's a big problem it can be fixed but it makes the code more complicated. idx is just a random example index. I do return the elements as an array, the printing happens outside the function.
@Sid, pls check out update. That's all my programming for today.
why doesn't something like show_terms(a,c[1],True) work? I get the AttributeError: 'numpy.int64' object has no attribute 'bit_length'
@SId Could you try c.item(1) instead of c[1]?
@Sid there is a subtle difference between Python ints and numpy ints. Sometimes like here it breaks things.
|

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.