0

I have two big numpy arrays or pandas dataframes, e.g.:

a=[[1, 10, 20, 30],[2, 50, 14, -10],[3, 11, 2, 0], ...] 

b=[[10, 40, 30, 1, 1, 2],[0, 11, -1, 32, 3, 2],[9, 2, 51, -2, 3, 2], ...]

I want to replace the last two columns of the matrix b with values from a. Each value in the last two columns of b is a counter: where it is 1, it should be replaced by the row of a whose first column is 1 (without that first column), and likewise for 2, 3, and so on. The first column of a is a counter running from 1 to the end. As a result, the number of columns of b grows from 6 to 10.

So, the new b will be something like:

b=[[10, 40, 30, 1, 10, 20, 30, 50, 14, -10],[0, 11, -1, 32, 11, 2, 0, 50, 14, -10],[9, 2, 51, -2, 11, 2, 0, 50, 14, -10], ...]

I appreciate any solution to handle this request with the data either as numpy arrays or pandas.

3
  • Could you phrase your question better? I cannot understand what you are trying to do... Commented Sep 11, 2020 at 13:42
  • In fact, I want to replace the last two columns of a matrix (b) with values of another matrix (a), based on the values existing in b. For example, when we see 1 in one of the last two columns, the algorithm should replace that 1 with the values existing in the first row of matrix a. These two columns of matrix b represent counters, and the attributes of these counters are stored in matrix a. Now I want to replace each counter number with the attributes of that number. Commented Sep 11, 2020 at 13:53
  • The replacement problem looks quite complicated to me at the moment. It would be much easier if the last two columns of b were indices of the 1 × 3 blocks they should be replaced with. Commented Sep 11, 2020 at 13:57

2 Answers

1

Assuming the first column of a is of the form [1, 2, 3, ...] and the last two columns of b hold integers, it can be done with this one-liner:

np.c_[b[:,:-2], a[b[:,-2]-1, 1:], a[b[:,-1]-1, 1:]]

In fact, it is more convenient to replace a with a[:, 1:] first; the expression then simplifies to:

np.c_[b[:,:-2], a[b[:,-2]-1], a[b[:,-1]-1]]
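
As a quick check, here is a minimal runnable sketch of both variants on the first three rows from the question. The names `attrs`, `new_b` and `new_b2` are only illustrative, and the sketch assumes both inputs are integer NumPy arrays.

```python
import numpy as np

# Sample data from the question (first three rows only).
a = np.array([[1, 10, 20, 30],
              [2, 50, 14, -10],
              [3, 11, 2, 0]])
b = np.array([[10, 40, 30, 1, 1, 2],
              [0, 11, -1, 32, 3, 2],
              [9, 2, 51, -2, 3, 2]])

# Variant 1: keep the counter column in a and skip it with 1: when indexing.
new_b = np.c_[b[:, :-2], a[b[:, -2] - 1, 1:], a[b[:, -1] - 1, 1:]]

# Variant 2: strip the counter column first, then index whole rows.
attrs = a[:, 1:]
new_b2 = np.c_[b[:, :-2], attrs[b[:, -2] - 1], attrs[b[:, -1] - 1]]

print(new_b)
# [[ 10  40  30   1  10  20  30  50  14 -10]
#  [  0  11  -1  32  11   2   0  50  14 -10]
#  [  9   2  51  -2  11   2   0  50  14 -10]]
```

If b is stored as floats, the last two columns would need a cast such as `b[:, -2].astype(int)` before they can be used as indices.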

Here the last two columns of b are converted to row indices of a. If the first column of a is not [1, 2, 3, ...], subtracting one is not enough, and you need a different way to map the last two columns of b to row indices of a. I leave that out of scope.
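
One possible way to handle that case, not part of the answer above, is to sort the IDs once and look them up with `np.searchsorted`. The sketch below uses made-up data with IDs 30, 10, 20, and the helper name `rows_for` is hypothetical. It assumes every ID in b actually occurs in the first column of a.

```python
import numpy as np

# Hypothetical data: the ID column of a is neither 1, 2, 3, ... nor sorted.
a = np.array([[30, 11, 2, 0],
              [10, 10, 20, 30],
              [20, 50, 14, -10]])
b = np.array([[10, 40, 30, 1, 10, 20],
              [0, 11, -1, 32, 30, 20]])

ids = a[:, 0]
order = np.argsort(ids)       # row positions of a, ordered by ID
sorted_ids = ids[order]

def rows_for(keys):
    """Return the row index in a for each ID in keys (IDs must exist in a)."""
    pos = np.searchsorted(sorted_ids, keys)
    return order[pos]

attrs = a[:, 1:]
new_b = np.c_[b[:, :-2], attrs[rows_for(b[:, -2])], attrs[rows_for(b[:, -1])]]
print(new_b)
```

An ID that is missing from a would either raise an IndexError or silently match a neighbouring ID, so a check such as `np.isin(b[:, -2:], ids).all()` beforehand may be worthwhile.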


0

Two suggestions.

  1. If these are pandas dataframes, you can join the 'a' dataframe to the 'b' dataframe twice, on b.5 = a0.1 and on b.6 = a1.1 (a0 and a1 being the two joined copies of 'a'). Then read off the columns you need (b.1-4, a0.2-4, a1.2-4). Something like:

    import pandas as pd
    # '5' and '6' (in b) and '1' (in a) are placeholder column names; use your own
    new1 = pd.merge(b, a, left_on='5', right_on='1')
    new2 = pd.merge(new1, a, left_on='6', right_on='1')
    

Then drop columns 5 and 6 (and the key columns that came in from 'a').

  2. Otherwise, I would suggest turning 'a' into a different structure, such as a list of tuples or a dictionary. Your index is embedded as the first value of each row, so if you went the dictionary route you would aim for {1: [10, 20, 30], 2: [50, 14, -10], 3: [11, 2, 0], ...}, which makes the lookup easier.

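    # build the lookup: the first value of each row of a is the key
    lookup = {row[0]: list(row[1:]) for row in a}
    newlist = []
    for x in b:
        q = list(x[:4])          # copy the first four values of the row
        q.extend(lookup[x[4]])   # attributes for the first counter
        q.extend(lookup[x[5]])   # attributes for the second counter
        newlist.append(q)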
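
For the first suggestion (the pandas join), here is a minimal runnable sketch on the sample rows from the question. The column names (`id`, `x`, `y`, `z`, `b1` to `b4`, `id1`, `id2`) are invented for illustration, since the question does not give any. It uses `how="left"` so that every row of b is kept in its original order, and `add_suffix` so the two sets of attribute columns do not collide.

```python
import pandas as pd

# Hypothetical column names; the question does not specify any.
a = pd.DataFrame([[1, 10, 20, 30], [2, 50, 14, -10], [3, 11, 2, 0]],
                 columns=["id", "x", "y", "z"])
b = pd.DataFrame([[10, 40, 30, 1, 1, 2], [0, 11, -1, 32, 3, 2], [9, 2, 51, -2, 3, 2]],
                 columns=["b1", "b2", "b3", "b4", "id1", "id2"])

# One copy of a per counter column, with distinct column names.
first = a.add_suffix("_1")    # id_1, x_1, y_1, z_1
second = a.add_suffix("_2")   # id_2, x_2, y_2, z_2

merged = (
    b.merge(first, how="left", left_on="id1", right_on="id_1")
     .merge(second, how="left", left_on="id2", right_on="id_2")
)

# Drop the counters and the join keys, keeping only the attributes.
new_b = merged.drop(columns=["id1", "id2", "id_1", "id_2"])
print(new_b)
```

Note that the default inner merge returns no rows at all when the key columns share no values, for example when `left_on` and `right_on` name the wrong columns. A left merge shows NaN in the attribute columns instead, which can make such a mismatch easier to spot.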
    

2 Comments

Thanks for your hints. I really appreciate it. I tried the first solution, but it was giving me only an empty pandas dataframe with 17 columns.
I'd need to see the code. I don't know your column names, so you need to put them in the left_on and right_on fields.
