
I have the following numpy array:

array=[1,1,1,1,2,2,3,3,3,5,6,6,6,6,6,6,7]

I need to break this array into smaller arrays of identical values, such as

[1,1,1,1] and [3,3,3]

My code for this is as follows but it doesn't work:

def chunker(seq, size):
    return (seq[pos:pos + size] for pos in range(0, len(seq)-size))
counter=0
sub_arr=[]
arr=[]
for i in range(len(array)):
    if(array[i]==array[i+1]):
        counter+=1
    else:
        break
        subarr=chunker(array,counter)
    arr.append(sub_arr)
    array=array[counter:]

What is an efficient way to break down the array into smaller arrays of equal/same values?

  • Do you expect/care about arrays like [1,1,2,2,1,1]? Do you care about the order of the subarrays in the list? (Should it match the original order?) Commented Jul 10, 2018 at 7:37

3 Answers


A numpy solution for floats and integers:

import numpy as np
a = np.asarray([1,1,1,1,2,2,3,3,3,5,6,6,6,6,6,6,7])
#calculate differences between neighbouring elements and get index where element changes
#sample output for index would be [ 4  6  9 10 16]
index = np.where(np.diff(a) != 0)[0] + 1
#separate arrays
print(np.split(a, index))

Sample output:

[array([1, 1, 1, 1]), array([2, 2]), array([3, 3, 3]), array([5]), array([6, 6, 6, 6, 6, 6]), array([7])]

If you had strings, this method naturally wouldn't work, since np.diff needs numeric data. In that case you should go with DyZ's itertools approach.
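That said, a dtype-agnostic variant is possible (not part of the original answer): comparing neighbouring elements with `!=` instead of `np.diff` gives the same split points and also works for string arrays. A minimal sketch, using a made-up string array:

```python
import numpy as np

a = np.asarray(["a", "a", "b", "b", "b", "c"])
# True wherever the value changes relative to its left neighbour;
# +1 shifts the boolean positions to split indices
index = np.where(a[1:] != a[:-1])[0] + 1
parts = np.split(a, index)
print(parts)  # [array(['a', 'a'], ...), array(['b', 'b', 'b'], ...), array(['c'], ...)]
```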


NumPy has poor support for this kind of grouping. I suggest using itertools.groupby, which operates on any iterable.

import numpy as np
from itertools import groupby
[np.array(list(data)) for _, data in groupby(array)]
#[array([1, 1, 1, 1]), array([2, 2]), array([3, 3, 3]), \
# array([5]), array([6, 6, 6, 6, 6, 6]), array([7])]

This is not necessarily the most efficient method, because it involves conversions to and from lists.
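One property worth noting (not stated in the original answer): groupby groups *consecutive* runs, so input like [1,1,2,2,1,1] from the comments keeps its two separate runs of 1s and their original order. A quick sketch:

```python
import numpy as np
from itertools import groupby

arr = [1, 1, 2, 2, 1, 1]
# groupby starts a new group each time the value changes,
# so the second run of 1s stays separate from the first
runs = [np.array(list(g)) for _, g in groupby(arr)]
print(runs)  # [array([1, 1]), array([2, 2]), array([1, 1])]
```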


Here's an approach using Pandas:

import pandas as pd 

(pd.Series(array)
   .value_counts()
   .reset_index()
   .apply(lambda x: [x["index"]] * x[0], axis=1))  

Explanation:
First, convert array to a Series, and use value_counts() to get a count of each unique entry:

counts = pd.Series(array).value_counts().reset_index()
   index  0
0      6  6
1      1  4
2      3  3
3      2  2
4      7  1
5      5  1

Then recreate each repeated-element list, using apply():

counts.apply(lambda x: [x["index"]] * x[0], axis=1)

0    [6, 6, 6, 6, 6, 6]
1          [1, 1, 1, 1]
2             [3, 3, 3]
3                [2, 2]
4                   [7]
5                   [5]
dtype: object

You can use the .values property to convert from a Series of lists to a list of lists, if needed.

4 Comments

This works only if a range of values is not repeated (as in [1,1,2,2,1,1]). And it does not preserve the order.
@DyZ not sure I'm tracking - what do you mean it only works if a range of values isn't repeated? OP asked for "smaller arrays of same values". My approach correctly groups a list of 1s and a list of 2s in your example. OP also doesn't ask for order preserved as far as I can tell. Can you clarify?
When I apply your method to my example, I get two ranges: [1,1,1,1] and [2,2] - instead of three ranges [1,1], [2,2], and [1,1]. I am not sure how important it is for the OP, just making the point.
Ah ok - I interpreted OP's request as needing two ranges, not three, in your example. Thanks for clarifying!
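If the three-run interpretation from this comment thread is what's wanted, a possible order-preserving Pandas variant (a sketch, not the answer's method) labels each consecutive run via shift/cumsum and groups on that label:

```python
import pandas as pd

arr = [1, 1, 2, 2, 1, 1]
s = pd.Series(arr)
# run_id increments whenever the value differs from its predecessor,
# so each consecutive run gets its own group label
run_id = s.ne(s.shift()).cumsum()
runs = [grp.tolist() for _, grp in s.groupby(run_id)]
print(runs)  # [[1, 1], [2, 2], [1, 1]]
```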
