9

Let's say I have a class called Star which has an attribute color. I can get the color of a single instance with star.color.

But what if I have a NumPy array of these Star objects? What is the preferred way of getting an array of the colors?

I can do it with

colors = np.array([s.color for s in stars])

But is this the best way to do it? It would be great if I could just write colors = stars.color or colors = stars->color, as in some other languages. Is there an easy way of doing this in NumPy?

3 Answers

9

The closest thing to what you want is to use a recarray instead of an ndarray of Python objects:

import numpy

num_stars = 10
dtype = numpy.dtype([('x', float), ('y', float), ('colour', float)])
a = numpy.recarray(num_stars, dtype=dtype)
a.colour = numpy.arange(num_stars)
print(a.colour)

prints

[ 0.  1.  2.  3.  4.  5.  6.  7.  8.  9.]

A NumPy array of Python objects is usually less efficient than a plain list, whereas a recarray stores the field values directly in one contiguous buffer instead of as references to separate Python objects.
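If the Star objects already exist, one possible way to move them into a recarray is to build a structured array from one tuple per object and then view it as a recarray. This is only a sketch: the Star class below, its x, y and colour attributes, and the sample values are assumptions made for illustration, not taken from the question.

```python
import numpy as np

class Star:
    """Hypothetical stand-in for the asker's class (attributes are assumed)."""
    def __init__(self, x, y, colour):
        self.x = x
        self.y = y
        self.colour = colour

stars = [Star(0.0, 1.0, 0.5), Star(2.0, 3.0, 0.7)]

# One tuple per object, with fields in the same order as the dtype.
dtype = np.dtype([('x', float), ('y', float), ('colour', float)])
records = np.array([(s.x, s.y, s.colour) for s in stars], dtype=dtype)

# Viewing the structured array as a recarray enables attribute access.
rec = records.view(np.recarray)
colours = rec.colour   # all colours as one float array
print(colours)
```

The conversion still loops over the objects once in Python, so it mainly pays off when the fields are read or updated many times afterwards.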

6 Comments

Cool. So it makes them just like IDL arrays of structures, which is what I wanted. How do I use this if I already have a regular Python class defined? Is there a simple way to do that?
@Dave31415: IDL? So you are an astronomer, or is anybody outside astronomy really using this? As to your question: without seeing the class definition, this is a bit hard to answer. Using NumPy, you generally don't want "methods" operating on single records, but rather functions that can operate on the whole array at once. So you'd need to vectorise your methods.
Trying to be an ex-astronomer. So I guess what you are saying is that arrays of objects are not the preferred data structure for NumPy. But then what is? I can make classes whose attributes are NumPy arrays. Is that the better way? It doesn't sound like what I want.
@Dave31415: I'm confused now. What I said is that using a NumPy recarray is preferred over a NumPy array of Python objects, at least when using NumPy at all. If a plain list of Star instances does the job for you, you might as well go with a plain list. Again, it is hard to give advice without knowing more about your use case.
Well, to start, my classes might not have methods. They are basically just like "structures" in C or Python. If so, it seems that recarray would work well for this, right? In this case, do you even bother defining classes, or do you go directly to defining the dtype?
4

You could use numpy.fromiter((s.color for s in stars), float), which takes a generator expression rather than a list (note the lack of square brackets) plus the element dtype. That avoids creating the intermediate list, which I imagine you might care about if you are using NumPy.

(Thanks to @SvenMarnach and @DSM for their corrections below).
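A minimal sketch of the fromiter approach, including the dtype and count arguments suggested in the comments. The Star class and its color values here are hypothetical placeholders, and color is assumed to be a float.

```python
import numpy as np

class Star:
    """Hypothetical stand-in for the asker's class (float color assumed)."""
    def __init__(self, color):
        self.color = color

stars = [Star(0.1), Star(0.5), Star(0.9)]

# fromiter consumes the generator directly, so no intermediate list is built.
# dtype is required; count lets NumPy allocate the result in one step.
colors = np.fromiter((s.color for s in stars), dtype=float, count=len(stars))
print(colors)
```

Note that count only works when the length is known up front, as it is for a list; for an iterator of unknown length it can simply be omitted.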

3 Comments

Unfortunately that won't work: you'll get something like array(<generator object <genexpr> at 0x9cff34c>, dtype=object). (I once had a bug in my code that was ultimately due to the fact I thought this would work.)
You'd need to use numpy.fromiter() for this.
Note: to get that to work in recent NumPy versions, you need numpy.fromiter((s.color for s in stars), float). Also, adding count=len(stars) will make it more efficient for long arrays.
0

If star is a more complicated class, here is an approach that gets and sets its attributes through a helper class layered on top.

import numpy as np

class star:
    def __init__(self, mass=1, radius=1):
        self.mass = mass
        self.radius = radius

class Stars(list):

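    def __getattr__(self, attr):
        # Collect the named attribute from every element into one array.
        return np.array([getattr(s, attr) for s in self])

    def __setattr__(self, attr, vals):
        # A sized value is assigned element-wise; anything else goes to every element.
        if hasattr(vals, '__len__'):
            for s, val in zip(self, vals):
                setattr(s, attr, val)
        else:
            for s in self:
                setattr(s, attr, vals)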


s1 = star(1, 1.1)
s2 = star(2, 3)

S = Stars([s1, s2])

print(S.mass)
print(S.radius)

S.density = S.mass / S.radius**3
print(S.density)
print(s1.density)

Of course, if the class can be reimplemented as a recarray, that should be more efficient. Such a reimplementation may be undesirable, though.

Note that the outer computations, like the density calculation, are still vectorised, and those are often the bottleneck rather than getting and setting attributes.
