
Assume a pandas DataFrame, df, of size n x m.

I would like to perform linear algebra operations on df.

So far, I have been unable to find a way to perform linear algebra directly on df. What I was able to find is how to convert df from pandas format to NumPy using:

A = df.as_matrix()  # removed in pandas 1.0; on current versions use df.to_numpy()

Then I can simply do:

linalg.inv(A)

Is there a direct way to perform linear algebra operations in SciPy on a pandas DataFrame? For example:

linalg.inv(df)

I would like to use SciPy's linear algebra operations rather than NumPy's because of the following:

In any case, SciPy contains more fully-featured versions of the linear algebra modules, as well as many other numerical algorithms. If you are doing scientific computing with python, you should probably install both NumPy and SciPy. Most new features belong in SciPy rather than NumPy.

from What-is-the-difference-between-NumPy-and-SciPy

    Both pandas and SciPy are built on NumPy. Most SciPy code assumes inputs are arrays, or can be converted to such. SciPy's inv converts the input to a NumPy array (with np.asarray). If a DataFrame works in a SciPy function, it's because it can be converted to an array. Commented May 30, 2017 at 16:22
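The conversion described in this comment can be checked with a small sketch. The 2 x 2 matrix below is made up for illustration (it is not from the post), and the snippet assumes a reasonably recent pandas, NumPy and SciPy:

```python
import numpy as np
import pandas as pd
from scipy.linalg import inv

# Small, well-conditioned example matrix (illustrative values, not from the post).
df = pd.DataFrame([[4.0, 7.0], [2.0, 6.0]], columns=list("ab"))

# np.asarray turns the DataFrame into a plain ndarray, dropping the labels.
arr = np.asarray(df)
print(type(arr))

# Passing the DataFrame directly should give the same numbers as passing the array.
result = inv(df)
print(type(result))
print(np.allclose(result, inv(arr)))
```

Either way the return value is a plain NumPy array, so the row and column labels of df are not carried over.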

1 Answer


You can directly use it on your DataFrames.

Demo:

In [111]: from scipy.linalg import inv

In [112]: df = pd.DataFrame(np.random.rand(5,5), columns=list('abcde'))

In [113]: df
Out[113]:
          a         b         c         d         e
0  0.619086  0.229390  0.361611  0.857177  0.274983
1  0.389630  0.689562  0.687043  0.388781  0.781168
2  0.702920  0.253870  0.881173  0.858378  0.363035
3  0.007022  0.571111  0.408729  0.708862  0.042882
4  0.876747  0.170775  0.499824  0.929295  0.762971

In [114]: inv(df)
Out[114]:
array([[ 5.67652746,  1.54854922, -0.21927114, -3.04884324, -3.35567433],
       [ 4.32996215,  1.99787442, -1.18579234, -0.9802008 , -2.98677673],
       [-2.43833426, -0.29287732,  2.11691208,  0.34655505,  0.1519223 ],
       [-1.92398165, -1.43903773, -0.22722582,  1.96404685,  2.16451337],
       [-3.55144126, -0.28205091, -0.59264783,  1.10366465,  3.09938364]])

PS: I used pandas 0.19.2 and SciPy 0.18.1 for this demo.

UPDATE: if you want to get a DataFrame as a result:

In [4]: pd.DataFrame(inv(df), columns=df.columns, index=df.index)
Out[4]:
          a         b         c         d         e
0  5.676507  1.548541 -0.219275 -3.048828 -3.355657
1  4.329938  1.997865 -1.185791 -0.980187 -2.986760
2 -2.438323 -0.292872  2.116913  0.346547  0.151914
3 -1.923971 -1.439034 -0.227226  1.964040  2.164506
4 -3.551428 -0.282045 -0.592647  1.103655  3.099373
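If this wrapping is needed in more than one place, it could be packaged as a small function. The following is only a sketch: inverse_as_frame is a hypothetical helper name, the example matrix is made up, and to_numpy assumes pandas 0.24 or later (on older versions df.values would be the equivalent).

```python
import numpy as np
import pandas as pd
from scipy.linalg import inv


def inverse_as_frame(frame):
    """Hypothetical helper: invert a square numeric DataFrame and keep its labels."""
    if frame.shape[0] != frame.shape[1]:
        raise ValueError("only square DataFrames can be inverted")
    values = frame.to_numpy(dtype=float)  # assumes pandas >= 0.24
    # Same labelling as in the answer: reuse the original index and columns.
    return pd.DataFrame(inv(values), index=frame.index, columns=frame.columns)


# Illustrative values, not from the post.
df = pd.DataFrame([[4.0, 7.0], [2.0, 6.0]], columns=list("ab"))
df_inv = inverse_as_frame(df)

# Sanity check: A @ inv(A) should be close to the identity matrix.
identity_ok = np.allclose(df.to_numpy() @ df_inv.to_numpy(), np.eye(len(df)))
print(df_inv)
print(identity_ok)
```

The squareness check is there because inv raises on non-square input anyway; failing early with a clear message is just a design preference.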

2 Comments

Once I do inv(df) in your example, I get a NumPy array, correct?
@Eagle, I can't say whether it's correct or not... The question is: what do you want to achieve? ;-)
