I want to run an R script in Python using rpy2. I already know how to run the script itself.

The R code is:

dataR = data.frame( Ingresos = c(23,45,24,23,54),
                    Bonos = c(23,45,12,67,54),
                    Deuda = c(23,4,1,6,3),
                    row.names = c("Nathy", "Tomas", "Joe", "Emily", "Javi") )
dataR
promedio_ingresos = mean(dataR$Ingresos)
Max_Ing = sort(dataR$Ingresos[dataR$Ingresos>promedio_ingresos])
Max_Ing

To run this R script in python I use:

import rpy2
from rpy2.robjects.packages import importr
import rpy2.robjects as robjects
r = robjects.r
output = r.source("R_script_run_in_python.R")
output

And output gets the last value from my R code

Now I want to run the same code, but using data that I define in Python, for example:

import numpy as np
import pandas as pd
df = pd.DataFrame( np.random.randn(5,3),
                   columns = ["Ingresos","Bonos","Deuda"],
                   index = ["Max", "Nathy", "Tom", "Joe", "Kathy"] )

So the R code I want to run now is just:

promedio_ingresos = mean(dataR$Ingresos)
Max_Ing = sort(dataR$Ingresos[dataR$Ingresos>promedio_ingresos])
Max_Ing

But with dataR being the pandas DataFrame df. How can I do that?

  • Curious why you need to run in R what can easily be run in pandas? Usually rpy2 is used for specialized modules/libraries available in one and not the other. Commented Dec 4, 2018 at 21:03
  • @Parfait I have some code in R and some in Python. We are now migrating everything to Python, but there are really nice libraries and code developed in R that I want to take advantage of, for example the scorecard library from R. Also, I would rather not spend too much time migrating code from R to Python. Commented Dec 4, 2018 at 22:57
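
For comparison with the first comment, here is a rough sketch of the same calculation done directly in pandas, with no R involved. It reuses the values from the R data.frame in the question; the variable names `data` and `max_ing` are only illustrative. It does not replace the rpy2 route when the goal is to call R-only libraries such as scorecard.

```python
import pandas as pd

# Same values as the R data.frame in the question
data = pd.DataFrame(
    {
        "Ingresos": [23, 45, 24, 23, 54],
        "Bonos": [23, 45, 12, 67, 54],
        "Deuda": [23, 4, 1, 6, 3],
    },
    index=["Nathy", "Tomas", "Joe", "Emily", "Javi"],
)

# Equivalent of mean(dataR$Ingresos)
promedio_ingresos = data["Ingresos"].mean()

# Equivalent of sort(dataR$Ingresos[dataR$Ingresos > promedio_ingresos])
max_ing = data.loc[data["Ingresos"] > promedio_ingresos, "Ingresos"].sort_values()

print(promedio_ingresos)
print(max_ing.tolist())
```

Unlike the R vector, `max_ing` is a pandas Series, so it keeps the row labels alongside the values.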

1 Answer

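I tried this and it worked:

# Imports
import numpy as np
import pandas as pd
import rpy2.robjects as robjects
from rpy2.robjects import pandas2ri

# Data
# pandas DataFrame
df = pd.DataFrame( np.random.randn(5,3),
                   columns = ["Ingresos","Bonos","Deuda"],
                   index = ["Max", "Nathy", "Tom", "Joe", "Kathy"] )
# rpy2 data frame
# (py2ri is the rpy2 2.x name; rpy2 3.x renamed it to py2rpy)
dataR = pandas2ri.py2ri(df)

# R code
# Make dataR visible to R, then run the R statements against it
robjects.globalenv["dataR"] = dataR
robjects.r('''
           promedio_ingresos = mean(dataR$Ingresos)
           Max_Ing = sort(dataR$Ingresos[dataR$Ingresos>promedio_ingresos])
''')
# Read the results back from R's global environment
print(robjects.globalenv["dataR"])
print(robjects.globalenv["promedio_ingresos"])
print(robjects.globalenv["Max_Ing"])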