Note
The rmagic extension has been moved to rpy2 as rpy2.interactive.ipython.
Magic command interface for interactive work with R via rpy2
Note
The rpy2 package needs to be installed separately. It can be obtained using easy_install or pip.
You will also need a working copy of R.
To enable the magics below, execute %load_ext rmagic.
%R
%R [-i INPUT] [-o OUTPUT] [-w WIDTH] [-h HEIGHT] [-d DATAFRAME]
[-u {px,in,cm,mm}] [-r RES] [-p POINTSIZE] [-b BG] [-n]
[code [code ...]]
Execute code in R, and pull some of the results back into the Python namespace.
In line mode, this will evaluate an expression and convert the returned value to a Python object. The return value is determined by rpy2’s behaviour of returning the result of evaluating the final line.
Multiple R lines can be executed by joining them with semicolons:
In [9]: %R X=c(1,4,5,7); sd(X); mean(X)
Out[9]: array([ 4.25])
In cell mode, this will run a block of R code. The resulting value is printed if it would be printed when evaluating the same code within a standard R REPL.
Nothing is returned to python by default in cell mode:
In [10]: %%R
....: Y = c(2,4,3,9)
....: summary(lm(Y~X))
Call:
lm(formula = Y ~ X)
Residuals:
1 2 3 4
0.88 -0.24 -2.28 1.64
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.0800 2.3000 0.035 0.975
X 1.0400 0.4822 2.157 0.164
Residual standard error: 2.088 on 2 degrees of freedom
Multiple R-squared: 0.6993,Adjusted R-squared: 0.549
F-statistic: 4.651 on 1 and 2 DF, p-value: 0.1638
In the notebook, plots are published as the output of the cell:
%R plot(X, Y)
will create a scatter plot of X vs Y.
In cell mode, any R code given on the magic line itself is prepended to the R code in the cell.
Objects can be passed back and forth between rpy2 and python via the -i and -o flags on the magic line:
In [14]: Z = np.array([1,4,5,10])
In [15]: %R -i Z mean(Z)
Out[15]: array([ 5.])
In [16]: %R -o W W=Z*mean(Z)
Out[16]: array([ 5., 20., 25., 50.])
In [17]: W
Out[17]: array([ 5., 20., 25., 50.])
The return value is determined by these rules:
The --dataframe argument will attempt to return structured arrays. This is useful for data.frames with mixed data types. Note also that a data.frame returned as a plain ndarray is transposed:
In [18]: dtype=[('x', '<i4'), ('y', '<f8'), ('z', '|S1')]
In [19]: datapy = np.array([(1, 2.9, 'a'), (2, 3.5, 'b'), (3, 2.1, 'c'), (4, 5, 'e')], dtype=dtype)
In [20]: %%R -o datar
datar = datapy
....:
In [21]: datar
Out[21]:
array([['1', '2', '3', '4'],
['2', '3', '2', '5'],
['a', 'b', 'c', 'e']],
dtype='|S1')
In [22]: %%R -d datar
datar = datapy
....:
In [23]: datar
Out[23]:
array([(1, 2.9, 'a'), (2, 3.5, 'b'), (3, 2.1, 'c'), (4, 5.0, 'e')],
dtype=[('x', '<i4'), ('y', '<f8'), ('z', '|S1')])
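The difference between the two results above can be reproduced with NumPy alone. The following is a minimal sketch, not rmagic's conversion code: it assumes a Unicode 'U1' field in place of '|S1' so that it runs unchanged on Python 3, and it builds the plain array by hand to show why mixed columns collapse to strings.

```python
import numpy as np

# Same layout as datapy above; 'U1' replaces '|S1' (an assumption for Python 3).
dtype = [("x", "<i4"), ("y", "<f8"), ("z", "U1")]
datapy = np.array(
    [(1, 2.9, "a"), (2, 3.5, "b"), (3, 2.1, "c"), (4, 5.0, "e")], dtype=dtype
)

# Structured array: one record per row, and each field keeps its own dtype.
x_total = datapy["x"].sum()
y_mean = datapy["y"].mean()

# Plain ndarray: every element shares one dtype, so the columns are coerced
# to strings and stacked as rows (the "transposed" layout shown above).
columns = np.array([datapy[name].astype(str) for name in datapy.dtype.names])

print(datapy.shape, columns.shape)
```

With the structured form, numeric fields remain usable for arithmetic; with the plain form, every value would need to be parsed back from text.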
The --dataframe argument first tries colnames, then names. If both are NULL, it returns a plain (unstructured) ndarray:
In [1]: %R mydata=c(4,6,8.3); NULL
In [2]: %R -d mydata
In [3]: mydata
Out[3]: array([ 4. , 6. , 8.3])
In [4]: %R names(mydata) = c('a','b','c'); NULL
In [5]: %R -d mydata
In [6]: mydata
Out[6]:
array((4.0, 6.0, 8.3),
dtype=[('a', '<f8'), ('b', '<f8'), ('c', '<f8')])
In [7]: %R -o mydata
In [8]: mydata
Out[8]: array([ 4. , 6. , 8.3])
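The lookup order described above (colnames, then names, otherwise a plain array) can be sketched in NumPy. The helper `as_structured` below is hypothetical and is not part of rmagic or rpy2; it assumes a single vector of floating-point values and only illustrates the fallback logic.

```python
import numpy as np

def as_structured(values, colnames=None, names=None):
    """Hypothetical helper: prefer colnames, then names, else stay unstructured."""
    labels = colnames if colnames is not None else names
    if labels is None:
        # No labels at all: return an ordinary 1-d float array.
        return np.asarray(values, dtype=float)
    # One float field per label; a tuple is read as a single record.
    dtype = [(label, "<f8") for label in labels]
    return np.array(tuple(values), dtype=dtype)

plain = as_structured([4.0, 6.0, 8.3])
record = as_structured([4.0, 6.0, 8.3], names=["a", "b", "c"])

print(plain.shape, record.shape, record.dtype.names)
```

The labelled result is a zero-dimensional structured array, which matches the shape of the `array((4.0, 6.0, 8.3), dtype=...)` output in the session above; its fields are read with `record["a"]` rather than by position.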
-i INPUT, --input INPUT
Names of input variables from shell.user_ns to be assigned to R variables of the same names after calling self.pyconverter. Multiple names can be passed separated only by commas with no whitespace.
-o OUTPUT, --output OUTPUT
Names of variables to be pushed from rpy2 to shell.user_ns after executing cell body and applying self.Rconverter. Multiple names can be passed separated only by commas with no whitespace.
-w WIDTH, --width WIDTH
Width of png plotting device sent as an argument to png in R.
-h HEIGHT, --height HEIGHT
Height of png plotting device sent as an argument to png in R.
-d DATAFRAME, --dataframe DATAFRAME
Convert these objects to data.frames and return as structured arrays.
-u {px,in,cm,mm}, --units {px,in,cm,mm}
Units of png plotting device sent as an argument to png in R. One of [“px”, “in”, “cm”, “mm”].
-r RES, --res RES
Resolution of png plotting device sent as an argument to png in R. Defaults to 72 if units is one of [“in”, “cm”, “mm”].
-p POINTSIZE, --pointsize POINTSIZE
Pointsize of png plotting device sent as an argument to png in R.
-b BG, --bg BG
Background of png plotting device sent as an argument to png in R.
-n, --noreturn
Force the magic to not return anything.
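To make the option semantics concrete, here is a rough sketch that mirrors the usage line with Python's standard argparse module. It is an illustration under assumptions, not rmagic's actual implementation: the names `build_parser`, `split_names` and `png_options` are hypothetical, and the integer types for the numeric options are a guess.

```python
import argparse

def build_parser():
    # add_help=False frees -h so it can mean --height, as in the usage line.
    p = argparse.ArgumentParser(prog="R", add_help=False)
    p.add_argument("-i", "--input", action="append", default=[])
    p.add_argument("-o", "--output", action="append", default=[])
    p.add_argument("-w", "--width", type=int)
    p.add_argument("-h", "--height", type=int)
    p.add_argument("-d", "--dataframe", action="append", default=[])
    p.add_argument("-u", "--units", choices=["px", "in", "cm", "mm"])
    p.add_argument("-r", "--res", type=int)
    p.add_argument("-p", "--pointsize", type=int)
    p.add_argument("-b", "--bg")
    p.add_argument("-n", "--noreturn", action="store_true")
    p.add_argument("code", nargs="*")
    return p

def split_names(values):
    # "X,Y" -> ["X", "Y"]: commas only, no whitespace, as documented above.
    return [name for value in values for name in value.split(",")]

def png_options(args):
    # Collect only the plotting options that were actually given.
    keys = ("width", "height", "units", "res", "pointsize", "bg")
    opts = {k: getattr(args, k) for k in keys if getattr(args, k) is not None}
    # Documented default: res is 72 when units is in, cm or mm.
    if opts.get("units") in ("in", "cm", "mm") and "res" not in opts:
        opts["res"] = 72
    return opts

args = build_parser().parse_args("-i Z,Y -o W -w 8 -h 6 -u in W=Z*mean(Z)".split())
print(split_names(args.input), split_names(args.output), png_options(args), args.code)
```

Splitting on commas rather than whitespace keeps each -i or -o value a single shell-style token, so everything after the options can be treated as R code.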
%Rpush
A line-level magic for R that pushes variables from python to rpy2. The line should be made up of whitespace-separated variable names in the IPython namespace:
In [7]: import numpy as np
In [8]: X = np.array([4.5,6.3,7.9])
In [9]: X.mean()
Out[9]: 6.2333333333333343
In [10]: %Rpush X
In [11]: %R mean(X)
Out[11]: array([ 6.23333333])
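Unlike the -i flag of %R, which takes comma-separated names, %Rpush takes names separated by whitespace. The lookup step can be sketched as follows; `select_for_push` is a hypothetical helper written for illustration, and the plain dict stands in for the IPython user namespace.

```python
def select_for_push(line, user_ns):
    """Hypothetical helper: map each whitespace-separated name to its value."""
    names = line.split()  # any run of whitespace separates names
    missing = [name for name in names if name not in user_ns]
    if missing:
        raise NameError("not defined in namespace: " + ", ".join(missing))
    return {name: user_ns[name] for name in names}

# A plain dict standing in for the interactive namespace (an assumption).
user_ns = {"X": [4.5, 6.3, 7.9], "Y": [1.0, 2.0], "unused": None}
to_push = select_for_push("X   Y", user_ns)
print(sorted(to_push))
```

Only the names listed on the line are selected, so unrelated variables in the namespace are left alone.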
%Rpull
%Rpull [-d] [outputs [outputs ...]]
A line-level magic for R that pulls variables from rpy2 into the python namespace:
In [18]: _ = %R x = c(3,4,6.7); y = c(4,6,7); z = c('a',3,4)
In [19]: %Rpull x y z
In [20]: x
Out[20]: array([ 3. , 4. , 6.7])
In [21]: y
Out[21]: array([ 4., 6., 7.])
In [22]: z
Out[22]:
array(['a', '3', '4'],
dtype='|S1')
If --as_dataframe is given, each object is first passed through as.data.frame in R and then through self.Rconverter, and is returned as a structured array. This is useful when a structured array is desired as output, or when the object in R has mixed data types. See the %%R docstring for more examples.
Beware that R names can contain ‘.’, which is not valid in a Python variable name, so this is not foolproof. To avoid the problem, do not use ‘.’ in the names of R objects you intend to pull.
-d, --as_dataframe
Convert objects to data.frames before returning to IPython.
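The caveat about ‘.’ in R names can be checked on the Python side before pulling. The sketch below uses only the standard library; `python_safe_name` is a hypothetical helper, and replacing ‘.’ with ‘_’ is one possible convention, not something rmagic does for you.

```python
import keyword

def python_safe_name(r_name):
    """Hypothetical helper: turn an R name into a usable Python identifier."""
    candidate = r_name.replace(".", "_")
    if not candidate.isidentifier() or keyword.iskeyword(candidate):
        raise ValueError("cannot derive a Python name from %r" % r_name)
    return candidate

r_names = ["x", "my.data", "fit.lm"]
renamed = {name: python_safe_name(name) for name in r_names}
print(renamed)
```

One way to use such a mapping would be to assign the object to the safe name inside R first and then pull that name.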
%Rget
%Rget [-d] output
Return an object from rpy2, optionally as a structured array. Similar to %Rpull, except that only one argument is accepted and the value is returned rather than pushed to self.shell.user_ns:
In [3]: dtype=[('x', '<i4'), ('y', '<f8'), ('z', '|S1')]
In [4]: datapy = np.array([(1, 2.9, 'a'), (2, 3.5, 'b'), (3, 2.1, 'c'), (4, 5, 'e')], dtype=dtype)
In [5]: %R -i datapy
In [6]: %Rget datapy
Out[6]:
array([['1', '2', '3', '4'],
['2', '3', '2', '5'],
['a', 'b', 'c', 'e']],
dtype='|S1')
In [7]: %Rget -d datapy
Out[7]:
array([(1, 2.9, 'a'), (2, 3.5, 'b'), (3, 2.1, 'c'), (4, 5.0, 'e')],
dtype=[('x', '<i4'), ('y', '<f8'), ('z', '|S1')])
-d, --as_dataframe
Convert objects to data.frames before returning to IPython.