Aim: Connect to R Studio server using docker



Background

  • R Studio server allows users to code in a convenient IDE whilst harnessing the processing power of a computing cluster/server.
  • However, installing and maintaining the R Studio server software can be time-consuming. In addition, the free version of R Studio server only permits a single version of R to be used. Assuming you are not willing to pay, this can limit analyses that depend on packages across multiple versions of R.
  • Using a docker image from the rocker project with R/R Studio server pre-installed, you can bypass both limitations. A user can connect to a container running the rocker image via an ssh tunnel and access R Studio server through their local web browser. Each docker image can have its own installation of R Studio server, so any number of R versions can be used.
  • The following guide will demonstrate how to install and connect to an R Studio server via a rocker image on a remote server.
  • This guide does not cover the fundamentals of docker itself and it is recommended that anyone using this guide should already have a basic proficiency with docker.



Guide

Docker installation

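You must have docker installed on your system. To check whether docker is installed, you can use the commands below; an exit status of 0 from echo $? indicates that the docker command ran successfully: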


# based on: https://www.digitalocean.com/community/questions/how-to-check-for-docker-installation
docker -v
echo $?

If docker is not installed, install it. A guide to installing docker can be found here.
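The same check can be wrapped in a small helper so that a setup script fails with a readable message. This is a minimal sketch: the function name have_cmd is hypothetical, and it only tests that a command is on the PATH, not that the docker daemon is running.

```shell
# Return 0 if the named command is found on PATH, non-zero otherwise.
# have_cmd is a hypothetical helper name used only for this sketch.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

if have_cmd docker; then
  # Print the installed docker client version.
  docker --version
else
  echo "docker not found on PATH; install it before continuing" >&2
fi
```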



Download a docker image with R Studio server pre-installed

To use R Studio server, a docker image with R Studio server pre-installed must be downloaded. Bioconductor releases its own image based on the rocker project, with other useful resources for the analysis of biological data pre-installed, such as core Bioconductor packages. You can download the Bioconductor docker image using:


# the release 3.13 version of Bioconductor rocker image is used here
# be sure to check for an updated version as and when you use this guide
sudo docker pull bioconductor/bioconductor_docker:RELEASE_3_13



Start an R Studio server process

Next, we will create a process running R Studio server on the Bioconductor docker image downloaded above. To do this, you need to pass various flags to the docker command. For convenience, I have wrapped these docker commands in an R function, rutils::docker_run_rserver. It calls the relevant docker commands from within R, with arguments corresponding to the relevant flags, which are explained below.

First, open an R terminal and install the rutils package from GitHub via:

# this requires R version >= 4.0
devtools::install_github("dzhang32/rutils")

Next, run the rutils::docker_run_rserver() function within the R terminal. At a minimum, you should set the image, port and name arguments explained below. Setting the verbose argument to TRUE will print the flags that were used within the docker command and can be useful for debugging or logging your session.

rutils::docker_run_rserver(
  image = "bioconductor/bioconductor_docker:RELEASE_3_13", # rocker image
  port = 8787, # port on the host through which R Studio server will be presented
  name = "example", # name of docker process
  verbose = TRUE # whether to print out the flags passed to the docker command
)
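For readers who want to see roughly what such a wrapper might run, the sketch below prints (but does not execute) a plain docker run command. It is not taken from the rutils source. The helper name rserver_cmd is hypothetical, and the flags (-d, --name, -p, -e PASSWORD) follow the usage documented for the rocker and Bioconductor images, assuming R Studio server listens on port 8787 inside the container.

```shell
# Print a docker run command for an R Studio server container (dry run).
# rserver_cmd is a hypothetical helper, not part of rutils.
# Usage: rserver_cmd IMAGE HOST_PORT NAME PASSWORD
rserver_cmd() {
  image=$1
  host_port=$2
  name=$3
  password=$4
  # -d: run in the background
  # -p: publish host_port on the host, mapped to port 8787 in the container
  # -e PASSWORD: login password read by the rocker-based image at start-up
  printf 'docker run -d --name %s -p %s:8787 -e PASSWORD=%s %s\n' \
    "$name" "$host_port" "$password" "$image"
}

rserver_cmd "bioconductor/bioconductor_docker:RELEASE_3_13" 8787 example bioc
```

Printing the command first makes it easy to inspect the flags before running anything. The flags actually used by rutils::docker_run_rserver() can be checked by setting verbose = TRUE.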



Connecting to an R Studio server process

Now that the R Studio server process is running, you can forward a port on your local machine to the port on the remote server presenting R Studio server (specified above as 8787). An example ssh command is shown below and should be run in your local terminal:


# make sure the port specified here matches the port you have used above in 
# rutils::docker_run_rserver()
ssh -i path_to_pem.pem \
-X -N -f -L localhost:8787:localhost:8787 \
user@ip

If the above ssh command has run successfully, you will be able to access R Studio server by going to the address localhost:8787 in your local browser. The default login details for the Bioconductor docker are:

Username: rstudio

Password: bioc

More details of the Bioconductor docker can be found here.
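The value given to ssh -L has the form local_address:local_port:remote_address:remote_port, so the local port does not have to match the remote one. The sketch below builds that value for a case where local port 8787 is already in use. The helper forward_spec and the port 8888 are illustrative assumptions. The -X flag (X11 forwarding) is omitted because port forwarding alone should be sufficient for browser access.

```shell
# Build the argument for ssh -L: LOCAL_PORT on this machine is forwarded to
# REMOTE_PORT on the remote server's own localhost.
# forward_spec is a hypothetical helper used only for illustration.
forward_spec() {
  local_port=$1
  remote_port=$2
  printf 'localhost:%s:localhost:%s\n' "$local_port" "$remote_port"
}

# Example: R Studio server is published on remote port 8787, but local port
# 8787 is taken, so local port 8888 is used instead.
spec=$(forward_spec 8888 8787)

# Print the resulting ssh command rather than running it.
echo "ssh -i path_to_pem.pem -N -f -L $spec user@ip"
```

With this mapping, the browser address would be localhost:8888 rather than localhost:8787.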



Mounting volumes

Most analyses rely on data that is stored on the original host and is therefore not (by default) accessible to the docker process. It is often useful to mount the required files so that the docker process can access them. Mounting can be configured in rutils::docker_run_rserver() via the arguments volumes and volumes_ro.

The user permissions for accessing the mounted volumes are dictated by the USERID and GROUPID arguments. These should be set to the IDs of the user whose permissions you would like to mirror. On Linux, the USERID and GROUPID of the current user can be obtained via the bash command id.
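As a small illustration, the numeric IDs can be read individually with the -u and -g options of id. The variable names user_id and group_id are arbitrary choices for this sketch.

```shell
# Numeric user ID and primary group ID of the current user.
user_id=$(id -u)
group_id=$(id -g)

# These are the values to pass as USERID and GROUPID.
echo "USERID = $user_id, GROUPID = $group_id"
```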

Below is an example of running rutils::docker_run_rserver() whilst mounting volumes:

# volumes - paths will be mounted with user permissions
# matching user specified by the USERID and GROUPID arguments
# volumes_ro - paths will be mounted with read-only access
rutils::docker_run_rserver(
  image = "bioconductor/bioconductor_docker:RELEASE_3_13",
  port = 8787,
  name = "example_2",
  verbose = TRUE,
  volumes = c(
    "/path/to/mounted/dir"
  ),
  volumes_ro = c(
    "/path/to/mounted/dir"
  ),
  permissions = "match",
  USERID = 1000,
  GROUPID = 1000
)
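At the docker level, mounts like these are usually expressed with the -v flag, where a trailing :ro marks a mount as read-only. The sketch below only prints such flags. The helper mount_flag is hypothetical, the example paths are placeholders, and mounting each path at the same location inside the container is an assumption rather than documented rutils behaviour.

```shell
# Turn a host path into a docker -v flag, mounting it at the same path inside
# the container (an assumption; rutils may choose a different target path).
# mount_flag is a hypothetical helper. Usage: mount_flag PATH [ro]
mount_flag() {
  path=$1
  mode=${2:-}
  if [ "$mode" = "ro" ]; then
    # host_path:container_path:ro gives the container read-only access
    printf '%s\n' "-v $path:$path:ro"
  else
    # host_path:container_path gives read-write access by default
    printf '%s\n' "-v $path:$path"
  fi
}

mount_flag /path/to/data
mount_flag /path/to/reference ro
```

The rocker images also read the USERID and GROUPID environment variables (passed with -e), which is how the permissions of the rstudio user inside the container can be matched to a host user.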



Reproducibility

## ─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────
##  setting  value
##  version  R version 4.1.1 (2021-08-10)
##  os       Ubuntu 20.04.3 LTS
##  system   x86_64, linux-gnu
##  ui       X11
##  language en
##  collate  en_US.UTF-8
##  ctype    en_US.UTF-8
##  tz       UTC
##  date     2022-02-04
##  pandoc   2.11.4 @ /usr/local/bin/ (via rmarkdown)
## 
## ─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────
##  package     * version    date (UTC) lib source
##  bookdown      0.24       2021-09-02 [1] RSPM (R 4.1.0)
##  bslib         0.3.1      2021-10-06 [1] RSPM (R 4.1.0)
##  cachem        1.0.6      2021-08-19 [2] RSPM (R 4.1.0)
##  cli           3.1.1      2022-01-20 [2] RSPM (R 4.1.0)
##  crayon        1.4.2      2021-10-29 [2] RSPM (R 4.1.0)
##  desc          1.4.0      2021-09-28 [2] RSPM (R 4.1.0)
##  digest        0.6.29     2021-12-01 [2] RSPM (R 4.1.0)
##  evaluate      0.14       2019-05-28 [2] RSPM (R 4.1.0)
##  fastmap       1.1.0      2021-01-25 [2] RSPM (R 4.1.0)
##  fs            1.5.2      2021-12-08 [2] RSPM (R 4.1.0)
##  htmltools     0.5.2      2021-08-25 [1] RSPM (R 4.1.0)
##  jquerylib     0.1.4      2021-04-26 [1] RSPM (R 4.1.0)
##  jsonlite      1.7.3      2022-01-17 [2] RSPM (R 4.1.0)
##  knitr       * 1.37       2021-12-16 [2] RSPM (R 4.1.0)
##  magrittr      2.0.2      2022-01-26 [2] RSPM (R 4.1.0)
##  memoise       2.0.1      2021-11-26 [2] RSPM (R 4.1.0)
##  pkgdown       2.0.2.9000 2022-01-30 [1] Github (r-lib/pkgdown@ae7363f)
##  purrr         0.3.4      2020-04-17 [2] RSPM (R 4.1.0)
##  R6            2.5.1      2021-08-19 [2] RSPM (R 4.1.0)
##  ragg          1.2.1      2021-12-06 [1] RSPM (R 4.1.0)
##  rlang         1.0.0      2022-01-26 [2] RSPM (R 4.1.0)
##  rmarkdown     2.11       2021-09-14 [1] RSPM (R 4.1.0)
##  rprojroot     2.0.2      2020-11-15 [2] RSPM (R 4.1.0)
##  sass          0.4.0      2021-05-12 [1] RSPM (R 4.1.0)
##  sessioninfo * 1.2.2      2021-12-06 [2] RSPM (R 4.1.0)
##  stringi       1.7.6      2021-11-29 [2] RSPM (R 4.1.0)
##  stringr       1.4.0      2019-02-10 [2] RSPM (R 4.1.0)
##  systemfonts   1.0.3      2021-10-13 [1] RSPM (R 4.1.0)
##  textshaping   0.3.6      2021-10-13 [1] RSPM (R 4.1.0)
##  xfun          0.29       2021-12-14 [2] RSPM (R 4.1.0)
##  yaml          2.2.2      2022-01-25 [2] RSPM (R 4.1.0)
## 
##  [1] /__w/_temp/Library
##  [2] /usr/local/lib/R/site-library
##  [3] /usr/local/lib/R/library
## 
## ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────