Visualise RNA-seq data in a the form of a sashimi plot

plot_sashimi plots the splicing events and coverage across specific genes/transcripts/regions of interest. Unlike traditional sashimi plots, coverage and junction tracks are separated, which enables user's to choose whether they would like to plot only the junctions.

plot_sashimi(
  junctions,
  ref,
  gene_tx_id,
  gene_tx_col,
  case_id = NULL,
  sum_func = mean,
  region = NULL,
  assay_name = "norm",
  annot_colour = c(ggpubr::get_palette("jco", 1), ggpubr::get_palette("npg", 7)[c(1, 3,
    2, 5, 6)], ggpubr::get_palette("jco", 6)[c(3)]),
  digits = 2,
  count_label = TRUE,
  coverage_paths_case = NULL,
  coverage_paths_control = NULL,
  coverage_chr_control = NULL,
  load_func = .coverage_load,
  binwidth = 100
)

Arguments

junctions: junction data as a RangedSummarizedExperiment-class object.
ref: either path to gtf/gff3 or object of class TxDb-class or EnsDb-class. EnsDb-class is required if you intend to annotate junctions with gene symbols/names.
gene_tx_id: character scalar with the id of the gene. This must be a an identifier for a gene or transcript, which has a matching entry in ref.
gene_tx_col: character scalar with the name of the column to search for the gene_tx_id in ref.
case_id: list containing 1 element. The contents of this element must be a character vector specifying sample ids that are to be plotted. The name of this element must correspond to the column containing sample ids in the junction SummarizedExperiment::mcols(). By default, all cases will be plotted.
sum_func: function that will be used to aggregate the junction counts and coverage for controls. By default, mean will be used.
region: a GenomicRanges of length 1 that is used to filter the exons/junctions plotted. Only those that overlap this region are plotted.
assay_name: a character scalar with the name of the SummarizedExperiment::assay() from which to obtain junction counts.
annot_colour: character vector length 7, representing the colours of junction types.
digits: used in round(), specifying the number of digits to round the junction counts to for visualisation purposes.
count_label: logical value specifying whether to add label the count of each junction.
coverage_paths_case: paths to the BigWig files containing the coverage of your case samples.
coverage_paths_control: paths to the BigWig files for control samples.
coverage_chr_control: either "chr" or "no_chr", indicating the chromosome format of control coverage data. Only required if the chromosome format of the control BigWig files is different to that of your cases.
load_func: function used to load coverage.
binwidth: the number of bases to aggregate coverage across using sum_func when plotting. .

Value

ggplot displaying the splicing (and coverage) surrounding the transcript/region of interest.

Examples


# use GenomicState to load txdb (GENCODE v31)
ref <- GenomicState::GenomicStateHub(
    version = "31",
    genome = "hg38",
    filetype = "TxDb"
)[[1]]
#> snapshotDate(): 2021-10-20
#> loading from cache

junctions_processed <- junction_process(
    junctions_example,
    ref,
    types = c("ambig_gene", "unannotated")
)
#> [1] "# Filtering junctions -----------------------------------------------------"
#> [1] "2022-03-26 19:01:05 - Filtering junctions..."
#> [1] "2022-03-26 19:01:05 - by count..."
#> [1] "2022-03-26 19:01:05 - done!"
#> [1] "# Annotating junctions ----------------------------------------------------"
#> [1] "2022-03-26 19:01:05 - Obtaining co-ordinates of annotated exons and junctions..."
#> [1] "2022-03-26 19:01:19 - Getting junction annotation using overlapping exons..."
#> [1] "2022-03-26 19:01:20 - Tidying junction annotation..."
#> [1] "2022-03-26 19:01:20 - Deriving junction categories..."
#> [1] "2022-03-26 19:01:21 - done!"
#> [1] "# Filtering junctions -----------------------------------------------------"
#> [1] "2022-03-26 19:01:21 - Filtering junctions..."
#> [1] "2022-03-26 19:01:21 - by type..."
#> [1] "2022-03-26 19:01:21 - done!"
#> [1] "# Normalise junctions -----------------------------------------------------"
#> [1] "2022-03-26 19:01:21 - Clustering junctions..."
#> [1] "2022-03-26 19:01:21 - Normalising junction counts..."
#> [1] "2022-03-26 19:01:21 - done!"
#> [1] "# Score junctions ---------------------------------------------------------"
#> [1] "2022-03-26 19:01:21 - Calculating the direction of change of junctions..."
#> [1] "2022-03-26 19:01:21 - Generating junction abnormality score..."
#> [1] "2022-03-26 19:01:21 - done!"

sashimi_plot <- plot_sashimi(
    junctions = junction_filter(junctions_processed),
    ref = ref,
    gene_tx_id = "ENSG00000142156.14",
    gene_tx_col = "gene_id",
    sum_func = NULL
)
#> [1] "2022-03-26 19:01:22 - Filtering junctions..."
#> [1] "2022-03-26 19:01:22 - by count..."
#> [1] "2022-03-26 19:01:22 - done!"