junction_load loads in raw patient and control junction data and formats it into a RangedSummarizedExperiment-class object. Control samples can be the user's in-house samples or selected from GTEx v6 data publicly released through the recount2 and downloaded through snaptron. By default, junction_load expects the junction data to be in STAR aligned format (SJ.out) but this can be modified via the argument load_func.

junction_load(
  junction_paths,
  metadata = dplyr::tibble(samp_id = stringr::str_c("samp_",
    seq_along(junction_paths))),
  controls = rep(FALSE, length(junction_paths)),
  load_func = .STAR_load,
  chrs = NULL,
  coord_system = 1
)

Arguments

junction_paths

path(s) to junction data.

metadata

data.frame containing sample metadata with rows in the same order as junction_paths.

controls

either a logical vector of the same length as junction_paths with TRUE representing controls. Or, one of "fibroblasts", "lymphocytes", "skeletal_muscle", "whole_blood" representing the samples of which GTEx tissue to use as controls. By default, will assume all samples are patients.

load_func

function to load in junctions. By default, requires STAR formatted junctions (SJ.out). But this can be switched dependent on the format of the user's junction data. Function must take as input a junction path then return a data.frame with the columns "chr", "start", "end", "strand" and "count".

chrs

chromosomes to keep. By default, no filter is applied.

coord_system

1 (1-based) or 0 (0-based) denoting the co-ordinate system corresponding to the user junctions from junction_paths. Only used when controls is set to "fibroblasts" to ensure GTEx data is harmonised to match the co-ordinate system of the user's junctions.

Value

RangedSummarizedExperiment-classobject containing junction data.

Examples


junctions_example_1_path <-
    system.file("extdata",
        "junctions_example_1.txt",
        package = "dasper",
        mustWork = TRUE
    )
junctions_example_2_path <-
    system.file("extdata",
        "junctions_example_2.txt",
        package = "dasper",
        mustWork = TRUE
    )

junctions <-
    junction_load(
        junction_paths = c(junctions_example_1_path, junctions_example_2_path)
    )
#> [1] "2022-03-26 18:59:16 - Loading junctions for sample 1/2..."
#> [1] "2022-03-26 18:59:16 - Loading junctions for sample 2/2..."
#> [1] "2022-03-26 18:59:16 - Adding control junctions..."
#> [1] "2022-03-26 18:59:16 - Tidying and storing junction data as a RangedSummarizedExperiment..."
#> [1] "2022-03-26 18:59:16 - done!"

junctions
#> class: RangedSummarizedExperiment 
#> dim: 19733 2 
#> metadata(0):
#> assays(1): raw
#> rownames: NULL
#> rowData names(0):
#> colnames(2): count_1 count_2
#> colData names(2): samp_id case_control