junction_load loads raw patient and control junction data and formats it into a RangedSummarizedExperiment-class object. Control samples can be the user's in-house samples or selected from GTEx v6 data, publicly released through recount2 and downloaded through snaptron. By default, junction_load expects the junction data to be in STAR-aligned format (SJ.out), but this can be changed via the load_func argument.
junction_paths: path(s) to junction data.
metadata: data.frame containing sample metadata with rows in the same order as junction_paths.
controls: either a logical vector of the same length as junction_paths, with TRUE representing controls, or one of "fibroblasts", "lymphocytes", "skeletal_muscle", "whole_blood", naming the GTEx tissue whose samples should be used as controls. By default, all samples are assumed to be patients.
load_func: function used to load junctions. By default, STAR-formatted junctions (SJ.out) are expected, but this can be changed depending on the format of the user's junction data. The function must take a junction path as input and return a data.frame with the columns "chr", "start", "end", "strand" and "count".
chrs: chromosomes to keep. By default, no filter is applied.
coord_system: 1 (1-based) or 0 (0-based), denoting the co-ordinate system of the user junctions from junction_paths. Only used when controls is set to "fibroblasts" to ensure GTEx data is harmonised to match the co-ordinate system of the user's junctions.
RangedSummarizedExperiment-class object containing junction data.
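When in-house controls are used, the sample metadata and the logical controls vector both have to line up with junction_paths. The following sketch shows one way to prepare them. The file names and the tissue column are hypothetical, and the samp_id column name is only borrowed from the colData names printed in the example output.

```r
# Hypothetical paths: two patient samples followed by one in-house control.
junction_paths <- c(
    "patient_1_SJ.out.tab",
    "patient_2_SJ.out.tab",
    "control_1_SJ.out.tab"
)

# One metadata row per path, in the same order as junction_paths.
# The columns used here are illustrative.
metadata <- data.frame(
    samp_id = c("patient_1", "patient_2", "control_1"),
    tissue = c("fibroblast", "fibroblast", "fibroblast"),
    stringsAsFactors = FALSE
)

# TRUE marks a control; the vector must be as long as junction_paths.
controls <- c(FALSE, FALSE, TRUE)

# Cheap sanity checks before handing everything to junction_load.
stopifnot(
    nrow(metadata) == length(junction_paths),
    length(controls) == length(junction_paths)
)

# Not run here because the files above do not exist:
# junctions <- junction_load(
#     junction_paths = junction_paths,
#     metadata = metadata,
#     controls = controls
# )
```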
junctions_example_1_path <-
    system.file("extdata",
        "junctions_example_1.txt",
        package = "dasper",
        mustWork = TRUE
    )
junctions_example_2_path <-
    system.file("extdata",
        "junctions_example_2.txt",
        package = "dasper",
        mustWork = TRUE
    )
junctions <-
    junction_load(
        junction_paths = c(junctions_example_1_path, junctions_example_2_path)
    )
#> [1] "2022-03-26 18:59:16 - Loading junctions for sample 1/2..."
#> [1] "2022-03-26 18:59:16 - Loading junctions for sample 2/2..."
#> [1] "2022-03-26 18:59:16 - Adding control junctions..."
#> [1] "2022-03-26 18:59:16 - Tidying and storing junction data as a RangedSummarizedExperiment..."
#> [1] "2022-03-26 18:59:16 - done!"
junctions
#> class: RangedSummarizedExperiment
#> dim: 19733 2
#> metadata(0):
#> assays(1): raw
#> rownames: NULL
#> rowData names(0):
#> colnames(2): count_1 count_2
#> colData names(2): samp_id case_control
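The printed summary lists a "raw" assay, per-sample colData and genomic ranges for each junction. As a rough sketch of how such an object can be inspected, the code below builds a small stand-in RangedSummarizedExperiment by hand and reads it with the standard SummarizedExperiment accessors. The toy counts, ranges and case_control values are invented for illustration; only the assay name and column names mirror the output above. It assumes the Bioconductor packages SummarizedExperiment, GenomicRanges and S4Vectors are installed.

```r
library(SummarizedExperiment)

# Toy counts: 2 junctions (rows) by 2 samples (columns).
raw_counts <- matrix(
    c(5L, 0L, 12L, 3L),
    nrow = 2,
    dimnames = list(NULL, c("count_1", "count_2"))
)

# Stand-in object with the same assay and colData names as the output above.
toy_rse <- SummarizedExperiment(
    assays = list(raw = raw_counts),
    rowRanges = GenomicRanges::GRanges(c("chr1:100-200:+", "chr1:300-400:-")),
    colData = S4Vectors::DataFrame(
        samp_id = c("samp_1", "samp_2"),
        case_control = c("case", "case"),
        row.names = c("count_1", "count_2")
    )
)

# Junction-by-sample count matrix.
assay(toy_rse, "raw")

# Per-sample metadata.
colData(toy_rse)

# Genomic co-ordinates of each junction.
rowRanges(toy_rse)
```

The same three accessors should apply to the junctions object returned by junction_load, since it is of the same class.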