Question

High Memory Usage when running FRASER's countRNAData


jbendik • 0


Hello!

I would like to use FRASER on a set of 72 BAM files containing RNA-seq data for 2 groups, aligned with STAR. I have been attempting to run countRNAData on our cluster with a maximum allocation of 800 GB of memory, but that does not seem to be enough. It does work if I run each sample individually; however, I believe FRASER requires the fds object to be created from a table of all BAM files. When I look at the resulting fds object for one of my samples, I see a large number of junctions (>300,000) and splice sites (>200,000).

Would it be possible to create a separate fds object for each sample and then combine them? Or does anyone who has used this package know a better way to get this running?


library(FRASER)
library(data.table)  # provides fread()

# Sample annotation table; includes the bamFile column with the BAM paths
sampleTable <- fread("FRASERsampleTable.txt")

settings <- FraserDataSet(colData=sampleTable, workingDir='./fraserCountsOutput', strandSpecific=2L)

# Count in parallel with at most 8 workers
if(.Platform$OS.type == "unix") {
    register(MulticoreParam(workers=min(8, multicoreWorkers())))
}

fds <- countRNAData(settings)
save(fds, file="./fraserCountsOutput/jhh_FRASER_counts.RData")


Posted 12 months ago by jbendik