Question: Parallel computing for IsoformSwitchAnalyzeR
0
gravatar for maya.kappil
3 months ago by
maya.kappil10
maya.kappil10 wrote:

Hello! I am interested to run the IsoformSwitchAnalyzeR package to assess differential transcript usage and related downstream analysis. However, I have a large set of RNAseq samples (200) - for the native DRIMSeq and DEXSeq packages, it is possible to add a command to indicate the number of cores to utilize to reduce the running time. For example:

  d <- dmPrecision(d, design=design_full,BPPARAM = BiocParallel::SerialParam())
  d <- dmFit(d, design=design_full,BPPARAM = BiocParallel::SerialParam())
  d <- dmTest(d, coef="condition2",BPPARAM = BiocParallel::SerialParam())

Do you have suggestions for how to accomplish this when running the isoformSwitchTestDEXSeq or isoformSwitchTestDRIMSeq commands?

Thanks!

isoformswitchanalyzer • 124 views
ADD COMMENTlink modified 3 months ago by k.vitting.seerup90 • written 3 months ago by maya.kappil10
Answer: Parallel computing for IsoformSwitchAnalyzeR
1
gravatar for k.vitting.seerup
3 months ago by
European Union
k.vitting.seerup90 wrote:

Hi Maya

Thanks for reaching out. Unfortunately there is no way of doing that from within IsoformSwitchAnalyzeR at the moment. That said the runtime of DRIMSeq mostly scales with the number of transcripts analyzed (and a little bit with the number of samples) so if you just make sure to use preFilter() first isoformSwitchTestDRIMSeq() should run in 30-60 min for ~30k transcripts.

Cheers Kristoffer

ADD COMMENTlink written 3 months ago by k.vitting.seerup90

Hmm... I have somewhat more transcripts using the default preFilter setting (~80K). And, the calculated time to run DEXSeq seemed really large (17763754)... So far, I've tried running it for 15hrs (both correcting for a confounding variable and without correction) and they did not go to completion. I will try a more stringent filter to see if that helps. But just in case there is something obvious that I'm specifying incorrectly, below is the code I used:

Generate switchlist

base.dir<-getwd()

salmonQuant <- importIsoformExpression( + parentDir = paste0(base.dir,"/quants"),addIsofomIdAsColumn=TRUE)

coldata<-read.csv("Covariates.csv") summary(coldata$BW) AGA LGA SGA 112 55 33 coldata$batch<-factor(coldata$batch)

myDesign<-data.frame(sampleID=coldata$ID, + condition=coldata$BW, + batch=coldata$batch)

contrast<-data.frame(condition1="AGA",condition2="SGA")

aSwitchList_SGA <- importRdata( + isoformCountMatrix = salmonQuant$counts, + isoformRepExpression = salmonQuant$abundance, + designMatrix = myDesign, + comparisonsToMake=contrast, + isoformExonAnnoation = paste0(base.dir,"/gencode.v28.annotation.gtf.gz"), + showProgress = FALSE + )

aSwitchList_SGA This switchAnalyzeRlist list contains: 199338 isoforms from 57599 genes 1 comparison from 2 conditions Feature analyzed: [1] "ORFs"

Filter

aSwitchListSGA <- preFilter( + switchAnalyzeRlist = aSwitchListSGA, + geneExpressionCutoff = 1, + isoformExpressionCutoff = 0, + removeSingleIsoformGenes = TRUE + ) The fitering removed 121323 ( 60.86% of ) transcripts. There is now 78015 isoforms left

DEXSeq

aSwitchListSGADEXSeq <- isoformSwitchTestDEXSeq( + switchAnalyzeRlist = aSwitchList_SGA, + correctForConfoundingFactors=TRUE, + reduceToSwitchingGenes=TRUE + ) Step 1 of 4: Correcting for batch effect... Step 2 of 4: Calculating Isoform Fraction matrix... Step 3 of 4: Testing each pariwise comparisons with DEXSeq (this might be a bit slow)... Estimated time (for dataset with ~30.000 isoforms): 17763754 min

ADD REPLYlink written 3 months ago by maya.kappil10
1

You are right that DEXSeq is infeasible (it scales exponentially with number of samples) so it would need to be DRIMSeq. Your code looks good. You could try filtering a bit more strict - I would suggest setting IFcutoff=0.1 - meaning you are only considering isoforms which on average contribute with at least 10% to the parent gene expression.

ADD REPLYlink written 3 months ago by k.vitting.seerup90
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 16.09
Traffic: 167 users visited in the last hour