Search
Question: Is possible to apply 'svaseq' with 'DEXSeq' design in order to remove unknown batch effects ?
0
gravatar for tofukaj
9 months ago by
tofukaj0
tofukaj0 wrote:

Dear great R helpers,

I would like to know if it is possible to apply 'svaseq' with 'DEXSeq' design in order to remove unknown batch effects similar to those greatly demonstrated with 'DESeq' (http://www.bioconductor.org/help/workflows/rnaseqGene/).

Due to my poor background in bioinformatics and the difference between DEXSeq  and DESeq design, it would be very kind of you if an example could be provided.

 

With Respects,

Kaj Chokeshaiusaha

ADD COMMENTlink modified 9 months ago by Alejandro Reyes1.5k • written 9 months ago by tofukaj0
1
gravatar for Alejandro Reyes
9 months ago by
Alejandro Reyes1.5k
Dana-Farber Cancer Institute, Boston, USA
Alejandro Reyes1.5k wrote:

Hi Kaj, 

I have not done that myself. However it should be possible to run svaseq and then add the surrogate variables as a covariate of the DEXSeq models, as indicated by section 4 of the DEXSeq vignette "Additional technical or experimental variables".

Alejandro

ADD COMMENTlink written 9 months ago by Alejandro Reyes1.5k

Thank you very much, I will definitely have a look at the section you referred.

ADD REPLYlink written 9 months ago by tofukaj0

Dear Alejandro,

I have tried svaseq analysis with  pasilla data in DEXSeq and have got an error. The script was as following


library( "DEXSeq" )
library( "sva" )

inDir = system.file("extdata", package="pasilla")
countFiles = list.files(inDir, pattern="fb.txt$", full.names=TRUE)
basename(countFiles)
flattenedFile = list.files(inDir, pattern="gff$", full.names=TRUE)
basename(flattenedFile)

sampleTable = data.frame(row.names = c( "treated1", "treated2", "treated3", "untreated1", "untreated2", "untreated3", "untreated4" ),
                         condition = c("knockdown", "knockdown", "knockdown", "control", "control", "control", "control" ),
                           libType = c( "single-end", "paired-end", "paired-end", "single-end", "single-end", "paired-end", "paired-end" ) )

##----Generate DEXSeqDataSet----------##
dxd = DEXSeqDataSetFromHTSeq(countFiles,
                             sampleData=sampleTable,
                             design= ~ sample + exon + condition:exon,
                             flattenedfile=flattenedFile )
dxd <- estimateSizeFactors(dxd)
dxd <- dxd[rowMeans(counts(dxd))>1,]

##------------Try sva----------##
dat <- counts(dxd, normalized=TRUE)

mod1 <- model.matrix(~ sample + exon + condition:exon, colData(dxd))
mod0 <- model.matrix(~ sample + exon, colData(dxd))
svseq <- svaseq(dat, mod1, mod0, n.sv=NULL, B=5)


Error in solve.default(t(mod) %*% mod) :
  system is computationally singular: reciprocal condition number = 4.47343e-18


Would you please suggest me what is wrong here?

Best Regards,

Kaj

ADD REPLYlink written 9 months ago by tofukaj0

Hi Kaj, 

For svaseq, maybe it is unnecessary to specify the DEXSeq interaction model if you are looking for unknown covariates. I would  try to estimate the surrogate variables for exon level counts (extracted using the featureCounts accessor) or gene level counts using "simpler" models:

mod1 <- ~ condition
mod0 <- ~ 1

And then plug in the resulting surrogate variables into the DEXSeqDataSet object. 

ADD REPLYlink written 9 months ago by Alejandro Reyes1.5k

Thank you very much!! I hope this one looks ok for you.


library( "DEXSeq" )
library( "DESeq2" )
library( "sva" )

inDir = system.file("extdata", package="pasilla")
countFiles = list.files(inDir, pattern="fb.txt$", full.names=TRUE)
basename(countFiles)
flattenedFile = list.files(inDir, pattern="gff$", full.names=TRUE)
basename(flattenedFile)

sampleTable = data.frame(row.names = c( "treated1", "treated2", "treated3", "untreated1", "untreated2", "untreated3", "untreated4" ),
                         condition = c("knockdown", "knockdown", "knockdown", "control", "control", "control", "control" ),
                           libType = c( "single-end", "paired-end", "paired-end", "single-end", "single-end", "paired-end", "paired-end" ) )

##----Generate DEXSeqDataSet---##
dxd = DEXSeqDataSetFromHTSeq(countFiles,
                             sampleData=sampleTable,
                             design= ~ sample + exon + condition:exon,
                             flattenedfile=flattenedFile )

##----Select only some genes----##
genesForSubset = read.table(file.path(inDir, "geneIDsinsubset.txt"),
                            stringsAsFactors=FALSE)[[1]]
dxd = dxd[geneIDs( dxd ) %in% genesForSubset,]
dxd <- estimateSizeFactors(dxd)
dxd <- dxd[rowMeans(counts(dxd))>1,]

##------------sva----------##
dat <- counts(dxd, normalized=TRUE)

mod1 <- model.matrix(~ sample + exon , colData(dxd))
mod0 <- model.matrix(~ 1, colData(dxd))
svseq <- svaseq(dat, mod1, mod0, n.sv=NULL, B=5)
#Number of significant surrogate variables is:  2

##-----Create DEXSeqDataSet with acquired surrogate vectors-----##
dxdsva = dxd
dxdsva$SV1=svseq$sv[,1]
dxdsva$SV2=svseq$sv[,2]
design(dxdsva) <- ~ SV1 + SV2 + sample + exon + condition:exon

##-------Test for differential exon usage--------##
dxdsva <- estimateDispersions( dxdsva )
dxdsva = testForDEU( dxdsva )
dxdsvaR = DEXSeqResults( dxdsva )

dxd <- estimateDispersions( dxd )
dxd = testForDEU(dxd)
dxdR = DEXSeqResults( dxd )


##-------Results--------##
table ( dxdsvaR$padj < 0.1 )
# FALSE  TRUE
#   397    20
table ( dxdR$padj < 0.1 )
# FALSE  TRUE
#   434    17


Best Regards,

Kaj

ADD REPLYlink written 9 months ago by tofukaj0
1

Hi Kaj, 

In the design of the DEXSeqDataSet object, you need the formula to be like this:

design(dxdsva) <- ~ sample + exon + SV1:exon + SV2:exon + condition:exon

Also, make sure that you pass the reduced model as a parameter to the function testForDEU:

fullModel <- ~ sample + exon + SV1:exon + SV2:exon + condition:exon
reducedModel <- ~ sample + exon + SV1:exon + SV2:exon
dxdsva = testForDEU( dxdsva, reducedModel=reducedModel, fullModel=fullModel)

That should do the job!

ADD REPLYlink modified 9 months ago • written 9 months ago by Alejandro Reyes1.5k

Thank you very very much!!

ADD REPLYlink written 9 months ago by tofukaj0
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.2.0
Traffic: 260 users visited in the last hour