Is possible to apply 'svaseq' with 'DEXSeq' design in order to remove unknown batch effects ?
1
0
Entering edit mode
tofukaj • 0
@tofukaj-9928
Last seen 3.8 years ago

Dear great R helpers,

I would like to know if it is possible to apply 'svaseq' with 'DEXSeq' design in order to remove unknown batch effects similar to those greatly demonstrated with 'DESeq' (http://www.bioconductor.org/help/workflows/rnaseqGene/).

Due to my poor background in bioinformatics and the difference between DEXSeq  and DESeq design, it would be very kind of you if an example could be provided.

 

With Respects,

Kaj Chokeshaiusaha

svaseq dexseq • 1.8k views
ADD COMMENT
1
Entering edit mode
Alejandro Reyes ★ 1.9k
@alejandro-reyes-5124
Last seen 6 days ago
Novartis Institutes for BioMedical Rese…

Hi Kaj, 

I have not done that myself. However it should be possible to run svaseq and then add the surrogate variables as a covariate of the DEXSeq models, as indicated by section 4 of the DEXSeq vignette "Additional technical or experimental variables".

Alejandro

ADD COMMENT
0
Entering edit mode

Thank you very much, I will definitely have a look at the section you referred.

ADD REPLY
0
Entering edit mode

Dear Alejandro,

I have tried svaseq analysis with  pasilla data in DEXSeq and have got an error. The script was as following


library( "DEXSeq" )
library( "sva" )

inDir = system.file("extdata", package="pasilla")
countFiles = list.files(inDir, pattern="fb.txt$", full.names=TRUE)
basename(countFiles)
flattenedFile = list.files(inDir, pattern="gff$", full.names=TRUE)
basename(flattenedFile)

sampleTable = data.frame(row.names = c( "treated1", "treated2", "treated3", "untreated1", "untreated2", "untreated3", "untreated4" ),
                         condition = c("knockdown", "knockdown", "knockdown", "control", "control", "control", "control" ),
                           libType = c( "single-end", "paired-end", "paired-end", "single-end", "single-end", "paired-end", "paired-end" ) )

##----Generate DEXSeqDataSet----------##
dxd = DEXSeqDataSetFromHTSeq(countFiles,
                             sampleData=sampleTable,
                             design= ~ sample + exon + condition:exon,
                             flattenedfile=flattenedFile )
dxd <- estimateSizeFactors(dxd)
dxd <- dxd[rowMeans(counts(dxd))>1,]

##------------Try sva----------##
dat <- counts(dxd, normalized=TRUE)

mod1 <- model.matrix(~ sample + exon + condition:exon, colData(dxd))
mod0 <- model.matrix(~ sample + exon, colData(dxd))
svseq <- svaseq(dat, mod1, mod0, n.sv=NULL, B=5)


Error in solve.default(t(mod) %*% mod) :
  system is computationally singular: reciprocal condition number = 4.47343e-18


Would you please suggest me what is wrong here?

Best Regards,

Kaj

ADD REPLY
0
Entering edit mode

Hi Kaj, 

For svaseq, maybe it is unnecessary to specify the DEXSeq interaction model if you are looking for unknown covariates. I would  try to estimate the surrogate variables for exon level counts (extracted using the featureCounts accessor) or gene level counts using "simpler" models:

mod1 <- ~ condition
mod0 <- ~ 1

And then plug in the resulting surrogate variables into the DEXSeqDataSet object. 

ADD REPLY
0
Entering edit mode

Thank you very much!! I hope this one looks ok for you.


library( "DEXSeq" )
library( "DESeq2" )
library( "sva" )

inDir = system.file("extdata", package="pasilla")
countFiles = list.files(inDir, pattern="fb.txt$", full.names=TRUE)
basename(countFiles)
flattenedFile = list.files(inDir, pattern="gff$", full.names=TRUE)
basename(flattenedFile)

sampleTable = data.frame(row.names = c( "treated1", "treated2", "treated3", "untreated1", "untreated2", "untreated3", "untreated4" ),
                         condition = c("knockdown", "knockdown", "knockdown", "control", "control", "control", "control" ),
                           libType = c( "single-end", "paired-end", "paired-end", "single-end", "single-end", "paired-end", "paired-end" ) )

##----Generate DEXSeqDataSet---##
dxd = DEXSeqDataSetFromHTSeq(countFiles,
                             sampleData=sampleTable,
                             design= ~ sample + exon + condition:exon,
                             flattenedfile=flattenedFile )

##----Select only some genes----##
genesForSubset = read.table(file.path(inDir, "geneIDsinsubset.txt"),
                            stringsAsFactors=FALSE)[[1]]
dxd = dxd[geneIDs( dxd ) %in% genesForSubset,]
dxd <- estimateSizeFactors(dxd)
dxd <- dxd[rowMeans(counts(dxd))>1,]

##------------sva----------##
dat <- counts(dxd, normalized=TRUE)

mod1 <- model.matrix(~ sample + exon , colData(dxd))
mod0 <- model.matrix(~ 1, colData(dxd))
svseq <- svaseq(dat, mod1, mod0, n.sv=NULL, B=5)
#Number of significant surrogate variables is:  2

##-----Create DEXSeqDataSet with acquired surrogate vectors-----##
dxdsva = dxd
dxdsva$SV1=svseq$sv[,1]
dxdsva$SV2=svseq$sv[,2]
design(dxdsva) <- ~ SV1 + SV2 + sample + exon + condition:exon

##-------Test for differential exon usage--------##
dxdsva <- estimateDispersions( dxdsva )
dxdsva = testForDEU( dxdsva )
dxdsvaR = DEXSeqResults( dxdsva )

dxd <- estimateDispersions( dxd )
dxd = testForDEU(dxd)
dxdR = DEXSeqResults( dxd )


##-------Results--------##
table ( dxdsvaR$padj < 0.1 )
# FALSE  TRUE
#   397    20
table ( dxdR$padj < 0.1 )
# FALSE  TRUE
#   434    17


Best Regards,

Kaj

ADD REPLY
1
Entering edit mode

Hi Kaj, 

In the design of the DEXSeqDataSet object, you need the formula to be like this:

design(dxdsva) <- ~ sample + exon + SV1:exon + SV2:exon + condition:exon

Also, make sure that you pass the reduced model as a parameter to the function testForDEU:

fullModel <- ~ sample + exon + SV1:exon + SV2:exon + condition:exon
reducedModel <- ~ sample + exon + SV1:exon + SV2:exon
dxdsva = testForDEU( dxdsva, reducedModel=reducedModel, fullModel=fullModel)

That should do the job!

ADD REPLY
0
Entering edit mode

Thank you very very much!!

ADD REPLY

Login before adding your answer.

Traffic: 1063 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6