Detecting differential usage of introns from RNA-seq data.
@fong-chun-chan-5706
Last seen 10.3 years ago
Hi,

I am interested in detecting intron retention in RNA-Seq data. I was wondering whether anyone has tried applying DEXSeq to look for differential intron usage between two groups of samples. It seems like an ideal tool for this, given that the features would simply be introns instead of exons. Alternatively, can anyone recommend another Bioconductor package that looks for differential intron retention?

Thanks,
Fong
DEXSeq DEXSeq • 2.4k views
Alejandro Reyes ★ 1.9k
@alejandro-reyes-5124
Last seen 5 months ago
Novartis Institutes for BioMedical Rese…
Dear Fong Chun Chan,

I have tried this recently and it works nicely: you just need to count the reads falling in the introns and add them as "exonic bins" in DEXSeq.

However, I recommend using strand-specific data for this, because intronic regions sometimes contain antisense transcripts. If these are differentially expressed between your conditions, they can look like differences in intron retention in your transcripts. Also, when the introns are added as "exonic parts" in DEXSeq, the models become large and difficult to compute, so it is necessary to use DEXSeq in the TRT context (estimateDispersionsTRT and testForDEUTRT).

Best wishes,
Alejandro Reyes
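To make the "introns as extra bins" idea concrete, here is a minimal base-R sketch of the interval arithmetic only. The helper name `intronsFromExons` and the coordinates are hypothetical, and it assumes the exons of a gene are given as non-overlapping, 1-based closed intervals; it is not how DEXSeq or its annotation scripts define bins, and the read counting itself still has to be done separately.

```r
# Hypothetical helper (not part of DEXSeq): derive intron intervals
# from the exon intervals of a single gene.
# Assumption: exons are non-overlapping, 1-based, closed intervals.
intronsFromExons <- function(exonStart, exonEnd) {
  ord <- order(exonStart)
  exonStart <- exonStart[ord]
  exonEnd <- exonEnd[ord]
  n <- length(exonStart)
  if (n < 2) {
    # A single-exon gene has no introns.
    return(data.frame(start = integer(0), end = integer(0)))
  }
  out <- data.frame(
    start = exonEnd[-n] + 1L,   # first base after each exon but the last
    end   = exonStart[-1] - 1L  # last base before each exon but the first
  )
  # Drop empty gaps between directly adjacent exons.
  out[out$start <= out$end, , drop = FALSE]
}

# Made-up coordinates for illustration.
introns <- intronsFromExons(c(100L, 300L, 600L), c(200L, 400L, 700L))
print(introns)
```

The resulting intervals could then be added to the annotation used for counting, alongside the exonic bins.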
Hi Alejandro,

Thanks for the reply. I've tried to use DEXSeq on intronic regions just as I did with exons, and I am running into a problem when I call estimatelog2FoldChanges. This is the error:

exonCountSet <- estimatelog2FoldChanges( exonCountSet )
Error in do.call(`[[<-`, c(quote(coefIndices), as.list(lvlTbl[i, ]), coefNames[i])) :
  [[ ]] subscript out of bounds

My code to get to this point is below (ignore the fact that the variables are called exon; I am just copying the code from my exonic DEXSeq run):

-----------------------------
print('Building ExonCountSet ...')
exonCountSet <- newExonCountSet(
    countData = sumExonsDf[selectedExons, samples],
    design = designMat,
    geneIDs = sumExonsDf[selectedExons, 'geneID'],
    exonID = sumExonsDf[selectedExons, 3],
    exonIntervals = exonAnnotDf[selectedExons, c('chr', 'start', 'end', 'strand')]
)
print('... Done')

print('Estimating size factors ...')
exonCountSet <- estimateSizeFactors(exonCountSet)
print('... Done')

print('Estimating dispersions ...')
if ( opt$trt ) {
    print('Using the TRT functions ...')
    exonCountSet <- estimateDispersionsTRT( exonCountSet, nCores = opt$nCores )
} else {
    exonCountSet <- estimateDispersions( exonCountSet, nCores = opt$nCores )
}
print('... Done')

print('Fitting dispersions ...')
exonCountSet <- fitDispersionFunction( exonCountSet )
print('... Done')

print('Testing for differential exon usage ...')
if ( opt$trt ) {
    print('Using the TRT functions ...')
    exonCountSet <- testForDEUTRT( exonCountSet, nCores = opt$nCores )
} else {
    exonCountSet <- testForDEU( exonCountSet, nCores = opt$nCores )
}
print('... Done')

exonCountSet <- estimatelog2FoldChanges( exonCountSet )
----------------------

Any idea what is happening here? Could it have something to do with the exon IDs? I don't have proper intron IDs, so I am using the genomic coordinates as the intron ID. An intron ID therefore looks like:

chr1:861181-861301

I am wondering whether this is causing the problem. Any help would be greatly appreciated. I am using DEXSeq 1.5.6, the version I got from the svn repository, so that I have access to the TRT functions.

Fong
Dear Fong Chun Chan,

Yes, the intron IDs are causing problems in the functions. DEXSeq builds the feature names of the exonCountSet object by concatenating the gene ID and the exon ID, separated by a ":" character, so introducing additional ":" characters causes parsing errors. Could you replace the ":" in your exon IDs with another character? I think that should solve the problem!

Alejandro
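As an illustration of the suggestion above, the following base-R sketch shows why coordinate-style IDs become ambiguous once they are joined to a gene ID with ":", and one way to sanitize them. The gene ID and the second intron ID are made up, and the choice of "_" as the replacement is an assumption; the reply only asks for "another character".

```r
# Intron IDs built from genomic coordinates, as described in the thread.
# The second ID and the gene IDs are hypothetical.
intronIDs <- c("chr1:861181-861301", "chr1:861394-865534")
geneIDs <- c("geneA", "geneA")

# Joining gene ID and feature ID with ":" gives names containing two ":",
# so splitting on ":" no longer yields exactly (geneID, featureID).
ambiguous <- paste(geneIDs, intronIDs, sep = ":")
partsBefore <- lengths(strsplit(ambiguous, ":", fixed = TRUE))

# Replace every ":" in the feature IDs before building the count object.
# "_" is an arbitrary choice of replacement character.
safeIDs <- gsub(":", "_", intronIDs, fixed = TRUE)
safeNames <- paste(geneIDs, safeIDs, sep = ":")
partsAfter <- lengths(strsplit(safeNames, ":", fixed = TRUE))

print(safeIDs)
print(partsBefore)
print(partsAfter)
```

With the sanitized IDs, each combined name splits into exactly two parts again, which is what a "geneID:exonID" naming scheme relies on.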
Hi Alejandro,

Thanks for your help. That solved the problem. However, I noticed a strange result when running DEXSeq on intronic features: all my log2FoldChange values are NaN. I am getting the following warning messages:

Warning messages:
1: In glmgam.fit(mm, disps[good], coef.start = coefs) :
  Too much damping - convergence tolerance not achievable
2: In fitDispersionFunction(exonCountSet) :
  Negative intercept value in the dispersion function, it will be set to 0. Check fit diagnostics plot section from the vignette.

I checked the fit diagnostics plot section of the vignette, but I am not sure what the problem could be. In case it matters, I was testing the code on a subset of the data consisting of 2000 introns. There are no NAs in my exonCountSet. Any ideas as to what is happening?

Fong
Dear Fong Chun Chan,

I think it is very likely that many of your introns have zero or low counts in most of the samples, so the fitting of the dispersions is running into problems (and consequently so is the variance-stabilizing function used for the fold changes). I would recommend leaving the exons inside the ExonCountSet object throughout the DEXSeq analysis.

Best regards,
Alejandro
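One quick way to check whether low counts are a plausible cause is to look at how many features are zero in most samples. The sketch below uses a small made-up count matrix and plain base R; the 0.5 cutoff is an arbitrary illustration and not a DEXSeq threshold.

```r
# Hypothetical count matrix: rows are introns, columns are samples.
counts <- matrix(
  c( 0, 0,  0,  0,
     0, 1,  0,  0,
    12, 9, 15, 11,
     0, 0,  2,  0),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("intron", 1:4), paste0("sample", 1:4))
)

# Fraction of samples in which each intron has a zero count.
zeroFraction <- rowMeans(counts == 0)

# Flag introns that are zero in more than half of the samples
# (arbitrary cutoff, chosen only for illustration).
mostlyZero <- zeroFraction > 0.5

print(zeroFraction)
print(mean(mostlyZero))  # share of introns flagged as mostly zero
```

If most features are flagged, there is little information from which to estimate a mean-dispersion trend, which is consistent with the explanation given above.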
Hi Alejandro,

You are right that many introns have 0 counts in most of the samples (as expected, since you wouldn't expect much intronic expression). I am not sure what you mean by:

"...leave the exons inside the ExonCountSet object during all the DEXSeq analysis"

Do you actually mean that I should remove the exons/introns that have 0 counts from the ExonCountSet? I am currently leaving all the exons/introns in the ExonCountSet.

Fong
Dear Fong Chun Chan,

Sorry for not being clear. What I meant is that your ExonCountSet object should contain all your exons as well as the introns that you want to test for intron retention. This will give you enough read counts (mainly coming from the exons) to fit the dispersion function.

Best regards,
Alejandro
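A minimal sketch of what "exons plus introns in one object" could look like at the count-table stage, before the object is constructed. The tables, column names, and feature IDs below are all hypothetical; the only point is that exon and intron rows end up in a single table with the same sample columns.

```r
# Hypothetical per-feature count tables with identical columns.
exonCounts <- data.frame(
  geneID = c("geneA", "geneA"),
  featureID = c("E001", "E002"),
  sample1 = c(120, 85),
  sample2 = c(140, 90),
  stringsAsFactors = FALSE
)
intronCounts <- data.frame(
  geneID = "geneA",
  featureID = "I001",
  sample1 = 3,
  sample2 = 0,
  stringsAsFactors = FALSE
)

# Both tables must share the same columns before they can be stacked.
stopifnot(identical(names(exonCounts), names(intronCounts)))

# Stack exons and introns, then order features within each gene.
allCounts <- rbind(exonCounts, intronCounts)
allCounts <- allCounts[order(allCounts$geneID, allCounts$featureID), ]
rownames(allCounts) <- NULL

print(allCounts)
```

A combined table like this would then take the place of the intron-only table when building the ExonCountSet, so that the well-covered exon rows contribute to the dispersion fit.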