Hi Mallon,
I think what you want is this:
formulaFullModel = ? sample + exon + strain:exon + colony:exon +
strain:colony:exon
formulaReducedModel = ? sample + exon + strain:exon + colony:exon
Don't forget to specify "formulaFullModel" when you create your
DEXSeqDataSet object.
Or if you already created your object you have to modify your design
by
doing before calling
estimateDispersions
design( dxd ) <- formulaFullModel
Best regards,
Alejandro
> Hi Alejandro,
> Thanks for the quick reply. That appears to be working, but its
going to
> take me a while to figure out my interaction model for the new
syntax.
> Simon Anders helped me originally and I include it below as an FYI
>
> Thanks for the help
>
> Eamonn
>
> On 06/06/13 16:22, Mallon, Eamonn B. (Dr.) wrote:
>> Dear all
>> I have 11 samples for a cross infection experiment where there are
two
>> colonies (the hosts K or Q) and 2 strains (6 or 8).
>>
>> sample colony strain type
>> K61 k six single-read
>> K62 k six single-read
>> K63 k six single-read
>> K81 k eight single-read
>> K82 k eight single-read
>> K83 k eight single-read
>> Q61 q six single-read
>> Q62 q six single-read
>> Q63 q six single-read
>> Q81 q eight single-read
>> Q82 q eight single-read
>>
>> I am most interested in finding exon usage differences related to
the
>> interaction of the colony and strain factors. Following the
vignette I
>> put the following code together
>>
>>
>> formuladispersion<-count~sample+(colony:strain)+exon
>> ecs<-estimateDispersions(ecs, formula=formuladispersion)
>> ecs<-fitDispersionFunction(ecs)
>>
>> formula0<-count~sample+exon+(colony:strain)
>> formula1<-count~sample+exon+(colony:strain)*I(exon==exonID)
>> ecs<-testForDEU(ecs, formula0=formula0, formula1=formula1)
>>
>> Does this make sense?
> Not quite. This tests for main effects and interaction in one go, so
it
> returns as hits all exons whose usage differs between colonies
_and/or_
> between strains. You are looking for interactions, i.e., for exons
whose
> usage is different between strains _and_ where this difference
itself
> differs between colonies. So, you need
>
> formuladispersion<-count~sample+(colony*strain)*exon
>
> formula0<-count~sample+exon+(colony+strain)*exon
> formula1<-count~sample+exon+(colony+strain)*exon+(colony:strain)*I(e
xon==ex
> onID)
>
> Things look a bit simpler if you use the new "TRT" functions:
>
> formuladispersion<-count~sample+(colony*strain)*exon
>
> formula0<-count~sample+exon+(colony+strain)*exon
> formula1<-count~sample+exon+(colony*strain)*exon
>
> Simon
>
>
> Dr Eamonn Mallon
> Lecturer in Evolutionary Biology
> Adrian 220
> Biology Department
> University of Leicester
>
>
http://www2.le.ac.uk/departments/biology/people/mallon
>
>
>
>
>
> On 22/05/2014 12:13, "Alejandro Reyes" <alejandro.reyes at="" embl.de="">
wrote:
>
>> Dear Eamonn Mallon,
>>
>> Thanks for reporting this!
>>
>> There was an error in the DEXSeq code, that was not considering the
>> different ways to specify when the strand is unknown. The gtf
format
>> uses a ".", but the GRanges object uses "*" and does not accept a
"."
>> instead. I have added a fix to convert the "." to "*" when reading
the
>> data from a gtf file. So this should work with the newests versions
in
>> the svn. (Let me know if it doesn't!)
>>
>> Best regards,
>> Alejandro
>>
>>> Hi Alejandro,
>>> I was having a similar problem. I finished the analysis earlier in
the
>>> year (on an earlier version of DEXSeq) but now would like to draw
some
>>> of
>>> the pictures separately. I tried to run the analysis again. I?ve
changed
>>> to DEXSeqDataSeqFromHTSeq and now have the error
>>>
>>> Error in .local(x, ...) : strand values must be in '+' '-' ?*'
>>>
>>>
>>> I guess this means something is now wrong with my GFF file.
>>>
>>> Its first line looks like
>>>
>>>
>>> gi|313870964|gb|AELG01010669.1| dexseq_prepare_annotation.py
aggregate_ge
>>> ne
>>> 2 1088 . . . gene_id "XLOC_000001"
>>>
>>>
>>>
>>>
>>> Any help would be appreciated.
>>>
>>> Eamonn
>>>
>>>
>>> Dr Eamonn Mallon
>>> Lecturer in Evolutionary Biology
>>> Adrian 220
>>> Biology Department
>>> University of Leicester
>>>
>>>
http://www2.le.ac.uk/departments/biology/people/mallon
>>>
>>>
>>>
>>>
>>>
>>> On 22/05/2014 10:53, "Alejandro Reyes" <alejandro.reyes at="" embl.de=""> wrote:
>>>
>>>> Hi Roberta,
>>>>
>>>> Yes, the newer versions of DEXSeq have lots of updates, in the
new
>>>> vignette
>>>> you will the description of the new functions. As the error
message
>>>> indicates,
>>>> some functions were deprecated and substituted for new ones,
including
>>>> the
>>>> functions that created the objects. In your case, you need to
use
>>>> DEXSeqDataSeqFromHTSeq instead of read.HTSeqCounts.
>>>>
>>>> Let me know if it works!
>>>> Best regards,
>>>> Alejandro
>>>>
>>>> ps. I think we might have more efficient/effective communication
if we
>>>> keep the e-mails concerning
>>>> the same issue in the same thread, rather than double posting and
>>>> having
>>>> many parallel
>>>> conversations: I have cc-ed Simon in this e-mail :)
>>>>
>>>>
>>>>
>>>>> Hi Alejandro,
>>>>> I updated to the current release versione of DEXSeq as you can
see.
>>>>>
>>>>> sessionInfo()
>>>>> R version 3.1.0 (2014-04-10)
>>>>> Platform: x86_64-apple-darwin10.8.0 (64-bit)
>>>>>
>>>>> locale:
>>>>> [1]
it_IT.UTF-8/it_IT.UTF-8/it_IT.UTF-8/C/it_IT.UTF-8/it_IT.UTF-8
>>>>>
>>>>> attached base packages:
>>>>> [1] parallel stats graphics grDevices utils datasets
>>>>> methods base
>>>>>
>>>>> other attached packages:
>>>>> [1] DEXSeq_1.10.3 BiocParallel_0.6.0 DESeq2_1.4.5
>>>>> RcppArmadillo_0.4.300.0 Rcpp_0.11.1
>>>>> GenomicRanges_1.16.3 GenomeInfoDb_1.0.2 IRanges_1.22.6
>>>>> Biobase_2.24.0
>>>>> [10] BiocGenerics_0.10.0 BiocInstaller_1.14.2
>>>>>
>>>>> loaded via a namespace (and not attached):
>>>>> [1] annotate_1.42.0 AnnotationDbi_1.26.0 BatchJobs_1.2
>>>>> BBmisc_1.6 biomaRt_2.20.0 Biostrings_2.32.0
>>>>> bitops_1.0-6 brew_1.0-6 codetools_0.2-8
>>>>> DBI_0.2-7 digest_0.6.4
>>>>> [12] fail_1.2 foreach_1.4.2 genefilter_1.46.1
>>>>> geneplotter_1.42.0 grid_3.1.0 hwriter_1.3
>>>>> iterators_1.0.7 lattice_0.20-29 locfit_1.5-9.1
>>>>> plyr_1.8.1 RColorBrewer_1.0-5
>>>>> [23] RCurl_1.95-4.1 Rsamtools_1.16.0 RSQLite_0.11.4
>>>>> sendmailR_1.1-2 splines_3.1.0 statmod_1.4.19
>>>>> stats4_3.1.0 stringr_0.6.2 survival_2.37-7
>>>>> tools_3.1.0 XML_3.98-1.1
>>>>> [34] xtable_1.7-3 XVector_0.4.0 zlibbioc_1.10.0
>>>>>
>>>>> Now I have another error message in the previous steps:
>>>>>
>>>>> ecs <- read.HTSeqCounts (sampleTable$countFile, sampleTable ,
>>>>> "genes.gff")
>>>>> Errore in checkAtAssignment("character", "annotationFile",
>>>>> "character")
>>>>> :
>>>>> ?annotationFile? is not a slot in class ?character?
>>>>> Inoltre: Warning message:
>>>>> 'newExonCountSet' is deprecated.
>>>>> Use 'DEXSeqDataSet' instead.
>>>>> See help("Deprecated")
>>>>>
>>>>> How can I solve it?
>>>>>
>>>>> Thank you
>>>>>
>>>>>
>>>>> Alejandro Reyes <alejandro.reyes at="" embl.de=""> ha scritto:
>>>>>
>>>>>> Dear Carriero,
>>>>>>
>>>>>> I think you forgot to copy the output of your sessionInfo()?
>>>>>>
>>>>>> I think you might be using a very old version of DEXSeq. If so,
could
>>>>>> you please update at least to the current release version of
DEXSeq
>>>>>> (1.10.3), try again and write back if it keeps giving you error
>>>>>> messages?
>>>>>>
>>>>>> Best regards,
>>>>>> Alejandro
>>>>>>
>>>>>>> I have a warning message when I estimate the dipersion
parameter. I
>>>>>>> can't carry on because of some other error messages in the
following
>>>>>>> steps, as I attached below.
>>>>>>> Thanks in advance
>>>>>>>
>>>>>>> -- output of sessionInfo():
>>>>>>>
>>>>>>> sizeFactors (ecs)
>>>>>>> 7A31.counts 7A32.counts 7A33.counts 46BR1.counts
46BR2.counts
>>>>>>> 46BR3.counts
>>>>>>> 1.0177858 1.0753635 1.1738656 0.9085247
0.9827212
>>>>>>> 0.9198726
>>>>>>>> ecs<- estimateDispersions ( ecs )
>>>>>>> Dispersion estimation. (Progress report: one dot per 100
genes)
>>>>>>>
>>>>>>>
>>>>>>>
.....................................................................
>>>>>>> ..
>>>>>>> ............................................................
>>>>>>>
>>>>>>> Warning messages:
>>>>>>> 1: In .local(object, ...) :
>>>>>>> Exons with less than 11 counts will be discarded. For more
>>>>>>> details
>>>>>>> read the documentation, parameter minCount
>>>>>>> 2: In .local(object, ...) :
>>>>>>> Genes with more than 70 testable exons will be kicked out
of the
>>>>>>> analysis. For more details read the documentation, parameter
maxExon
>>>>>>>> ecs <- fitDispersionFunction(ecs)
>>>>>>> Error in if (sum(log(coefs/oldcoefs)^2) < 0.005) break :
>>>>>>> missing value where TRUE/FALSE needed
>>>>>>> In addition: Warning messages:
>>>>>>> 1: In glmgam.fit(mm, disps[good], start = coefs) :
>>>>>>> Too much damping - convergence tolerance not achievable
>>>>>>> 2: In log(coefs/oldcoefs) : NaNs produced
>>>>>>>> head( fData (ecs)$dispBeforeSharing)
>>>>>>> [1] 0.000000e+00 6.003931e-03 4.852941e-03 0.000000e+00
2.771527e-10
>>>>>>> [6] 0.000000e+00
>>>>>>>> ecs at dispFitCoefs
>>>>>>> [1] NA NA
>>>>>>>> head(fData(ecs)$dispFitted)
>>>>>>> [1] NA NA NA NA NA NA
>>>>>>>> plotDispEsts(ecs)
>>>>>>> Error: could not find function "plotDispEsts"
>>>>>>>> ecs<-testForDEU(ecs)
>>>>>>> Error in testForDEU(ecs) :
>>>>>>> No dispersion values found, call function
fitDispersionFunction
>>>>>>> first.
>>>>>>>
>>>>>>> --
>>>>>>> Sent via the guest posting facility at bioconductor.org.
>>>>>
>>>> _______________________________________________
>>>> Bioconductor mailing list
>>>> Bioconductor at r-project.org
>>>>
https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>> Search the archives:
>>>>
http://news.gmane.org/gmane.science.biology.informatics.conductor