Entering edit mode
Hej Sylvain!
I just brought it back to the list so that everyone benefits from the
answer :-) ; see inline.
---------------------------------------------------------------
Nicolas Delhomme
Nathaniel Street Lab
Department of Plant Physiology
Ume? Plant Science Center
Tel: +46 90 786 5478
Email: nicolas.delhomme at plantphys.umu.se
SLU - Ume? universitet
Ume? S-901 87 Sweden
---------------------------------------------------------------
On 27 Mar 2014, at 15:24, Sylvain Foisy Ph. D. <sylvain.foisy at="" diploide.net=""> wrote:
> Hi Nicolas,
>
> Thanks for the input ;-) I am running this:
>
> [1] easyRNASeq_1.9.5
OK the latest is 1.9.7 but that?s not going to solve your problem yet?
hopefully by the end of the week, I?ll have fixed the simpleRNASeq
function.
>
> I actually started to use the R-devel branch to try to see if I
could overcome a problem I am having with the current version (1.8.6).
Actually 1.8.7 is the latest and fixes a bug that addresses your last
question below.
> I realized that my first attempts at DE analysis were using tophat2
alignments unfiltered for badly mapped reads.
That?s a good idea since easyRNASeq was not paying attention to that
previously apart from mentioning it in the vignette. The new function
which you tried is meant to fix this among other things.
> I therefore used samtools to keep only the properly paired reads (-f
2) and tried to run easyRNASeq again on the new BAM files. Whenever I
used ?exons" or "transcripts" as counts, it would work ok but I would
get lots (about half of my BAM files) of these:
>
> 20: In fetchCoverage(rnaSeq, format = format, filename = filename,
... :
> You enforce UCSC chromosome conventions, however the provided
alignments are not compliant. Correcting it.
> 21: In fetchCoverage(rnaSeq, format = format, filename = filename,
... :
> The read length stored in the object (probably provided as
argument): 100
> is not the same as the ones: 100, 99, 98, 97, 96, 95, 94, 93, 92,
91, 90, 89, 88, 87, 86, 85, 84, 83, 82, 81, 80, 79, 78, 77, 76, 75,
74, 73, 72, 71, 7
> 0, 69, 68, 67, 66, 65, 64, 63, 62, 61, 60, 59, 58, 57, 56, 55, 54,
53, 52, 51, 50 determined from the file: /shares/new-
data/htseq_datastore/RNASeq_dat
> aster/123456_CELLTYPE_accepted_hits_properly_paired.bam
>
> For the UCSC warning, I do believe that it stands from the fact that
I am using the genes.gtf file from Illumina's iGenome archive for the
NCBI annotation source.
Yes. This is also a remnant of early coding, when RNA-Seq analyses
were not so standardised and that?s also going to disappear in the new
function.
> The second one leaves me clueless :-(
Again, a remnant of the time when aligner could not deal with variable
length reads. easyRNASeq used to check for the read size as a sanity
check, which he does not anymore obviously, but the warning stayed. I
have to admit that I simply got used to see that warning but could
have removed it eons ago. It will be gone in the next function also.
>
> if I tried using "genes" and summeraization=?geneModels?, i would
get these but I am also unable to write the count data to file,
complaining that the count.table object cannot be written to file
using the write.table function?
Try 1.8.7 then. And instead of the genes / geneModels paradigm, have a
look at the section 7.1 of the vignettes for a more accurate (IMO) and
faster approach at looking for geneModels (what I now call "synthetic
transcripts? rather than "gene models" as that last one was confusing
because of its many meanings, e.g. if you think of a genic locus in an
assembly project.
> All of these were not observed on the unfiltered data, which flags
me as something that samtools did?
This looks more like some quality-based read trimming was done on the
data rather than having anything to do with samtools. Anyway, using
1.8.7 should help getting your count.table written out properly. Let
me know if not.
Cheers,
Nico
>
> Any ideas?
>
> Best regards
>
> S
>
> On Mar 27, 2014, at 10:01 AM, Nicolas Delhomme <delhomme at="" embl.de="">
wrote:
>
>> Hej Sylvain!
>>
>> Indeed that function is under very active development and I?ve
broken it times and again :-)
>>
>> I would suggest you to keep using the easyRNASeq function until the
next Bioc release (planned in ~10 days). I?m rushing to get the
simpleRNASeq in the next release as bug-free as possible and I expect
it to be fairly unstable by then.
>>
>> Anyway, can you tell me which easyRNASeq version you are using (run
sessionInfo() or package.version(?easyRNASeq?) in your R session)?
Just so that I make sure I?ve got that bug squashed already.
>>
>> Thanks,
>>
>> Nico
>>
>> ---------------------------------------------------------------
>> Nicolas Delhomme
>>
>> Genome Biology Computational Support
>>
>> European Molecular Biology Laboratory
>>
>> Tel: +49 6221 387 8310
>> Email: nicolas.delhomme at embl.de
>> Meyerhofstrasse 1 - Postfach 10.2209
>> 69102 Heidelberg, Germany
>> ---------------------------------------------------------------
>>
>>
>>
>>
>>
>> On 27 Mar 2014, at 14:40, Sylvain Foisy Ph. D. <sylvain.foisy at="" diploide.net=""> wrote:
>>
>>> Hi to all,
>>>
>>> I am trying to use the R-devel branch for easyRNASeq to get gene
counts from filtered BAM files and I get the following error:
>>>
>>>> exp<-simpleRNASeq(bamFiles = bfl, param = rParam, nnodes = 1,
verbose = TRUE)
>>> Error in (function (classes, fdef, mtable) :
>>> unable to find an inherited method for function ?simpleRNASeq? for
signature ?"BamFileList??
>>>
>>> According to the help material, simpleRNASeq should need this:
>>>
>>> simpleRNASeq(bamFiles = BamFileList(),
>>> param = RnaSeqParam(), nnodes = 1, verbose = FALSE)
>>>
>>> When I check the nature of my bfl object, I get this:
>>>
>>>> bfl
>>> BamFileList of length 103
>>> names(103): 123456_CELLTYPE_B_accepted_hits_properly_paired.bam ?
>>>
>>> So bfl is the right type? Any ideas or is this a bug? It?s the
devel version after all ;-)
>>>
>>> Best regards
>>>
>>> S
>>> _______________________________________________
>>> Bioconductor mailing list
>>> Bioconductor at r-project.org
>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>> Search the archives:
http://news.gmane.org/gmane.science.biology.informatics.conductor
>>
>>
>> Email secured by Check Point
>